Query TIPS

Q. Execute an Operating System Command From Within SQL Server

A. The xp_cmdshell extended stored procedure makes it possible to execute operating system commands
from within SQL Server.
Example:

EXEC Master..xp_cmdshell 'Dir c:\'

Q. How can I create a plain-text flat file from SQL Server as input to another application?

A. One of the purposes of Extensible Markup Language (XML) is to solve challenges like this, but until all
applications become XML-enabled, consider using our faithful standby, the bulk copy program (bcp) utility.
This utility can do more than just dump a table; bcp also can take its input from a view instead of from a
table. After you specify a view as the input source, you can limit the output to a subset of columns or to a
subset of rows by selecting appropriate filtering (WHERE and HAVING) clauses.

          More important, by using a view, you can export data from multiple joined tables. The only thing
you cannot do is specify the sequence in which the rows are written to the flat file, because a view does not
let you include an ORDER BY clause in it unless you also use the TOP keyword.

         If you want to generate the data in a particular sequence or if you cannot predict the content of the
data you want to export, be aware that in addition to a view, bcp also supports using an actual query. The
only "gotcha" about using a query instead of a table or view is that you must specify queryout in place of out
in the bcp command line.

For example, you can use bcp to generate from the pubs database a list of authors who reside in California
by writing the following code:

bcp "SELECT * FROM pubs..authors WHERE state = 'CA'" queryout c:\CAauthors.txt -c -T -S servername

Q. How can I programmatically detect whether a given connection is blocked?

A. A connection is blocked when it requires an object that another connection has a lock on. You can use
the system stored procedure sp_lock to retrieve information about the current locks in SQL Server, and you
can use the server process ID (SPID) to filter the information that sp_lock returns. To determine whether a
given process is waiting for the release of a locked resource, you can execute the sp_GetBlockInfo
procedure that follows.

Note: You must execute the procedure before the blocked connection's lock request times out.

USE master
GO

CREATE PROCEDURE sp_GetBlockInfo
    @BlockedSPID as int
AS
IF EXISTS (SELECT *
           FROM master.dbo.syslockinfo
           WHERE req_spid = @BlockedSPID
           AND req_status = 3)

    SELECT sli1.req_spid AS SPID,
           SUBSTRING(u.name, 1, 8) AS Mode,
           DB_NAME(sli1.rsc_dbid) AS [Database],
           OBJECT_NAME(sli1.rsc_objid) AS [Table],
           sli1.rsc_text AS [Resource]
    FROM master.dbo.syslockinfo sli1
    JOIN master.dbo.spt_values u
        ON sli1.req_mode + 1 = u.number
        AND u.type = 'L'
    JOIN master.dbo.syslockinfo sli2
        ON sli1.rsc_dbid = sli2.rsc_dbid
        AND sli1.rsc_objid = sli2.rsc_objid
        AND sli1.rsc_text = sli2.rsc_text
    WHERE sli2.req_spid = @BlockedSPID
    AND sli1.req_status = 1
    AND sli1.req_spid <> @BlockedSPID
    AND sli2.req_status = 3
ELSE
    SELECT CAST(1 AS int) AS SPID,
           SUBSTRING('', 1, 8) AS Mode,
           DB_NAME(NULL) AS [Database],
           OBJECT_NAME(NULL) AS [Table],
           CAST(NULL AS nchar(32)) AS [Resource]
    WHERE 1 = 2

GO

The sp_GetBlockInfo procedure tells you the lock mode, the database and object names of the locked
resource, and in the case of a blocking chain, which SPID is the root blocker. If the process is not blocked,
sp_GetBlockInfo returns an empty recordset.

         You can also detect blocks by checking for error 1222, "Lock request time out period exceeded."
The LOCK_TIMEOUT setting controls how long a process will wait for locks to be released before timing
out. When the lock timeout occurs, SQL Server sends error 1222 to the application. In SQL Server 7.0, this
error aborts the statement but does not cause the batch to roll back, so you can check the Transact-SQL
system variable @@ERROR and determine where locks exist.
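As a minimal sketch (the table, column, and timeout value are illustrative, not from the original tip), you can
combine SET LOCK_TIMEOUT with an @@ERROR check to detect a block:

SET LOCK_TIMEOUT 5000   -- wait at most 5 seconds for any lock

UPDATE SomeTable        -- hypothetical table another connection may have locked
SET SomeCol = 1
WHERE KeyCol = 42

IF @@ERROR = 1222
    PRINT 'Blocked: lock request time out period exceeded.'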

Q. Can you create UNIQUE and PRIMARY KEY constraints on computed columns in SQL Server
2000?

A. In SQL Server, the physical mechanism that UNIQUE and PRIMARY KEY constraints use to enforce
uniqueness is a unique index. Because SQL Server 2000 supports indexes on computed columns,
you can create UNIQUE and PRIMARY KEY constraints on computed columns.

Defining a UNIQUE constraint on a computed column is a straightforward process, as the following example
shows:

CREATE TABLE T1 (
col1 int NOT NULL,
col2 AS col1 + 1 UNIQUE
)

However, if you define a PRIMARY KEY on a computed column, such as:

CREATE TABLE T2 (
col1 int NOT NULL,
col2 AS col1 + 1 PRIMARY KEY
)

you receive the following error:

Server: Msg 8111, Level 16, State 2, Line 1
Cannot define PRIMARY KEY constraint on nullable column in table 'T2'.
Server: Msg 1750, Level 16, State 1, Line 1
Could not create constraint. See previous errors.

Because of the primary key constraint, SQL Server requires you to guarantee that your computation's
result will not be NULL. The computation in the computed column can overflow
(for example, when you add 1 to the largest integer) or underflow (when you subtract 1 from the smallest
integer), and other computations can result in a divide-by-zero error. However, if the ARITHABORT
session setting (which determines whether a query is aborted when an overflow or a divide-by-zero error
occurs) and the ANSI_WARNINGS session setting (which specifies ANSI SQL-92 standard behavior for
several error conditions) are off, then instead of aborting the query, the computation can yield a NULL result.

          In practice, when either the ARITHABORT or ANSI_WARNINGS setting is off, you cannot create an
index on a computed column or insert values into a table that has an index on a computed column, because
SQL Server detects such an attempt and returns an error. But SQL Server still requires you to guarantee
that the computation will not result in NULL values. The trick is to wrap the computed column's computation
with the ISNULL() function and supply an alternative value if the computation results in NULL:

CREATE TABLE T2 (
col1 int NOT NULL,
col2 AS ISNULL(col1 + 1, 0) PRIMARY KEY
)

Q. Why does my inline or embedded SQL run faster than my stored procedures?

A. Recompilations might be the source of the slower stored procedure speed. To find out for sure, you need
to do some performance investigation, such as looking at Showplans for each type of query versus calling
the stored procedures and comparing query plan cache hits to cache misses. You can also try coding the
object owner for referenced tables, views, and procedures inside your stored procedures, as the following
example shows:

SELECT * FROM dbo.mytable

This technique helps you reuse plans and prevent cache misses.

Q. Why doesn't SQL Server permit an ORDER BY clause in the definition of a view?

A. SQL Server excludes an ORDER BY clause from a view to comply with the ANSI SQL-92 standard.
Because analyzing the rationale for this standard requires a discussion of the underlying structure of the
structured query language (SQL) and the mathematics upon which it is based, we can't fully explain the
restriction here. However, if you need to be able to specify an ORDER BY clause in a view, consider using
the following workaround:

USE pubs
GO

CREATE VIEW AuthorsByName
AS
SELECT TOP 100 PERCENT *
FROM authors
ORDER BY au_lname, au_fname
GO

The TOP construct, which Microsoft introduced in SQL Server 7.0, is most useful when you combine it with
the ORDER BY clause. The only time that SQL Server supports an ORDER BY clause in a view is when it is
used in conjunction with the TOP keyword.

Note that the TOP keyword is a SQL Server extension to the ANSI SQL-92 standard.

Q. Is using the TOP N clause faster than using SET ROWCOUNT N to return a specific number of
rows from a query?

A. With proper indexes, the TOP N clause and SET ROWCOUNT N statement are equally fast, but with
unsorted input from a heap, TOP N is faster. With unsorted input, the TOP N operator uses a small internal
sorted temporary table in which it replaces only the last row. If the input is nearly sorted, the TOP N engine
must delete or insert the last row only a few times. Nearly sorted means you're dealing with a heap with
ordered inserts for the initial population and without many updates, deletes, forwarding pointers, and so on
afterward. A nearly sorted heap is more efficient to sort than sorting a huge table. In a test that used TOP N
to sort a table with the same number of rows but with unordered inserts, TOP N was not as efficient
anymore. Usually, the I/O time is the same both with an index and without; however, without an index SQL
Server must do a complete table scan. Processor time and elapsed time show the efficiency of the nearly
sorted heap. The I/O time is the same because SQL Server must read all the rows either way.

Q. The Difference Between COUNT(column) and COUNT(*)

A. COUNT(column): Counts the number of non-NULL values in the column.
COUNT(*): Counts the number of rows in the table, including null values and duplicates.
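A quick illustration, using a throwaway table (the table and values are made up for the example):

CREATE TABLE CountDemo (col int NULL)
INSERT INTO CountDemo VALUES (1)
INSERT INTO CountDemo VALUES (NULL)
INSERT INTO CountDemo VALUES (1)

SELECT COUNT(col) FROM CountDemo   -- returns 2: the NULL is ignored
SELECT COUNT(*) FROM CountDemo     -- returns 3: every row is counted

DROP TABLE CountDemo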

Q. I have two tables t1 and t2 both with the columns a1, a2. I want to find the difference of (the set of
t1) - (the set of t2) without using the keyword EXCEPT because MSSQL 2000 does not recognize that
word. I have tried this query but it does not give me what I want: SELECT * FROM t1 WHERE NOT
EXISTS (SELECT t1.* FROM t1 INNER JOIN t2 ON t1.a1=t2.a1 AND t1.a2=t2.a2)

A. The difference of (the set of t1) - (the set of t2) can be written as an outer join:

SELECT * FROM t1 LEFT OUTER JOIN t2 ON t1.a1=t2.a1 AND t1.a2=t2.a2 WHERE t2.a1 IS NULL AND
t2.a2 IS NULL

According to SQL Query Analyzer, the following NOT EXISTS form is slightly more efficient than the left join
(possibly only because of the tables it was tested with). Note that the subquery must correlate to t1 with a
WHERE clause, not with an ON clause as in your attempt:

SELECT * FROM t1 WHERE NOT EXISTS (SELECT * FROM t2 WHERE t1.a1=t2.a1 AND t1.a2=t2.a2)

Q. Leading Zeros in a Stored Procedure

I have a form and I am passing a value to a stored procedure. The value has leading zeros. When the
values are passed the leading zeros are dropped, thus causing my stored procedure to blow up. Is
there a way to maintain those zeros in passing or pick them up again in the procedure?

A. Try passing the value as a string.

If you were passing a string value, the leading zeros would not be dropped, so I suspect that you are passing
a numeric value and then converting it back to a varchar, which drops any leading zeros. If you don't want to
change the interface, you can restore the leading zeros in the stored procedure by using the following.

declare @lz varchar(10)
declare @numeric numeric(10,2)
declare @result varchar(20)
select @lz = '0000000000'
select @numeric = 0000123.45
-- the leading zeros are lost upon assignment to the numeric variable;
-- pad them back by prefixing just enough zeros taken from @lz
select @result = substring(@lz, 1, datalength(@lz) - datalength(convert(varchar(10), @numeric)))
               + convert(varchar(10), @numeric)
select @result   -- 0000123.45, padded back out to 10 characters

Q. Method to Perform Case-Sensitive Searches in SQL Server

A. By default, SQL Server 7.0 installation sets up SQL Server to run in a case-insensitive mode. For most
applications this works great, but there are certain situations where case-sensitive searches are required.
For instance, if a web site needs passwords that are case sensitive, a method must be devised to
perform case-sensitive queries. The following script performs a case-sensitive search. First let's create a
table:


CREATE TABLE test
(
      id      INT NOT NULL,
      msg     VARCHAR(100) NOT NULL
)

Now let’s insert some case sensitive data into it:

INSERT INTO test (id, msg) VALUES (1, 'bUSY');
INSERT INTO test (id, msg) VALUES (2, 'BUSY');
INSERT INTO test (id, msg) VALUES (3, 'busy');


In our test we are searching for a ‘bUSY’ value in the msg column of the test table. So the syntax of the
same query, if the SQL Server was set to be case sensitive, would be:

SELECT * FROM test WHERE msg = 'bUSY';

On a case-insensitive server, this query returns all rows in the test table. Now, here is the script that will
perform the case-sensitive search.

DECLARE @table VARCHAR( 30 )
DECLARE @col VARCHAR( 30 )
DECLARE @searchVal VARCHAR( 195 )

SET @table = 'test'
SET @col = 'msg'
SET @searchVal = 'bUSY'

DECLARE @sql VARCHAR( 255 )
DECLARE @colLength VARCHAR( 3 )

SELECT @colLength = CONVERT( varchar(3), DATALENGTH( @searchVal ) )

-- single quotes are doubled inside the string so the script works regardless of the QUOTED_IDENTIFIER setting
SELECT @sql = 'SELECT * ' +
' FROM ' + @table +
' WHERE' +
' CONVERT( varbinary( ' + @colLength + '), ' + @col + ') = ' +
' CONVERT( varbinary( ' + @colLength + '), ''' + @searchVal + ''')' +
' AND ' + @col + ' = ''' + @searchVal + ''''

EXEC( @sql )


Listing Available Tables

Q. How do I list the available tables in a database I'm querying?

A. You can get a list of tables in a database by querying the INFORMATION_SCHEMA.Tables view like this:

SELECT * FROM information_schema.tables

This returns the name of the current database, the owner, the table name, and the table type for each table
in the database. It's possible to query the system tables directly, but if this gives the information you need,
it's better to use the existing views that come with SQL Server.
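For example (a hedged variation on the query above, not part of the original answer), you can restrict the
list to user tables only by filtering on the TABLE_TYPE column:

SELECT TABLE_NAME FROM information_schema.tables WHERE TABLE_TYPE = 'BASE TABLE'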

Question: How do I count the number of duplicate items in a table?

Answer: Let's break down your question into several steps. First, let's create a sample table using the
following code:

create table dups
(
         i int
)
go

declare @i int
select @i = 0

while (@i < 35)
begin
    insert into dups(i) values (cast(rand() * 50 as int))
    select @i = @i + 1
end

Now, let's find rows that are duplicates. For that we can use a simple group by statement:

select i,count(*) as num_records from dups group by i having count(*) > 1

My sample data produced the following result set:

i           num_records
----------- -----------
0           2
5           2
18          2
22          2
27          2
31          2
34          2
44          2
49          2

This identifies the rows that have duplicates, but it does not return the total number of duplicates in the
table. First, we must recognize that each of the rows above showing a count of 2 contains only one
duplicate. So we want a query that sums up the duplicates from the above query. To do so, we take the
previous query and put it in the FROM clause as a derived table. We can then use the SUM function
to create the total for us:

select sum(num_dups)
from (select i, count(*) - 1 as num_dups
      from dups
      group by i
      having count(*) - 1 > 0) as mydups

Using Distributed Queries on SQL Server 7.0

SQL Server 7 allows developers to execute commands against OLE DB data sources on different servers. In
order to execute commands on remote servers, the SQL Server instance where the commands will be
issued must be set up properly. This entails adding the remote server to SQL Server's linked server list. Do
this by using the sp_addlinkedserver command.

For example, to link a remote SQL Server database that resides on the RemoteDBServer server, you would
use the following syntax:

EXEC master..sp_addlinkedserver
@server = 'RemoteDBServer',
@srvproduct = '',
@provider = 'SQLOLEDB',
@datasrc = 'RemoteDBServer'

Note that only members of the sysadmin role can set this server option. Once the remote database has
been linked, queries can be executed against it as long as the remote tables are prefaced with the four-part
name: Server.Database.Owner.TableName.

For example, the following query would return all rows in the authors table of our RemoteDBServer SQL
Server database:

SELECT * FROM RemoteDBServer.pubs.dbo.authors

Dynamic Query with .... in (@var)

Question: Can I use a variable in a query with the IN clause (a,b,c..z), without getting quotes or
conversion errors?

Answer: You can use a variable to supply your IN list, but this requires that you use the EXEC statement to
run the statement dynamically, as the sketch below shows.
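A minimal sketch (the table and column names are hypothetical):

DECLARE @var varchar(100)
SET @var = '1, 2, 3'

-- the variable is spliced into the statement text, which is then executed dynamically
EXEC ('SELECT * FROM MyTable WHERE MyCol IN (' + @var + ')')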

Delete Files from SQL Server Local Machine

Ever wanted to delete files from the local machine that your SQL Server database is running on?
You can do it using the extended stored procedure xp_cmdshell like this:

EXEC master..xp_cmdshell 'del C:\file.txt'

But this requires sysadmin rights on the SQL Server and an Admin role on the NT server. In most instances
it is not preferable to grant these privileges. To delete files without requiring this access, use the built-in
SQL Server OLE Automation procedures and the FileSystemObject:

DECLARE @hr int
DECLARE @ole_FileSystem int

-- create a FileSystemObject instance through the OLE Automation stored procedures
EXEC @hr = sp_OACreate 'Scripting.FileSystemObject', @ole_FileSystem OUT

-- invoke its DeleteFile method on the target file
EXEC @hr = sp_OAMethod @ole_FileSystem, 'DeleteFile', NULL, 'C:\file.txt'

-- release the object
EXEC @hr = sp_OADestroy @ole_FileSystem

Large Text Fields

Question: How do I create a text field of greater than 8,000 characters (in v7.0)? I attempted to use
the "text" data type, but my code returned an error saying the maximum size was 8,000. Here's the
code:

CREATE TABLE X ( X_ID int IDENTITY(1,1), X_DESC text (60000) NOT NULL ) GO

Answer: SQL Server is returning a bogus error message. The real error has to do with your syntax.
When specifying text you don't specify a size.

You can see the real error message if you reduce the number 60000 to 5. Then you will get this message:
Server: Msg 2716, Level 16, State 1, Line 1
Column or parameter #2: Cannot specify a column width on data type text.

Instead, simply specify the column as text, without the parentheses and the number. The actual size of the
storage used for the text field will depend on how much data you actually put in the column.

Importing Excel into SQL without Using Wizard

Question: How do I give a user the option of importing Excel and a delimited text file into a SQL
Server Database without manually using SQL DTS?

Answer: You can use the DTS object model to programmatically create, modify and run DTS packages.
You can do this many ways, but essentially this object model has objects for anything you can do in DTS.
If you already have the package created and saved as a file, add a reference to the DTS Library.
Then you can call it like this:

  Dim Package As New DTS.Package

  Package.LoadFromStorageFile App.Path & "\DTSPackage.dts"
  Package.Execute

Connecting to SQL Server 7 via MS-DOS 6.2

Question: I have some complex engine control software that has to run under MS-DOS 6.2, but it
needs to transfer data to SQL Server 7. What's the best way to connect to SQL Server from DOS?

Answer: You can use OSQL or ISQL to connect to SQL Server from the command line. Both of these utilities
are Win32 command-line utilities that allow you to log in to a server and issue any SQL command. The
OSQL utility supports the most functionality and uses ODBC to connect to SQL Server.

The syntax "osql /?" provides a listing of all OSQL flags.

This syntax connects to a server and issues a query:

osql -S "servername" -U "sa" -d "pubs" -q "SELECT * FROM Authors"

To insert data into a database, you could put OSQL commands in a batch file or provide an input file of
SQL INSERT commands. See the SQL Server BOL for details and syntax of OSQL. If you have a straight
DOS box, you need to install the 16-bit client that comes with SQL Server 6.5. Both the client and the 6.5
version of ISQL use DB-Library, which doesn't support some SQL Server 7 features such as Unicode
columns, varchar columns larger than 255 characters, etc.

Passing a CSV List Within an IN Statement

Question: I have a CSV list I am passing through as a parameter in a stored procedure.

Example of Region list: 1,2,3,4,5,6,7,8,9,10
CREATE PROCEDURE sp_getReport
          @Regionlist varchar(100) AS
Select * from tblRegion where Region_ID IN (@regionlist)

The varchar treats the list as one block of data. How can I change my code/syntax to read each value of the
list separately within the IN statement?

Answer: When you pass variables that are part of the statement in a stored procedure, you need to use
the EXEC(UTE) statement. Here's the syntax to call the stored procedure and modify the current stored
procedure to take the regions list as a parameter (the only difference is I called the column RegionID instead
of Region_ID):

sp_getreport '1,2,3,4,5,6,7,8,9,10'

CREATE PROCEDURE sp_getReport
@regionlist varchar(100)
AS

EXEC ('Select * from tblRegion where RegionID IN (' + @regionlist +')')

Creating Tables with Variable Names

Question: Is it possible to create tables in stored procedures using a variable for the table name?
For instance:

declare @MyName
set @MyName = 'dbo.Unique_Name'
 create table @MyName ....
This obviously doesn't work, but does this ability exist?

Answer: Yes, you can issue DDL statements with variables; you just need to use the EXECUTE statement.
For example, this statement uses a variable for the table name, as in your example:

DECLARE @TableName varchar(255)
SET @TableName = '[dbo].[TestTable]'

EXEC ('CREATE TABLE ' + @TableName + ' (
        [ID] [int] NULL ,
        [ItemDesc] [char] (50) NULL
)')


Use Sysobjects in SQL Server to Find Useful Database Information

The SQL Server sysobjects table contains one row for each object created within a database. In other
words, it has a row for every constraint, default, log, rule, stored procedure, and so on in the database.
Therefore, this table can be used to retrieve information about the database. We can use the xtype column
in the sysobjects table to get useful database information. This column specifies the object type for each
row entry in sysobjects.

For example, you can find all the user tables in a database by using this query:

select * from sysobjects where xtype='U'

Similarly, you can find all the stored procedures in a database by using this query:

select * from sysobjects where xtype='P'

This is the list of all possible values for this column (xtype):

C = CHECK constraint
D = Default or DEFAULT constraint
F = FOREIGN KEY constraint
L = Log
P = Stored procedure
PK = PRIMARY KEY constraint (type is K)
RF = Replication filter stored procedure
S = System table
TR = Trigger
U = User table
UQ = UNIQUE constraint (type is K)
V = View
X = Extended stored procedure

Query to a Text File

Question: I would like to create a stored procedure that runs a query, outputs the results to a text
file, and allows me to add extra delimiters and static field info.

Answer: SQL Server has no native command for outputting query results to a file. You can use the extended
stored procedure xp_cmdshell and call isql (command-line SQL) with your query and output the results to a
file. Any delimiters would need to be part of the SELECT string:

DECLARE @isqlString varchar(255)

SELECT @isqlString = 'isql -Q "SELECT DateCol FROM NorthwindTest.dbo.Test" -E -o C:\Results.txt'

EXEC master..xp_cmdshell @isqlString

Dynamic Variables and Stored Procedures

Question: I need to pass the @TheVar variable to the stored procedure's IN statement but I keep
getting zero records. If I type this statement into the SQL... WHERE Name IN ('John', 'Frank', 'Tom')
the statement works fine; 20 records are returned.

EXECUTE the_proc "'John', 'Frank', 'Tom'"

CREATE PROCEDURE the_proc
@TheVar nvarchar(40)
AS SELECT COUNT(FieldName) FROM Clients WHERE Name IN (@TheVar)
Do you have any ideas? I use MS SQL Server 7.0 SP2.

Answer: If you want to use variables as part of your SQL statement you need to modify it to call the
EXECUTE statement. I created a Clients table with a single Name field of varchar(255) with records for
John, Frank, and Tom. Then I created a stored procedure (similar to yours) that uses the input variable to
build the IN clause like this:

CREATE PROCEDURE GetClients
        @TheVar varchar(255)
AS
EXEC ('SELECT COUNT(Name)
        FROM Clients
        WHERE Name IN (' + @TheVar + ')')

As you can see, it builds the SELECT statement using the variable and then the EXEC statement runs it.
The syntax to call this procedure is:

GetClients "'John', 'Tom', 'Frank'"
---
3

Question: If my SQL Server has the following specs: 6.05.02 SQL-DMO 6.50.252 DB-Library Which
version do I have? And which service pack version do I have?

Answer: If you want to query the version of SQL Server that's currently running you can use the @@version
variable:

SELECT @@version

This returns the version, processor, build and service pack information for the currently installed SQL
Server. This information is stored in the system tables, and you can retrieve more details by calling the
extended stored procedure xp_msver. Be sure to call it from the master database.
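For example, a quick illustration (xp_msver lists the build, platform, and related details):

EXEC master..xp_msver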

Data Type

Question: How do I compare two values when one value(data type) is char and the other is an
integer?

Answer: For this you can use the CONVERT function. Since character data such as 'xyz' can't be converted
to a number, you'll have to convert the integer column to character. Assuming you have a table called
TestTable with two columns:

IntCol int
CharCol char(10)

with these values

IntCol CharCol
------ -------
1      1
2      2
3      xyz
4      5

You can use this query to SELECT rows based on a comparison:

SELECT * FROM TestTable WHERE CONVERT(char(10), IntCol) = CharCol

IntCol      CharCol
----------- ----------
1           1
2           2


E-mail in a Query

Question: How do I send an e-mail to myself if a field reads "SEND_MAIL"?
For example, a user wants more info on a product, so he clicks on the Send Mail button, which in turn
throws an entry into the database. This entry is SEND_MAIL. Now at the end of the day I would like the
server to scan the database for the word SEND_MAIL and, when it finds one, send me an email with the
name and email address of the person.

Answer: To answer this question I took the scenario of a Web site that logs user registrations and puts a
visitor's first name, last name and e-mail address in a table. I then created a stored procedure that selects
this data and e-mails it to an e-mail address provided as an input parameter. If the SendMail column in the
table is Y, then the details will be e-mailed. The stored procedure is called like this:

usp_CheckForMail 'name@emailaddress.com'

The script to create the table is as follows:

if exists (select * from sysobjects where id = object_id(N'[dbo].[TestTable]') and OBJECTPROPERTY(id,
N'IsUserTable') = 1)
drop table [dbo].[TestTable]
GO

CREATE TABLE [dbo].[TestTable] (
        [ID] [int] IDENTITY (1, 1) NOT NULL ,
        [Fname] [varchar] (50) NULL ,
        [Lname] [varchar] (50) NULL ,
        [EmailAddress] [varchar] (100) NULL ,
        [SendMail] [char] (1) NULL
) ON [PRIMARY]
GO

The script to create the stored procedure to send the e-mails is:

if exists (select * from sysobjects where id = object_id(N'[dbo].[usp_CheckForMail]') and
OBJECTPROPERTY(id, N'IsProcedure') = 1)
drop procedure [dbo].[usp_CheckForMail]
GO

SET QUOTED_IDENTIFIER ON
SET ANSI_NULLS ON
GO

CREATE PROCEDURE usp_CheckForMail
   @SendToEmail varchar(255)
AS
   DECLARE @ID int, @Fname varchar(50), @Lname varchar(50), @EmailAddress varchar(100),
           @MsgText varchar(255)

   DECLARE WebVisitors CURSOR FOR
      SELECT ID, Lname, Fname, EmailAddress
      FROM TestTable
      WHERE SendMail = 'Y'
      ORDER BY Lname, Fname

   OPEN WebVisitors

   FETCH NEXT FROM WebVisitors
      INTO @ID, @Lname, @Fname, @EmailAddress

   EXEC master.dbo.xp_startmail /* typically mail will be started on your server */

   -- Check @@FETCH_STATUS to see if there are any more rows to fetch.
   WHILE @@FETCH_STATUS = 0
   BEGIN
      -- PRINT 'Visitor: ' + @Fname + ' ' + @Lname + ' at ' + @EmailAddress + ' [ID=' + CAST(@ID AS varchar) + ']'

      SET @MsgText = 'Visitor: ' + @Fname + ' ' + @Lname + ' at ' + @EmailAddress +
                     ' [ID=' + CAST(@ID AS varchar) + ']'
      EXEC master.dbo.xp_sendmail @recipients = @SendToEmail, @message = @MsgText,
                                  @subject = 'New Web Visitor'

      FETCH NEXT FROM WebVisitors
         INTO @ID, @Lname, @Fname, @EmailAddress
   END

   CLOSE WebVisitors
   DEALLOCATE WebVisitors

GO

SET QUOTED_IDENTIFIER OFF
SET ANSI_NULLS ON
GO

Pros and Cons of Extended Stored Procedures

Question: What are the pros and cons of creating extended stored procedures?

Answer: The pros are that you can implement additional functionality and access data from DLLs from within
SQL Server. If you need to do something that can be done only in C or C++, or if you have data that can be
accessed only outside of SQL Server, you can still provide a link to it. The biggest con to extended stored
procedures is that they run in the same process space as SQL Server. So an errant DLL could overwrite
memory and cause SQL Server to crash or even corrupt data. The biggest safeguard against these
problems is thorough testing of the procedure.

Percent Sign in SQL

Question: I have a table in which the key field has a value stored with a percent sign, like '1234%'.
Using this value, I want to select from another table that can have values like '1234567', '1234678'
and '1234098'. How do I go about it?

Answer: The percent sign (%) is a wildcard in SQL Server; it matches any string of zero or more characters
and can be used at the beginning or end of a pattern. So the following syntax will return all of the records
you mentioned:

SELECT * FROM TestTable WHERE Col LIKE '1234%'

Col
-------
1234567
1234678
1234098
If you want to do an exact match for '1234' without the percent sign, then you'll have to trim off the last
character, like this:

SELECT * FROM TestTable WHERE Col LIKE LEFT('1234%', (LEN('1234%')-1))

Delete Duplicate Rows with a Single SQL Statement

Question: Is it possible to delete duplicate rows in a table without using a temporary table (i.e., just
do it with a single SQL statement)?

Answer: All you need to do is compare the table to itself to find out which candidates are duplicates. Do this
by assigning aliases to the table so you can use it twice, once as A and again as B, like this:

delete
  from jobs
where
  job_desc in
  (
     select
          a.job_desc
     from
        jobs a,
        jobs b
     where
        a.job_desc = b.job_desc
     group by
        a.job_desc
     having
         count(a.job_desc) >1
  )

When you do this you'll get a count based on the column value you think is duplicated. I used the description
(job_desc) because the IDs will be different, so the description is the candidate for repetition. Join the
table to itself on that candidate to find matches of it. Everything will match to itself at least once; that's why
you group by the thing you think is a duplicate. Applying the HAVING clause to it squeezes out all the "ones"
or singletons, leaving only the rows that have counts of more than one; in other words, your duplicate
rows.

By the way, this code trashes all the records that are duplicates. If you want to save one, add a comparison
for the IDs to be different in the WHERE clause.

Joining Queries from Oracle and SQL Server Databases

Question: I need a query that retrieves info from an Oracle table and a query that retrieves info from
a SQL Server table. The info has to be joined together according to Record ID numbers. I have very
limited access to the Oracle database but full control of the SQL Server database. How do I join two
different queries from two different databases?

Answer: To query two different data sources, you can make the Oracle server a linked server to the SQL
Server. A linked server can be any OLE DB data source, and SQL Server currently supports the OLE
DB data provider for Oracle. You can add a linked server by calling sp_AddLinkedServer and query
information about linked servers with sp_LinkedServers.
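A minimal sketch of adding the Oracle linked server (the alias OracleSvr and the data source name are
illustrative; MSDAORA is the Microsoft OLE DB provider for Oracle):

EXEC sp_addlinkedserver
@server = 'OracleSvr',
@srvproduct = 'Oracle',
@provider = 'MSDAORA',
@datasrc = 'OracleDB'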

        An easier way to add a linked server is to use Enterprise Manager. Add the server through the
Linked Servers icon in the Security node. Once a server is linked, you can query it using a distributed query
(you have to specify the full name).

Here's an example of a distributed query (from the SQL Server Books Online) that queries the Employees
table in SQL Server and the Orders table from Oracle:

SELECT emp.EmployeeID, ord.OrderID, ord.Discount
FROM SQLServer1.Northwind.dbo.Employees AS emp,
OracleSvr.Catalog1.SchemaX.Orders AS ord
WHERE ord.EmployeeID = emp.EmployeeID
AND ord.Discount > 0

Use the OpenRowSet Function to Run a Query on a Remote SQL Server

You can use the OPENROWSET( ) function to run a query on a remote SQL server by using the following
syntax:

SELECT * FROM OPENROWSET('SQLOLEDB', 'remote_server_name'; 'sa'; 'password', 'SQL statement')

Here replace 'remote_server_name' with the name of the remote server on which you want to run the query.
If necessary, replace 'sa' and 'password' with the name and password of a SQL login ID you want to use to
log in to the remote server. Finally, replace 'SQL statement' with the SQL statement you want to run on the
remote server.

Check Whether a Global Temporary Table Exists in a SQL Database

Checking whether a table exists in a Microsoft SQL Server database is easy. You can use this query:

SELECT 'x'
FROM sysobjects
WHERE type = 'U' and NAME = 'mytable'

But this query will not work when searching for global temporary tables, because global temporary tables
are stored in tempdb. Use this syntax for the search:

DECLARE @temp_table VARCHAR(100)
SET @temp_table = '##my_temp_table'
IF NOT EXISTS (SELECT 'x'
               FROM tempdb..sysobjects
               WHERE type = 'U' and NAME = @temp_table)
   PRINT 'temp table ' + @temp_table + ' does not exist.'
ELSE
   PRINT 'temp table ' + @temp_table + ' exists.'
Note: You cannot search for local temporary tables (# prefix tables) in this way. This is because SQL Server
appends a unique number to the name you supply. For example, if you specified "#temp," the name in
sysobjects would be something like "#temp____1234."

See Who Is Blocking Your SQL Server

If you have ever monitored any blocking problems in SQL Server, you know that sp_who only shows you the
spid (SQL Server's internal Process ID) that is causing the blocking for each spid that is blocked. Often a
blocked spid is shown as causing blocking for another spid. To see the spid (or spids) that started the whole
mess off, execute the following SQL:

SELECT p.spid
     , convert(char(12), d.name) db_name
     , program_name
     , convert(char(12), l.name) login_name
     , convert(char(12), hostname) hostname
     , cmd
     , p.status
     , p.blocked
     , login_time
     , last_batch
     , p.spid
FROM master..sysprocesses p
JOIN master..sysdatabases d ON p.dbid = d.dbid
JOIN master..syslogins l ON p.suid = l.suid
WHERE p.blocked = 0
AND EXISTS ( SELECT 1
             FROM master..sysprocesses p2
             WHERE p2.blocked = p.spid )

We built this into our own version of sp_who, called sp_hywho. The full listing follows:

if exists (select * from sysobjects
   where id = object_id('dbo.sp_hywho')
   and sysstat & 0xf = 4)
   drop procedure dbo.sp_hywho

GO

Create Procedure sp_hywho
( @vcDBName sysname = NULL )
AS

SET NOCOUNT ON

IF EXISTS ( SELECT 1
  FROM master..sysprocesses p
  WHERE p.blocked = 0
  AND EXISTS ( SELECT 1
     FROM master..sysprocesses p2
     WHERE p2.blocked = p.spid ) )
BEGIN
  PRINT 'Blocking caused by:'
  PRINT ''
  SELECT p.spid
  ,convert(char(12), d.name) db_name
  , program_name
  , convert(char(12), l.name) login_name
  , convert(char(12), hostname) hostname
  , cmd
  , p.status
  , p.blocked
  , login_time
  , last_batch
  , p.spid
  FROM master..sysprocesses p
  JOIN master..sysdatabases d ON p.dbid = d.dbid
  JOIN master..syslogins l ON p.suid = l.suid
  WHERE p.blocked = 0
  AND EXISTS ( SELECT 1
     FROM master..sysprocesses p2
     WHERE p2.blocked = p.spid )
  AND (p.dbid = DB_ID( @vcDBName ) OR @vcDBName IS NULL)
  ORDER BY 2,IsNull(Ltrim(program_name),'ZZZZZZZZZ'),4,5
  PRINT ''
END

SELECT p.spid
,convert(char(12), d.name) db_name
, program_name
, convert(char(12), l.name) login_name
, convert(char(12), hostname) hostname
, cmd
, p.status
, p.blocked
, login_time
, last_batch
, p.spid
FROM master..sysprocesses p
JOIN master..sysdatabases d ON p.dbid = d.dbid
JOIN master..syslogins l ON p.suid = l.suid
WHERE (p.dbid = DB_ID( @vcDBName ) OR @vcDBName IS NULL)
ORDER BY 2,IsNull(Ltrim(program_name),'ZZZZZZZZZ'),4,5


SET NOCOUNT OFF

GO

if exists (select * from sysobjects
   where id = object_id('dbo.sp_hywho')
   and sysstat & 0xf = 4)
   GRANT EXEC ON dbo.sp_hywho TO PUBLIC
GO


Getting the last identity value used

Many times, you'll want to know the last identity (key) value that was used in an insert. The biggest reason
for this is so that the same value can be reused when inserting a foreign key. This is done differently
between SQL Server and DB2. In DB2, the identity value can be picked up by the application and reused
using the IDENTITY_VAL_LOCAL() function, which returns the most recently assigned value for an identity
column. In SQL Server, the last identity value used in an insert can be retrieved with the @@identity function.
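A minimal sketch in SQL Server (the Orders and OrderDetails tables are hypothetical):

DECLARE @OrderID int

INSERT INTO Orders (CustomerName) VALUES ('Smith')
-- capture the identity value generated by the insert above
SELECT @OrderID = @@identity
-- reuse it as the foreign key in the child table
INSERT INTO OrderDetails (OrderID, ItemDesc) VALUES (@OrderID, 'Widget')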

Except Operator

To find rows in one set that do not exist in another set, use the EXCEPT operator (as defined in SQL-92 and
SQL-99). For example, here's how you find column1 from Table1 that does not exist in column2 of Table2:

Select column1 from Table1 Except Select column2 from Table2;

The EXCEPT operator will remove duplicates, and a single null value will be returned in the case of multiple
null values. To return duplicates, use EXCEPT ALL. Keep in mind, of course, that other proprietary
implementations (such as MINUS in Oracle) exist.

Understanding the different SQL Server 7.0 versions

To date, Microsoft has released six different versions of SQL Server 7.0: the Desktop, Standard, and
Enterprise Editions, as well as the Developer, Microsoft Data Engine (MSDE), and Small Business Server
Editions. While all of these versions are SQL Server 7.0, there are some key differences.

You can run the Desktop Edition on Windows NT Workstation 4.0, Windows NT Server 4.0, and Windows
9x. The Desktop Edition doesn't support the Microsoft Search Service, OLAP Services, parallel queries, or
transactional replication--and it can't be bought on its own. Instead, you must buy either the Standard or the
Enterprise Edition to get the Desktop Edition.

You can run the Standard Edition of SQL Server only on Windows NT Server 4.0 (or later). This version
does support such features as the Microsoft Search Service, OLAP Services, parallel queries, and
transactional replication. It also supports up to 4 CPUs and 2 GB of RAM. In contrast, the Enterprise Edition
runs only on Windows NT Server 4.0 Enterprise Edition and supports Microsoft Cluster Server. It also
supports all of the features supported in the Standard Edition--plus up to 32 CPUs and more than 2 GB of
RAM.

The Developer Edition of SQL Server is included with Visual Studio for developer use. This version supports
a limited number of connections but does include debugging tools. The Microsoft Data Engine (MSDE) is
simply a run-time engine that's included as part of Microsoft Office 2000. Although MSDE includes some of
the management utilities, it doesn't include all of them. MSDE was designed for you to distribute as part of
an application, not as a stand-alone product.

Finally, the Small Business Server Edition is part of Microsoft's Small Business Server. This version of SQL
Server is essentially the same as the Standard Edition but comes hard-coded with a limit of 100 users and a
maximum database size of 10 GB.

Display Amount of Disk Activity Generated by Transact-SQL Statements

You can set SQL Server to display information regarding the amount of disk activity generated by T-SQL
statements. This option displays the number of scans, the number of logical reads (pages accessed), and
the number of physical reads (disk accesses) for each table referenced in the statement. This option also
displays the number of pages written for each statement. When STATISTICS IO is ON, statistical
information is displayed. When OFF, the information is not displayed. After this option is set ON, all
subsequent T-SQL statements return the statistical information until the option is set to OFF.
Here is the syntax:

SET STATISTICS IO {ON | OFF}
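A quick illustration (any query works; the pubs sample database is used here):

SET STATISTICS IO ON

SELECT * FROM pubs..authors
-- the messages output now reports scan count, logical reads, and physical reads for authors

SET STATISTICS IO OFF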

Derived Tables

Use derived tables wherever possible, as they often perform better. Consider the following query to find the
second highest salary from the Employees table:

SELECT MIN(Salary)
FROM Employees
WHERE EmpID IN
(
SELECT TOP 2 EmpID
FROM Employees
ORDER BY Salary Desc
)

The same query can be re-written using a derived table, as shown below; in this test it performed roughly
twice as fast as the above query:

SELECT MIN(Salary)
FROM
(
SELECT TOP 2 Salary
FROM Employees
ORDER BY Salary DESC
) AS A

This is just an example, and your results might differ in different scenarios depending on the database
design, indexes, volume of data, etc. So, test all the possible ways a query could be written and go with the
most efficient one.

Benefits of Derived Tables

The biggest benefit of using derived tables over using temporary tables is that they require fewer steps, and
everything happens in memory instead of a combination of memory and disk. The fewer the steps involved,
along with less I/O, the faster the performance.
Here are the steps when you use a temporary table:
    1) Lock the tempdb database
    2) CREATE the temporary table (write activity)
    3) SELECT data & INSERT data (read & write activity)
    4) SELECT data from temporary table and permanent table(s) (read activity)
    5) DROP TABLE (write activity)
    6) Release the locks

Compare the above to the number of steps it takes for a derived table:
   1) CREATE locks, unless isolation level of "read uncommitted" is used
   2) SELECT data (read activity)
   3) Release the locks

Using derived tables instead of temporary tables reduces disk I/O and can boost performance.


Q. How Can I Generate a Series of Random Integers With T-SQL?

A. Here is a SQL Server stored procedure that generates a series of random integers and returns them in a
result set.

Inputs are:
@begin_no - lowest number to be generated
@end_no - highest number to be generated

Calling syntax:
Exec sp_GenerateRandomNumberResultSet Begin, End

CREATE procedure sp_GenerateRandomNumberResultSet
@begin_no int = 1, @end_no int = 100 As

Set nocount on

Declare @rn int, @rec int, @cnt int
Set @cnt = @end_no - @begin_no + 1

Create Table #t (RecNo int identity(1,1), RandNo int null)

-- seed the worktable with one empty row per number to be generated
Set @rec = 1
While @rec <= @cnt
Begin
    Insert Into #t (RandNo) Values (null)
    Set @rec = @rec + 1
End

-- fill each row with a random value, retrying until the value is unique
Set @rec = 1
While @rec <= @cnt
Begin
    Set @rn = rand() * @cnt + @begin_no
    If @begin_no < 0 Set @rn = @rn - 1
    Update #t Set RandNo = @rn
    Where RecNo = @rec
    And @rn Not In (Select RandNo From #t Where RandNo Is Not Null)
    If @@rowcount > 0 Set @rec = @rec + 1
End

Set nocount off

Select RandNo From #t

Drop Table #t
Go

Transact-SQL SQL Server Performance Tuning Tips
1.   Slow queries can be caused by a wide variety of reasons. Some include:
        Lack of useful indexes
        Lack of useful I/O striping
        Out-of-date statistics or lack of useful statistics
        Lack of physical memory
        Slow network connection
        Transact-SQL queries transferring large amounts of data from the server to the client
        Blocked locks or deadlocks
        Performing OLTP and OLAP queries on the same physical server
        Poorly designed query
     Any one or more of the above items can cause slow query performance. Each of these problem areas,
     and more, are discussed in detail on this page and other pages on this website. [6.5, 7.0, 2000]
     Updated 9-22-2000

2.   This tip may sound obvious to most of you, but I have seen professional developers, in two major SQL
     Server-based applications used worldwide, not follow it. And that is to always include a WHERE
     clause in your SELECT statement to narrow the number of rows returned. If you don't use a WHERE
     clause, then SQL Server will perform a table scan of your table and return all of the rows. In some cases
     you may want to return all rows, and not using a WHERE clause is appropriate. But if you
     don't need all the rows returned, use a WHERE clause to limit the number of rows returned. [6.5, 7.0,
     2000] Added 12-8-2000

3.   Don't be afraid to make liberal use of in-line and block comments in your Transact-SQL code; they
     will not affect the performance of your application. [6.5, 7.0, 2000] Added 10-4-2000

4.   If possible, avoid using SQL Server cursors. They generally use a lot of SQL Server resources and
     reduce the performance and scalability of your applications. If you need to perform row-by-row
     operations, try to find another method to perform the task. Some options are to perform the task at the
     client, use tempdb tables, use derived tables, use a correlated sub-query, or use the CASE statement.
     More often than not, there are non-cursor techniques that can be used to perform the same tasks as a
     SQL Server cursor. [6.5, 7.0, 2000] Updated 5-18-2001

5.   To help identify long running queries, use the SQL Server Profiler Create Trace Wizard to run the
     "TSQL By Duration" trace. You can specify the length of the long running queries you are trying to
     identify (such as over 1000 milliseconds), and then have these recorded in a log for you to investigate
     later. [7.0, 2000]

6.   When using the UNION statement, keep in mind that by default it performs the equivalent of a
     SELECT DISTINCT on the results set. If you know that there will not be any duplication of rows created
     as a result of using the UNION statement, then use the UNION ALL statement instead. This variation of
     the statement does not look for duplicate rows and runs much faster than the UNION, which does look
     for duplicate rows, whether or not there are any. [6.5, 7.0, 2000]

7.   Carefully evaluate whether your query needs the DISTINCT clause or not. The DISTINCT clause
     slows down virtually every query it is in. Some developers automatically add this clause to every one of
     their SELECT statements, even when it is not necessary. This is a bad habit that should be stopped. In
     addition, keep in mind that in some cases, duplicate results in a query are not a problem. If this is the
     case, then don't use a DISTINCT clause. [6.5, 7.0, 2000] Added 8-29-2000

8.   In your queries, don't return column data you don't need. For example, you should not use SELECT
     * to return all the columns from a table if you don't need all the data from each column. [6.5, 7.0, 2000]

9.   In your queries, don't return rows you don't need. Use the WHERE clause to return only those rows
     you need right now. [6.5, 7.0, 2000]

10. If your users perform many ad hoc queries on your SQL Server data, and you find that many of
    these "poorly-written" queries take up an excessive amount of SQL Server resources, consider using
    the "query governor cost limit" configuration option to limit how long a query can run. You can specify
    the maximum amount of "seconds" a query will run, and whenever the query optimizer determines that
    a particular query will exceed the maximum limit, the query will be aborted before it even begins.
    Although the value you set for this setting is stated as "seconds", it does not mean seconds like we
    think of seconds. Instead, it relates to the actual estimated cost of the query as calculated by the query
    optimizer. You may have to experiment with this value until you find one that meets your needs.
    There are two ways to set this option. First, you can change it at the server level (all queries running on
    the server are affected by it) using sp_configure "query governor cost limit", or you can set it at the
    connection level (only this connection is affected) by using the SET
    QUERY_GOVERNOR_COST_LIMIT command. [7.0, 2000] Updated 6-5-2001
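    For example (a hedged sketch; the cost value 300 is arbitrary). At the server level, which affects all
    queries on the server:

    EXEC sp_configure 'show advanced options', 1
    RECONFIGURE
    EXEC sp_configure 'query governor cost limit', 300
    RECONFIGURE

    At the connection level, which affects only the current connection:

    SET QUERY_GOVERNOR_COST_LIMIT 300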

11. If your application allows users to run queries, but you are unable in your application to easily prevent
    users from returning hundreds, even thousands of unnecessary rows of data they don't need, consider
    using the TOP operator within the SELECT statement. This way, you can limit how many rows are
    returned, even if the user doesn't enter any criteria to help reduce the number of rows returned to the
    client. For example, the statement:
    SELECT TOP 100 fname, lname FROM customers
    WHERE state = 'mo'
    limits the results to the first 100 rows returned, even if 10,000 rows actually meet the criteria of the
    WHERE clause. When the specified number of rows is reached, all processing on the query stops,
    potentially saving SQL Server overhead, and boosting performance.
    The TOP operator works by specifying a specific number of rows, like the example above, or by
    specifying a percentage value, like this:
    SELECT TOP 10 PERCENT fname, lname FROM customers
    WHERE state = 'mo'
    In the above example, only 10 percent of the available rows would be returned. [7.0, 2000] Updated 12-
    27-2001

12. You may have heard of a SET command called SET ROWCOUNT. Like the TOP operator, it is
    designed to limit how many rows are returned from a SELECT statement. In effect, the SET
    ROWCOUNT and the TOP operator perform the same function. While in some cases using either
    option works equally efficiently, there are some instances (such as rows returned from an unsorted
    heap) where the TOP operator is more efficient than using SET ROWCOUNT. Because of this, using
    the TOP operator is preferable to using SET ROWCOUNT to limit the number of rows returned by a
    query. [7.0, 2000] Added 12-27-2001

13. Try to avoid WHERE clauses that are non-sargable. Non-sargable search arguments in the WHERE
    clause, such as "IS NULL", "OR", "<>", "!=", "!>", "!<", "NOT", "NOT EXISTS", "NOT IN", "NOT LIKE",
    and "LIKE %500" can prevent the query optimizer from using an index to perform a search. In addition,
    expressions that include a function on a column, or expressions that have the same column on both
    sides of the operator, are not sargable.
    But not every WHERE clause that has a non-sargable expression in it is doomed to a table scan. If the
    WHERE clause includes both sargable and non-sargable clauses, then at least the sargable clauses
    can use an index (if one exists) to help access the data quickly. [6.5, 7.0, 2000]

14. If many queries are run off very large tables, consider breaking down the tables into one or more
    logical subsets using views. This works best for data that is easily partitioned, such as data based on
    date or location. This way, users can query off the appropriate view instead of the base table, saving a
    lot of I/O overhead. [6.5, 7.0, 2000]
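    A minimal sketch of such a view (the Sales table and region column are hypothetical):

    CREATE VIEW SalesWest
    AS
    SELECT * FROM Sales WHERE region = 'West'

    Users in the western region then query SalesWest instead of the large base table.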

15. If your application needs to retrieve summary data often, but you don't want to have the overhead of
    calculating it on the fly when it is needed, consider using a trigger that updates summary values after
    the initial transaction completes. While the trigger has some overhead, overall it may be less than
    having to calculate the data every time the summary data is needed. You will have to decide which
    method is fastest in your environment. [6.5, 7.0, 2000]

16. When you have a choice of using a constraint or a trigger to perform the same task, always choose
    the constraint. The same goes if you have the option of using either a constraint or a rule, or a
    constraint or a default. Constraints require less overhead than triggers, rules, and defaults. [6.5, 7.0,
    2000]

17. Don't implement redundant integrity features in your database. For example, if you are using
    primary key and foreign key constraints to enforce referential integrity, don't add unnecessary overhead
    by also adding a trigger that performs the same function. The same goes for using both constraints and
    defaults or constraints and rules that perform redundant work. While this may sound obvious, it is not
    uncommon to find in SQL Server databases. [6.5, 7.0, 2000]

18. If you have the choice of using a join or a subquery to perform the same task, generally the join is
    faster. But this is not always the case; you may want to test the query using both methods to
    determine which is faster for your particular application. [6.5, 7.0, 2000]

19. If your application needs to insert a large binary value into an image data column, perform this task
    using a stored procedure, not using an INSERT statement embedded in your application. The reason
    for this is because the application must first convert the binary value into a character string (which
    doubles its size, thus increasing network traffic and taking more time) before it can be sent to the
    server. And when the server receives the character string, it then has to convert it back to the binary
    format (taking even more time). Using a stored procedure avoids all this. [6.5, 7.0, 2000]

20. If you need to create a primary key (using a value meaningless to the record, other than providing a
    unique value for a record), many developers will use either an identity field (with an integer data type) or
    a uniqueidentifier data type. If your application is not sensitive to either option, then you will most likely
    want to choose the identity field over the uniqueidentifier field. The reason for this is that the identity
    field (using the integer data type) only takes up 4 bytes, while the uniqueidentifier field takes 16 bytes.
    Using the identifier field will create a smaller and faster index. [7.0, 2000] Added 8-5-2000

21. When you have a choice of using the IN or the EXISTS clause in your Transact-SQL, you will
    generally want to use the EXISTS clause, as it is more efficient and performs faster. [6.5, 7.0, 2000]
    Added 8-7-2000
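    For example, here are the two forms, using the customer table from the next tip plus a hypothetical
    orders table:

    -- IN: the subquery's full list of values is built first
    SELECT customer_name
    FROM customer
    WHERE customer_number IN (SELECT customer_number FROM orders)

    -- EXISTS: SQL Server can stop probing each customer as soon as one match is found
    SELECT customer_name
    FROM customer
    WHERE EXISTS (SELECT * FROM orders
                  WHERE orders.customer_number = customer.customer_number)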

22. When you have a choice of using the IN or the BETWEEN clauses in your Transact-SQL, you will
    generally want to use the BETWEEN clause, as it is much more efficient. For example:
    SELECT customer_number, customer_name
    FROM customer
    WHERE customer_number in (1000, 1001, 1002, 1003, 1004)
    is much less efficient than this:
    SELECT customer_number, customer_name
    FROM customer
    WHERE customer_number BETWEEN 1000 and 1004
    Assuming there is a useful index on customer_number, the Query Optimizer can locate a range of
    numbers much faster (using BETWEEN) than it can find a series of numbers using the IN clause (which
    is really just another form of the OR clause). [6.5, 7.0, 2000] Added 12-8-2000

23. If possible, try to avoid using the SUBSTRING function in your WHERE clauses. Depending on
    how it is constructed, using the SUBSTRING function can force a table scan instead of allowing the
    optimizer to use an index (assuming there is one). If the substring you are searching for does not
    include the first character of the column you are searching, then a table scan is performed. [6.5, 7.0,
    2000] Added 8-14-2000
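    Where the search does start at the first character, you can often replace SUBSTRING with a sargable
    LIKE. A minimal sketch, using a hypothetical customer_name column:

    -- Non-sargable: SUBSTRING on the column forces a scan
    SELECT customer_number FROM customer
    WHERE SUBSTRING(customer_name, 1, 2) = 'AB'

    -- Sargable: a leading LIKE pattern can use an index on customer_name
    SELECT customer_number FROM customer
    WHERE customer_name LIKE 'AB%'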

24. Where possible, avoid string concatenation, as it is not a fast process. [6.5, 7.0, 2000] Added 8-15-
    2000

25. If possible, try to avoid using data conversion functions in the WHERE clause. If you do, you may
    be forcing the optimizer to perform a table scan rather than using an applicable index. [6.5, 7.0, 2000]
    Added 8-14-2000
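    For example, keep the column bare and convert the other side of the operator instead. A minimal
    sketch, assuming a hypothetical orders table with a datetime order_date column:

    -- Non-sargable: the conversion is applied to the column
    SELECT order_id FROM orders
    WHERE CONVERT(varchar(10), order_date, 112) = '20010315'

    -- Sargable: the column is left alone and compared to a range
    SELECT order_id FROM orders
    WHERE order_date >= '20010315' AND order_date < '20010316'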

26. If your application requires you to create temporary tables for global or per-connection use,
    consider the possibility of creating indexes for these temporary tables. While most temporary tables
    probably won't need, or can't even use, an index, some larger temporary tables can benefit from them.
    A properly designed index on a temporary table can be as great a benefit as a properly designed index
    on a standard database table. [6.5, 7.0, 2000] Added 8-14-2000
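    A minimal sketch (the table and column names are hypothetical):

    CREATE TABLE #big_temp (
        customer_number int,
        order_total money
    )
    -- ...populate #big_temp with a large number of rows, then:
    CREATE INDEX ix_big_temp_customer ON #big_temp (customer_number)
    -- Queries that filter on customer_number can now use the index
    SELECT customer_number, SUM(order_total)
    FROM #big_temp
    WHERE customer_number BETWEEN 1000 AND 1004
    GROUP BY customer_number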

27. Generally, you only want to encapsulate your Transact-SQL code in a transaction if it is going to
    modify the database. Transactions help to ensure database consistency and are an important tool in
    the developer's toolbox. But if you put non-database modifying Transact-SQL in a transaction, you are
    producing unnecessary overhead for your application and SQL Server. For example, Transact-SQL
    used for creating reports doesn't usually need the benefits (and the overhead) associated with
    transactions. [6.5, 7.0, 2000] Added 8-14-2000

28. Both the MIN() and MAX() functions can take advantage of indexes on columns. So if you perform
    these functions often, you might want to add an index to the relevant columns, assuming they don't
    already exist. [6.5, 7.0, 2000] Added 8-14-2000

29. Generally, avoid using optimizer hints in your queries. This is because it is generally very hard to
    outguess the Query Optimizer. Optimizer hints are special keywords that you include with your query to
    force how the Query Optimizer runs. If you decide to include a hint in a query, this forces the Query
    Optimizer to become static, preventing the Query Optimizer from dynamically adapting to the current
    environment for the given query. More often than not, this hurts, not helps performance.
    If you think that a hint might be necessary to optimize your query, be sure you first do all of the
    following:
         •   Update the statistics on the relevant tables.
         •   If the problem query is inside a stored procedure, recompile it.
         •   Review the search arguments to see if they are sargable, and if not, try to rewrite them so that
             they are sargable.
         •   Review the current indexes, and make changes if necessary.
    If you have done all of the above, and the query is not running as you expect, then you may want to
    consider using an appropriate optimizer hint.
    If you haven't heeded my advice and have decided to use some hints, keep in mind that as your data
    changes, and as the Query Optimizer changes (through service packs and new releases of SQL
    Server), your hard-coded hints may no longer offer the benefits they once did. So if you use hints, you
    need to periodically review them to see if they are still performing as expected. [6.5, 7.0, 2000] Updated
    3-6-2001

30. If you want to boost the performance of a query that includes an AND operator in the WHERE
    clause, consider the following:
        •   Of the search criteria in the WHERE clause, at least one of them should be based on a highly
            selective column that has an index.
        •   If at least one of the search criteria in the WHERE clause is not highly selective, consider
            adding indexes to all of the columns referenced in the WHERE clause.
    [7.0, 2000] Added 9-11-2000

31. While views are often convenient to use, especially for restricting users from seeing data they
    should not see, they aren't good for performance. So if database performance is your goal, avoid
    using views (SQL Server 2000 Indexed Views are another story).
    Here's why. When the Query Optimizer gets a request to run a view, it runs it just as if you had run the
    view's SELECT statement from the Query Analyzer. In fact, a view runs slightly slower than the same
    SELECT statement run from the Query Analyzer--but you probably would not notice the difference--
    because of the additional overhead caused by the view. Unlike stored procedures, views offer no pre-
    optimization.
    Instead of embedding SELECT statements in a view, put them in a stored procedure for optimum
    performance. Not only do you get the added performance boost, you can also use the stored
    procedure to restrict user access to table columns they should not see. [6.5, 7.0, 2000] Added 5-7-2001

32. Try to avoid nesting views (referring to a view from within a view). While this is not prohibited, it
    makes it more difficult to identify the source of any performance problems. A better idea is to create
    separate views instead of nesting them. [6.5, 7.0, 2000] Added 10-9-2000

33. Don't use DISTINCT or ORDER BY in your SELECT statements unless you really need them. Both
    options can add a lot of additional overhead to your query, and they aren't always needed for your
    application. [6.5, 7.0, 2000] Added 11-27-2000

34. If your SELECT statement includes an IN option along with a list of values to be tested in the query,
    order the list of values so that the most frequently found values are placed at the front of the list, and
    the less frequently found values are placed at the end of the list. This can speed performance because
    the IN option returns true as soon as any of the values in the list produces a match. The sooner the
    match is made, the faster the query completes. [6.5, 7.0, 2000] Added 11-27-2000

35. If you need to use the SELECT INTO option, keep in mind that it can lock system tables, preventing
    other users from accessing the data they need. If you do need to use SELECT INTO, try to schedule it
    when your SQL Server is less busy, and try to keep the amount of data inserted to a minimum. [6.5, 7.0,
    2000] Added 11-28-2000

36. If your SELECT statement contains a HAVING clause, write your query so that the WHERE clause
    does most of the work (removing undesired rows) rather than leaving that work to the HAVING
    clause. Using the WHERE clause appropriately can eliminate unnecessary rows before they get to the
    GROUP BY and HAVING clauses, saving some unnecessary work, and boosting performance.
    For example, in a SELECT statement with WHERE, GROUP BY, and HAVING clauses, here's what
    happens. First, the WHERE clause is used to select the appropriate rows that need to be grouped.
    Next, the GROUP BY clause divides the rows into sets of grouped rows, and then aggregates their
    values. And last, the HAVING clause then eliminates undesired aggregated groups. If the WHERE
    clause is used to eliminate as many of the undesired rows as possible, this means the GROUP BY and
    the HAVING clauses will have less work to do, boosting the overall performance of the query. [6.5, 7.0,
    2000] Added 12-11-2000
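    A minimal sketch of the difference, using the customer table from earlier tips plus a hypothetical state
    column:

    -- Less efficient: every state is grouped, then most groups are discarded
    SELECT state, COUNT(*) FROM customer
    GROUP BY state
    HAVING state = 'CA'

    -- More efficient: the WHERE clause removes the rows before grouping
    SELECT state, COUNT(*) FROM customer
    WHERE state = 'CA'
    GROUP BY state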

37. If you need to write a SELECT statement to retrieve data from a single table, don't SELECT the data
    from a view that points to multiple tables. Instead, SELECT the data from the table directly, or from a
    view that only contains the table you are interested in. If you SELECT the data from the multi-table view,
    the query will experience unnecessary overhead, and performance will be hindered. [6.5, 7.0, 2000]
    Added 12-11-2000

38. If your application performs many wildcard (LIKE %) text searches on CHAR or VARCHAR
    columns, consider using SQL Server's full-text search option. The Search Service can significantly
    speed up wildcard searches of text stored in a database. [7.0, 2000] Updated 1-12-2001

39. The GROUP BY clause can be used with or without an aggregate function. But if you want optimum
    performance, don't use the GROUP BY clause without an aggregate function. This is because you
    can accomplish the same end result by using the DISTINCT option instead, and it is faster.
    For example, you could write your query two different ways:
    USE Northwind
    SELECT OrderID
    FROM [Order Details]
    WHERE UnitPrice > 10
    GROUP BY OrderID
    or
    USE Northwind
    SELECT DISTINCT OrderID
    FROM [Order Details]
    WHERE UnitPrice > 10
    Both of the above queries produce the same results, but the second one will use fewer resources and
    perform faster. [6.5, 7.0, 2000] Added 1-12-2001

40. Generally, it is better to perform multiple UPDATEs on records in one fell swoop (using one
    query), instead of running the UPDATE statement multiple times (using multiple queries).
    For example, you could accomplish the same goal two different ways:
    USE Northwind
    UPDATE Products
    SET UnitPrice = UnitPrice * 1.06
    WHERE UnitPrice > 5

    GO

    USE Northwind
    UPDATE Products
    SET UnitPrice = ROUND(UnitPrice, 2)
    WHERE UnitPrice > 5

    GO

    or

    USE Northwind
    UPDATE Products
    SET UnitPrice = ROUND(UnitPrice * 1.06, 2)
    WHERE UnitPrice > 5

    GO

    As is obvious from this example, the first option requires two queries to accomplish the same task as
    the second query. Running one query instead of two or more usually produces the best performance.
    [6.5, 7.0, 2000] Added 1-19-2001.

41. Sometimes perception is more important than reality. For example, which of the following two
    queries is faster:
        •   A query that takes 30 seconds to run, and then displays all of the required results.
        •   A query that takes 60 seconds to run, but displays the first screen full of records in less than 1
            second.
    Most DBAs would choose the first option, as it takes fewer server resources and performs faster. But
    from many users' point of view, the second one may be more palatable. By getting immediate
    feedback, the user gets the impression that the application is fast, even though in the background, it is
    not.
    If you run into situations where perception is more important than raw performance, consider using the
    FAST query hint. The FAST query hint is used with the SELECT statement using this form:
    OPTION(FAST number_of_rows)
    where number_of_rows is the number of rows that are to be displayed as fast as possible.
    When this hint is added to a SELECT statement, it tells the Query Optimizer to return the specified
    number of rows as fast as possible, without regard to how long it will take to perform the overall query.
    Before rolling out an application using this hint, I would suggest you test it thoroughly to see that it
    performs as you expect. You may find that the query takes about the same amount of time whether the
    hint is used or not. If this is the case, then don't use the hint. [7.0, 2000] Added 3-6-2001
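    A minimal sketch of the hint in use (the table and column names are hypothetical):

    SELECT customer_number, customer_name
    FROM customer
    ORDER BY customer_name
    OPTION (FAST 20)
    -- Asks the Query Optimizer to get the first 20 rows to the client as
    -- quickly as possible, even if the total query time ends up longer.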

42. Instead of using temporary tables, consider using a derived table. A derived table is the result of
    using a SELECT statement in the FROM clause of an existing SELECT statement. By using derived
    tables instead of temporary tables, we can reduce I/O and boost our application's performance. [7.0,
    2000] Added 3-9-2001
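    A minimal sketch, reusing the hypothetical customer and orders tables from earlier tips:

    -- Instead of SELECT ... INTO #order_counts and joining to it later,
    -- do the aggregation in a derived table within the same statement:
    SELECT c.customer_name, oc.order_count
    FROM customer c
    JOIN (SELECT customer_number, COUNT(*) AS order_count
          FROM orders
          GROUP BY customer_number) AS oc
    ON c.customer_number = oc.customer_number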

43. When using the WHILE statement, don't avoid the use of BREAK just because some people
    consider it bad programming form. Often when creating Transact-SQL code using the WHILE
    statement, you can avoid using BREAK by moving a few lines of code around. If this works in your
    case, then by all means don't use BREAK. But if your efforts to avoid using BREAK require you to add
    additional lines of code that make your code run slower, then don't do that. Sometimes, using BREAK
    can speed up the execution of your WHILE statements. [6.5, 7.0, 2000] Added 5-18-2001

44. Computed columns in SQL Server 2000 can be indexed if they meet all of the following criteria:
       •   The computed column's expression is deterministic.
       •   The ANSI_NULLS connection-level option was ON when the table was created.
       •   TEXT, NTEXT, or IMAGE data types are not used in the computed column.
       •   The physical connection used to create the index, and all connections used to INSERT, UPDATE,
           or DELETE rows in the table, must have these seven SET options properly configured:
           ANSI_NULLS = ON, ANSI_PADDING = ON, ANSI_WARNINGS = ON, ARITHABORT = ON,
           CONCAT_NULL_YIELDS_NULL = ON, QUOTED_IDENTIFIER = ON, NUMERIC_ROUNDABORT
           = OFF.
    [2000] Added 10-9-2000
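    A minimal sketch of an indexable computed column (the table is hypothetical, and the connection
    creating the index must have the SET options above configured as listed):

    CREATE TABLE order_detail (
        order_id   int   NOT NULL,
        unit_price money NOT NULL,
        quantity   int   NOT NULL,
        line_total AS (unit_price * quantity)  -- deterministic expression
    )
    CREATE INDEX ix_order_detail_line_total ON order_detail (line_total)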

45. One of the advantages of using SQL Server for two-tier and three-tier applications is that you
    can offload much (if not most) of the data processing work from the other tiers and place it on
    SQL Server. The more work you can perform within SQL Server, the fewer the network roundtrips that
    need to be made between the various tiers and SQL Server. And generally the fewer the network
    roundtrips, the more scalable and faster the application becomes.
    But in some applications, such as those that involve complex math, SQL Server has traditionally been
    weak. In these cases, complex math often could not be performed within SQL Server; instead, it had
    to be performed on another tier, causing more network roundtrips than desired.
    Now that SQL Server 2000 supports user-defined functions (UDFs), this is becoming less of a problem.
    UDFs allow developers to perform many complex math functions from within SQL Server, functions that
    previously could only be performed outside of SQL Server. By taking advantage of UDFs, more work
    can stay with SQL Server instead of being shuttled to another tier, reducing network roundtrips, and
    potentially boosting your application's performance.
    Obviously, boosting your application's performance is not as simple as moving math functions to SQL
    Server, but it is one more new feature of SQL Server 2000 that developers can take advantage of in
    order to boost their application's scalability and performance. [2000] Added 12-19-2000
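    As a minimal sketch, a compound-interest calculation that might previously have lived in a middle tier
    can be written as a UDF (the formula, table, and column names are illustrative only):

    CREATE FUNCTION dbo.fn_CompoundValue
        (@principal money, @rate float, @periods int)
    RETURNS money
    AS
    BEGIN
        -- The math runs on the server, saving a network roundtrip
        RETURN CONVERT(money, @principal * POWER(1 + @rate, @periods))
    END
    GO
    SELECT account_id, dbo.fn_CompoundValue(balance, 0.05, 12)
    FROM account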

46. When creating scalar user-defined functions, avoid, as a matter of routine, applying them to large
    result sets. This is because complex user-defined functions have the potential of involving high
    overhead, and there is no way to know for sure how they will affect performance when dealing with
    large result sets. Of course, if you know for sure that the user-defined function is not a performance
    problem, then using it is OK in larger result sets. You will want to test this assumption before the user-
    defined function is put into production. [2000] Added 10-9-2000

47. SQL Server 2000 offers a new data type called "table." Its main purpose is for the temporary storage of
    a set of rows. A variable, of type "table," behaves as if it is a local variable. And like local variables, it
    has a limited scope, which is within the batch, function, or stored procedure in which it was declared. In
    most cases, a table variable can be used like a normal table. SELECTs, INSERTs, UPDATEs, and
    DELETEs can all be made against a table variable.
    For best performance, if you need a temporary table in your Transact-SQL code, try to use a
    table variable instead of creating a conventional temporary table. Table variables are created and
    manipulated in memory instead of the tempdb database, making them much faster. In addition, table
    variables found in stored procedures result in fewer compilations (than when using temporary tables),
    and transactions using table variables only last as long as the duration of an update on the table
    variable, requiring less locking and logging resources. [2000] Added 8-7-2001
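    A minimal sketch of a table variable standing in for a temporary table (names are hypothetical):

    DECLARE @order_counts TABLE (
        customer_number int,
        order_count     int
    )
    INSERT INTO @order_counts (customer_number, order_count)
    SELECT customer_number, COUNT(*)
    FROM orders
    GROUP BY customer_number
    -- Within the same batch or procedure, use it like a normal table
    SELECT * FROM @order_counts WHERE order_count > 10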

48. Don't repeatedly use the same function to calculate the same result within your Transact-SQL
    code. For example, if you need to reuse the value of the length of a string over and over within your
    code, perform the LEN function once on the string, assign the result to a variable, and then use this
    variable, as needed, in your code. Don't recalculate the same value over and over again by calling
    the LEN function each time you need the value, as it wastes SQL Server resources and hurts
    performance. [6.5, 7.0, 2000] Added 8-9-2001
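    A minimal sketch:

    DECLARE @name varchar(50), @name_len int
    SET @name = 'Some reasonably long string value'
    SET @name_len = LEN(@name)  -- calculated once, reused below
    IF @name_len > 20 PRINT 'long string'
    IF @name_len > 40 PRINT 'very long string'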

49. Most of you are probably familiar with the aggregate SUM() function and how it works. Occasionally, it
    would be nice if SQL Server had a PRODUCT() function, which it does not. While SUM() is used to
    sum a group of data, the theoretical PRODUCT() function would find the product of a group of data.
    One way around the problem of there not being a PRODUCT() function in SQL Server is to use some
    combination of a cursor and/or temporary tables. As you can imagine, this would not be very efficient. A
    better choice would be to use a set-based approach, like the theoretical PRODUCT() function.
    With a little algebra, you can simulate a PRODUCT() function in SQL Server using the built-in SQL
    Server LOG10(), POWER(), and SUM() functions working together. This is because logarithms allow
    you to find the product of numbers by summing them. This was how the products of large numbers
    were found before the days of calculators. (Are you old enough to remember using logarithm tables in
    school? I am. Ouch!)
    Below is a very simple example of how you can use a combination of the LOG10(), POWER(), and
    SUM() functions in SQL Server to simulate a PRODUCT() function. You will probably want to modify it
    to meet your specific needs, such as to eliminate null data, zero data, or data that might be negative.
    SELECT column_name1, POWER(10,SUM(LOG10(column_name2))) AS Product
    FROM table_name
    GROUP BY column_name1
    For example, let's look at the following to see how this works.
    Record 1 (1000, 2)
    Record 2 (1000, 2)
    Record 3 (1000, 2)
    Record 4 (1001, 3)
    Record 5 (1001, 3)
    Our goal here is to find the product of all the records where column_name1 = 1000 and the product of
    all the records where column_name1 = 1001. When the above query is run, we get these results:
    1000, 8
    1001, 9
    What has happened is that where column_name1 = 1000 (which are the first three records in our
    sample data), the values in column_name2 (which are 2 and 2 and 2) are multiplied together to return 8.
    In addition, where column_name1 = 1001 (which are the last two records in our sample data), the
    values in column_name2 (which are 3 and 3) are multiplied together to return 9.
    Creating your own PRODUCT() function produces much faster results than trying to accomplish the
    same task by using a cursor and/or temporary tables. [6.5, 7.0, 2000] Added 10-11-2001

50. Many developers choose to use an identity column as their primary key. By design, an identity column
    does not guarantee that each newly created row will be consecutively numbered. This means there will
    most likely be occasional gaps in the identity column numbering scheme. For most applications,
    occasional gaps in the identity column present no problems.
    On the other hand, some developers don't like these occasional gaps and try to avoid them. With some
    clever use of INSTEAD OF triggers in SQL Server 2000, it is possible to prevent these numbering gaps.
    But at what cost?
    Trying to force an identity column to number consecutively without gaps can lead to locking and
    scalability problems, hurting performance. So the recommendation is not to try to get around the
    identity column's built-in way of working. If you do, expect performance problems. [2000]
    Added 10-17-2001

51. Avoid using variables in the WHERE clause of a query located in a batch script. Let's find out
    why this may not be a good idea.
    First, let's look at the following code:
    SELECT employee_id
    FROM employees
    WHERE age = 30 and service_years = 10
    Assuming that both the age and the service_years columns have indexes, and the table has many
    thousands of records, then SQL Server's Query Optimizer will select the indexes to perform the query
    and return results very quickly.
    Now, let's look at the same query, but written to be more generic, as you might find it in a generic
    batch script:
    DECLARE @age int
    SET @age = 30
    DECLARE @service_years int
    SET @service_years = 10
    SELECT employee_id
    FROM employees
    WHERE age = @age and service_years = @service_years
    When the above code is run, even though both the age and the service_years columns have indexes,
    they won't be used, and a table scan will be used instead, potentially greatly increasing the amount of
    time for the query to run.
    The reason the indexes are not used is that the Query Optimizer does not know the value of the
    variables when it selects an access method to perform the query. Because this is a batch, only one
    pass is made of the Transact-SQL code, preventing the Query Optimizer from knowing what it needs
    to know in order to select an access method that uses the indexes.
    If you cannot avoid using variables in the WHERE clauses of batch scripts, consider using an INDEX
    query hint to tell the Query Optimizer to use the available indexes instead of ignoring them and
    performing a table scan. This of course assumes that the indexes are highly selective. If the indexes
    are not highly selective, then a table scan will most likely be more efficient than using the available
    indexes. [6.5, 7.0, 2000] Added 12-7-2001
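    A minimal sketch of such a hint, reusing the batch above (the index name ix_employees_age is
    hypothetical):

    SELECT employee_id
    FROM employees WITH (INDEX (ix_employees_age))
    WHERE age = @age AND service_years = @service_years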




Performance Tuning Tips for Creating Visual Basic Applications Using
SQL Server

52. While ADO (and other VB object libraries) make database manipulation easy for the programmer, using
    these shortcuts can kill SQL Server performance. As a rule of thumb, encapsulate your DML (Data
    Manipulation Language) in stored procedures and run them from your VB application. This
    bypasses object library overhead (such as unnecessary cursors) and reduces the chatter between the
    VB application and SQL Server over the network.
    So what does this mean in practice? Essentially, avoid using the ADO recordset object to modify
    (INSERT, UPDATE, DELETE) data in your VB code. Instead, use Transact-SQL, encapsulated in
    stored procedures, to modify data in a SQL Server database. An ADO recordset should be used as a
    method of reading data, not modifying data. [6.5, 7.0, 2000] Updated 9-12-2001

53. When using an ADO recordset to return data from SQL Server, the most efficient way is to use
    what is often called a firehose cursor. The term is really a misnomer, because a firehose cursor is
    not a cursor at all; it is just a method to quickly move data from SQL Server to the client that
    requested it.
    Essentially, a firehose cursor sends the requested data (from the query) to an output buffer on SQL
    Server. Once the output buffer is full, it waits until the client can retrieve the data from the output buffer.
    Then the output buffer is filled again. This process repeats over and over until all of the data is sent to
    the client. Another advantage of this method is that records are only locked long enough to be moved to
    the output buffer.
    When you open an ADO Recordset using its default settings, a firehose cursor is used automatically.
    If you want to specify a firehose cursor manually, you can do so by using these property settings:
        •   CursorType = adForwardOnly
        •   CursorLocation = adUseServer
        •   LockType = adLockReadOnly
        •   CacheSize = 1
    When the client receives the data from the firehose cursor, the data should be read into a local data
    structure for local use by the client. [6.5, 7.0, 2000] Updated 9-12-2001

54. When accessing data on a SQL Server, write your VB code so as to minimize the number of
    round-trips between the application and SQL Server. Each and every time you use ADO to execute
    Transact-SQL code to get data from SQL Server, multiple, time-consuming steps have to occur. For
    example:
        •   Your VB code must generate a request to SQL Server in the form of a Transact-SQL statement.
        •   The statement is sent to the database through the Connection object.
        •   The request from the Connection object has to be translated into packets that can be sent over
            the network.
        •   The packets move over the network.
        •   When the packets arrive at SQL Server, they must be converted back into a form useable by
            SQL Server.
        •   SQL Server must then process the Transact-SQL statement. Assuming a stored procedure is not
            used, this code must be optimized and compiled, then executed.
        •   The results, in the form of TDS (Tabular Data Stream), are then translated into packets that can
            be sent over the network.
        •   The packets move over the network, again.
        •   When the packets arrive at the client, they must be converted back into TDS format.
        •   When ADO receives the TDS data, it is converted into a recordset, ready to be used by the
            application.
    If you know much about the technical details of networking, then you know that the above steps have
    been oversimplified. The point to remember is that round-trips between your application and SQL
    Server are expensive in time and resources, and you need to do your best in your code to minimize
    them. [6.5, 7.0, 2000] 7-19-01

55. One way to help reduce round-trips between your application and SQL Server is to retrieve the
    data the client needs from SQL Server in a single query, not in multiple queries. I have seen
    some applications that only retrieve one row at a time, making a round-trip for every row needed by the
    application. This can be very expensive in resources and it hurts performance. Of course, you can't
    always know ahead of time exactly which rows will be needed, but even if you guess and return too
    many rows, returning them in one round-trip is usually more efficient than retrieving only one row at a
    time. [6.5, 7.0, 2000] 7-19-01

56. When retrieving data from a SQL Server database, take full advantage of views when appropriate.
    This is especially true if you are not encapsulating your Transact-SQL in stored procedures as
    recommended. While calling a view is not usually as efficient as using a stored procedure to retrieve
    data, it is much more efficient than using embedded Transact-SQL in your ASP code or COM
    components. [6.5, 7.0, 2000]

57. Don't use DAO to access SQL Server; it is performance suicide. Also avoid ODBCDirect. Instead, use
    RDO or ADO. [6.5, 7.0, 2000]

58. When creating a connection using ADO, be sure you use the OLE DB provider for SQL Server, not the
    older ODBC driver for SQL Server or the OLE DB provider for ODBC. The parameter you will use in
    your connection string is "provider=sqloledb". The native OLE DB provider performs much more
    efficiently than the ODBC-based alternatives, providing better performance. [7.0, 2000] 7-19-2001

59. Use stored procedures instead of embedding Transact-SQL in your VB code. This significantly
    reduces network traffic and speeds up query execution. [6.5, 7.0, 2000]

60. If you are VB developer and need to access SQL Server data, but don't have the time or interest in
    learning how to write stored procedures, consider using the GetRows method of the RecordSet
    object. The GetRows method is used to pull all the records from the recordset into an array, which is
    much faster than using embedded Transact-SQL to download a RecordSet to your application. [6.5, 7.0,
    2000]

61. If possible in your application, use stored procedures to "batch" a set of related Transact-SQL
    statements together, instead of calling a separate stored procedure for every database task you want
    to perform. This helps to reduce network traffic and server overhead. [6.5, 7.0, 2000]

62. If you have a related group, or batch, of Transact-SQL statements you want to execute, but you don't
    want to use a stored procedure, as generally recommended for dealing with batches of Transact-SQL
    statements, one option you can use to boost performance in your VB code is to concatenate two or
    more separate Transact-SQL statements into a single batch and execute them as a single
    message. This is much more efficient than sending the Transact-SQL code to SQL Server as many
    different messages. [6.5, 7.0, 2000] Added 7-19-01

63. When SELECTing data from SQL Server to be returned to your application, limit the number of
    rows returned to only those that are needed now. If necessary, force the user to enter selection
    criteria to limit the results set. [6.5, 7.0, 2000]

64. If your application allows users to run queries, but you are unable in your application to easily prevent
    users from returning hundreds, even thousands, of rows of data they don't need, consider using the
    TOP operator within the query. This way, you can limit how many rows are returned, even if the user
    doesn't enter any criteria to help reduce the number of rows returned to the client. [6.5, 7.0,
    2000]

65. If your application needs to perform looping, try to put the loop inside a stored procedure so it can be
    executed on the server without having to make round trips between the client and server. [6.5, 7.0,
    2000]

66. When creating transactions in your application, don't create them using ADO's methods. Instead,
    encapsulate the transaction in a stored procedure so that it executes on the server. [6.5, 7.0, 2000]

67. Keep as much of the business logic of your application as possible off the client. In two-tier
    designs, put the business logic in stored procedures on the server. In n-tier designs, put the business
    logic in components on MTS. [6.5, 7.0, 2000]

68. If you have the need to filter or sort data on-the-fly at the client, let ADO do this for you at the
    client. When the data is first requested by the client from the server (ideally using a stored procedure),
    have all the data the client wants to "play" with sent to the client. Once the recordset is at the client,
    then ADO methods can be used to filter or sort the data. This helps to reduce network traffic and takes
    some of the load off of the server. [6.5, 7.0, 2000]

69. By default, the CacheSize property of the ADO Recordset object determines how many rows are
    fetched from a server-side cursor at a time. The default is one. This means each row of the recordset is
    returned one at a time from the server to the client. This is very inefficient. The CacheSize property
    needs to be set to a much higher figure, such as between 100 and 500, depending on the number of
    rows that are to be eventually returned from the server to the client. [6.5, 7.0, 2000]

70. When calling SQL Server stored procedures from the ADO Command object, don't use the
    Refresh method to identify the parameters of a stored procedure. This produces extra network traffic
    and slows performance. Instead, explicitly create the parameters yourself using ADO code. [7.0, 2000]

71. ADO allows you to create four different types of SQL Server cursors. Each has its own place, and you
    will want to choose the cursor that uses the least possible resources for the task at hand. When at
    all possible, attempt to use the Forward-Only cursor, which uses the least amount of overhead of the
    four cursor types. [6.5, 7.0, 2000]

72. Avoid using the MoveFirst method of the RecordSet object when using a Forward-Only cursor.
    In effect, when you use this method, it re-executes the entire query and repopulates the Forward-Only
    cursor, increasing server overhead. [6.5, 7.0, 2000] Added 9-12-2001

73. If you create COM objects to encapsulate database access, try to follow these two suggestions if you
    want optimum speed: 1) use in-process dlls; and 2) use early-binding. [6.5, 7.0, 2000]

74. Consider using ADO's ability to create disconnected recordsets to help reduce the load on SQL
    Server. [6.5, 7.0, 2000]

75. When storing your SQL Server data into VB variables, always use strongly typed variables. Avoid
    using the variant data type (which is not always possible), as it has greater overhead than the other
    data types. [6.5, 7.0, 2000]

76. If you create object variables in your VB code to refer to COM objects that hold SQL Server data,
    be sure to strongly type them. Avoid using the AS OBJECT keywords, instead, always explicitly specify
    the type of object you want to create. [6.5, 7.0, 2000]

77. When instantiating COM objects to hold SQL Server data, create them explicitly, not implicitly. [6.5,
    7.0, 2000]

78. If you will be calling the same stored procedure, view, or SQL statements over and over again in
    your code, don't create a new Command object each time. Instead, reuse the Command object. [6.5,
    7.0, 2000]

79. When looping through recordsets, be sure you bind columns to field objects before the looping
    begins. Don't use the Fields collection of the Recordset object to assign values for fields in a Recordset
    within each loop; it incurs much more overhead. [6.5, 7.0, 2000]

80. If you know that the results of a query from within a stored procedure you call will return only
    one row of data (and not an entire recordset), don't open an ADO Recordset for the purpose of
    retrieving the data. Instead, use a stored procedure output parameter. [6.5, 7.0, 2000]

81. If your application needs to insert a large binary value into an image data column, perform this task
    using a stored procedure, not an INSERT statement embedded in your application. The reason is that
    the application must first convert the binary value into a character string (which doubles its size, thus
    increasing network traffic and taking more time) before it can be sent to the server. And when the
    server receives the character string, it has to convert it back to binary format (taking even more time).
    Using a stored procedure avoids all this. [6.5, 7.0, 2000]

82. When ADO is used to open more than one ForwardOnly recordset on a single Connection object at
    a time, only the first recordset is opened using the Connection object you previously created. Additional
    new connections don't use the same Connection object. Instead, separate connections are created for
    each ForwardOnly recordset you create after the first. This occurs because SQL Server can only open
    one ForwardOnly cursor per connection. The more connections you create, the greater the stress on
    SQL Server and performance and scalability suffer.
    To avoid this problem, don't use a ForwardOnly recordset. Static, Keyset, and Dynamic recordsets don't
    have this problem. Another option is to use a client side cursor instead of SQL Server cursor. Or you
    can close each recordset before opening another on the same connection. [6.5, 7.0, 2000]

83. When making your connection to SQL Server, choose DSN-less connections for the fastest
    connection. Not only does it make database connections faster, it allows you to use the OLE DB
    provider, which is not available using a DSN-based connection. The OLE DB provider is the fastest
    provider you can use to access SQL Server.
    If you do need to use a DSN, select System DSNs over File DSNs, because they are faster when
    making connections. [7.0, 2000] Updated 7-27-2001

84. When creating a Connection object, always create it explicitly, not implicitly. Opening a
    Connection object explicitly consumes less resources than opening it implicitly, and it also allows you to
    more efficiently manage multiple connections and to reassign the various roles that the Connections
    objects perform within your application. [6.5, 7.0, 2000] Added 12-14-2000

85. When using recordsets, be sure to open them explicitly, not implicitly. When recordsets are
    opened implicitly, you cannot control the default cursor and lock types, which are, respectively, forward-
    only and read-only. If you always open your recordsets explicitly, then you can specify which cursor and
    lock types you want to invoke for this particular situation, specifying the types with the least amount of
    overhead to accomplish the task at hand. [6.5, 7.0, 2000] Added 12-14-2000

86. When using ADO to make connections to SQL Server, always be sure you explicitly close any
    Connection, Recordset, or Command objects you have opened. While letting an object go out of
    scope will in effect close the object, it is not the same as explicitly closing an object. By explicitly
    closing these objects and setting them to Nothing, you do two things. First, you remove the object
    sooner rather than later, helping to free up resources. Second, you eliminate the possibility of
    "connection creep". Connection creep occurs when connection or resource pooling is used and when
    connections are not properly closed and released from the pool. This defeats the purpose of pooling
    and reduces SQL Server's performance. [6.5, 7.0, 2000]

87. If you are connecting to SQL Server via either OLE DB (version 2.0 or higher) or ODBC (version 3.0 or
    higher), SQL Server connection pooling is automatically implemented for you. Because of this, you
    don't have to write special code to implement connection pooling yourself. In addition, you don't even
    need to reuse an ADO Connection object, which is commonly done by many VB developers.
    If you want to take the best advantage of database connection pooling, and optimize your VB
    application's SQL Server data access, the best advice you can receive is to be sure that you only open
    a database connection just before you need it, and then close it immediately after you are done with it.
    Don't leave database connections open if you are not using them.
    When you create or tear down a database connection in your VB code, you aren't really creating a new
    connection or tearing down a current connection. What is happening is that your connection requests
    are sent to OLE DB or ODBC, and they determine if a connection needs to be created or torn down. If
    a new connection is needed, then one is created, or one is reused from the current connection pool.
    And if you request that a connection be torn down, it will actually pool the unused connection until it is
    needed, or tear it down if it is not reused within a given time period. [6.5, 7.0, 2000] Updated 8-28-2001

88. In order for connection pooling to work correctly, be sure each connection you open uses the same
    ConnectionString parameters. Connection pooling only works if all of the parameters for the
    ConnectionString are identical. If they are not all identical, then a new connection will be opened,
    circumventing connection pooling. [6.5, 7.0, 2000] Added 2-5-2001

89. If appropriate for your application, locate the application's data access components on the SQL
    Server where the data is, instead of at the client. This can significantly reduce network traffic and
    overhead and boost data throughput. [6.5, 7.0, 2000]

90. When creating a Command object to execute a stored procedure against SQL Server, you can use
    either the adCmdText or the adCmdStoredProc property to tell ADO that you want to execute a stored
    procedure. Always use the adCmdStoredProc property, which uses RPCs between the client and SQL
    Server. This bypasses parameter translation and boosts performance by 20 to 30 percent over using
    the adCmdText property. [6.5, 7.0, 2000]

91. If you need to execute a stored procedure from a Command object, and if the stored procedure
    will not return any rows, you can boost performance of the Command object by setting the
    adExecuteNoRecords option. This tells the Command object to not ask for a returning rowset, which
    saves a little overhead and reduces memory usage. [6.5, 7.0, 2000]

92. If you need your VB application to generate a unique value for use in a primary key column in a
    SQL Server table, performance will be slightly better if you let SQL Server, instead of your VB
    application, create the unique value. SQL Server can generate unique keys using either an Identity
    (using the Integer data type) column or by using the NEWID function in a UniqueIdentifier column. Of
    these two, Identity columns offer better performance. [6.5, 7.0, 2000]

93. When creating COM components to access SQL Server, try to design the component to have as
    few properties as possible. For example, instead of having a property for every column of data you
    want to send back and forth between the database and your application, create one generic property
    that can be used to send all of the columns at one time. What this does is reduce the number of calls
    must be made by the component, reducing overhead on the component and SQL Server. [6.5, 7.0,
    2000]

94. When setting Connection Object properties, use the following dot notation instead of using the fully
    qualified object property notation, whenever appropriate, as it is faster.

    Use this format for optimum speed:
    With cn
        .ConnectionTimeout = 100
        .ConnectionString = "xyz"
        .CursorLocation = adUseClient
    End With
    Not this format:
    cn.ConnectionTimeout = 100
    cn.ConnectionString = "xyz"
    cn.CursorLocation = adUseClient
    If you are one of those people who need to prove this claim for yourself, try running the following code:

    Public Declare Function GetTickCount Lib "kernel32" () As Long
    Sub T1()
    Dim i As Long
    Dim cn As ADODB.Connection
    Dim Fast As Long
    Dim Slow As Long

    Set cn = New ADODB.Connection

    With cn
    .ConnectionTimeout = 100
    .ConnectionString = "xyz"
    .CursorLocation = adUseClient
    End With
    cn.ConnectionTimeout = 100
    cn.ConnectionString = "xyz"
    cn.CursorLocation = adUseClient


    For i = 1 To 100000
    Fast = Fast - GetTickCount
    With cn
    .ConnectionTimeout = 100
    .ConnectionString = "xyz"
    .CursorLocation = adUseClient
    End With
    Fast = Fast + GetTickCount

    Slow = Slow - GetTickCount
    cn.ConnectionTimeout = 100
    cn.ConnectionString = "xyz"
    cn.CursorLocation = adUseClient
    Slow = Slow + GetTickCount
    Next
    MsgBox "Fast=" & Fast & vbCrLf & "Slow=" & Slow
    End Sub

    Thanks to Gareth Edwards for the above example code.

    [6.5, 7.0, 2000] Updated 4-5-2001

95. Don't use VB objects to act as a data holder (to store data) in your SQL Server-based applications.
    Instead, use an array or a collection of user-defined types (UDTs). While using objects you create to
    store data can be convenient, it also creates a lot of unnecessary overhead. Every time you have to
    instantiate and then destroy an object, it hurts performance and scalability. How do you know if an
    object you have created is storing data that should be stored elsewhere? If the class has mostly
    properties and few, if any, methods, then this is a good clue. [6.5, 7.0, 2000] Added 10-26-2000

96. Don't instantiate, initialize, use, and then destroy an object within a loop. If the loop repeats itself
    much at all, you create an inordinate amount of overhead for your application. Instead, reuse the same
    object in the loop. One of the best ways to do this is to include a reinitialize method for the object that
    can be called from within the loop. [6.5, 7.0, 2000] Added 10-26-2000

97. For maximum performance, consolidate most, if not all, of your COM components in a single DLL.
    COM components instantiated from a single DLL instantiate faster than if they are called from multiple
    DLLs. [6.5, 7.0, 2000] Added 10-26-2000

98. If you decide not to use a stored procedure to access SQL Server, but instead choose to use an
    embedded SQL statement in your VB code, and if that embedded SQL statement will be repeated, such
    as in a loop, consider setting the ADO Command object's "Prepared" property to "True".
    What this property does is to tell SQL Server to compile and save a copy of your SQL statement in SQL
    Server's cache. The first time the SQL statement is executed, the statement has to be compiled and
    stored in memory. But in subsequent calls, the statement is called from the cache, boosting
    performance because it does not have to be recompiled each time it is called.
    If the SQL statement will only be executed once, then don't set this option, as it will actually decrease
    performance for SQL statements that are run only once. The performance boost only comes if the SQL
    statement is run multiple times. [7.0, 2000] Added 12-27-2000

99. If you use collections in your VB 6 or ASP code, consider instead using dictionaries in order to
    boost the performance of your application. While you are probably familiar with what and how
    collections are used, you may not be familiar with dictionaries. The dictionary class (the
    Scripting.Dictionary Object) is part of the Microsoft Scripting Runtime library, which is delivered with the
    VB 6 and ASP development environments.
     Like collections, dictionaries can hold any type of data, and items can be retrieved by using a key, or
     iterated using the "For Each" syntax. But dictionaries are different in that they offer properties and
     methods not available for collections. And the biggest difference is that they are much faster, about
     twice as fast. If you haven't learned about dictionaries yet, you need to take the time now to learn
     about their numerous advantages. [6.5, 7.0, 2000] Added 12-27-2000

100. To speed up string manipulation in VB (never a fast performing task), consider the following
     suggestions that can speed string manipulation performance:
        •   When feasible, use fixed-length strings instead of variable-length strings.
        •   Try to minimize string concatenation.
        •   Try to minimize string comparisons.
        •   Try to avoid calculating the length of a string more than once. If you need this data more than
            once, and the string length does not change, then calculate the length once and store this value
            in a variable for reuse.
        •   When passing a string to an in-process function, try to pass it by reference instead of by value.
        •   Most VB string functions have two forms, one that produces a variant and one that produces a
            string (generally has a "$" after the function name). The version that produces the string is
            faster, and should generally be used.
        •   Consider using byte arrays instead of strings.
     [6.5, 7.0, 2000] Added 12-28-2000

101. If you have any legacy VB applications that still use VB-SQL to access SQL Server, you may
     want to consider rewriting the application. VB-SQL not only provides slow access, it is no longer
     supported by Microsoft. [6.5] Added 1-2-2001

102. If you are the sort of VB developer who likes to design their applications around objects, you want to
     keep in mind that over-encapsulating data access within objects can hurt performance. For
     example, from an OO design approach, you might consider encapsulating data access to each
     individual table in a SQL Server database, creating a separate class for each table. While this may
     appeal to your OO design goals, it is inefficient from a performance perspective.
     Too much encapsulation can lead to situations where you don't take advantage of SQL Server's built-in
     optimization abilities, it causes too many round-trips to the database, and it can use more database
     connections than absolutely required. Instead of over-encapsulating your data access in classes, a
     more efficient approach is to use stored procedures to encapsulate your business logic. Stored
     procedures eliminate these three drawbacks. [6.5, 7.0, 2000] Added 7-19-2001

103. If you send Transact-SQL code as part of your VB ADO code directly to SQL Server, without
     using a stored procedure (which we don't recommend; if you want best performance, use a stored
     procedure instead), you want ADO to execute the Transact-SQL code using the sp_executesql system
     stored procedure, not for ADO to create a temporary stored procedure in one step, execute the
     temporary stored procedure in a second step, and then drop the temporary stored procedure in a third
     step. As you can imagine, this is a total of three round-trips across the network, and it can greatly
     increase overhead and hurt performance.
     How do you know if your ADO code is behaving well? Use the SQL Server Profiler to trace the activity
     between your VB application and SQL Server. If you see that SQL Server is creating temporary stored
     procedures, and not using sp_executesql, then you need to review your ADO code, looking for ways to
     optimize it. [7.0, 2000] Added 8-31-2001
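     For reference, this is the well-behaved pattern you want to see in the Profiler trace: a single
     sp_executesql call carrying a parameterized statement (the statement itself is just an illustration):

     EXEC sp_executesql
         N'SELECT customer_name FROM customer WHERE customer_number = @cn',
         N'@cn int',
         @cn = 1000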

104. Limit the number of rows you return from a database to populate a pick-list or drop-down box.
     Lots of rows not only slow down your application, they also make it less convenient for your user to
     select the item or items they need. Have you ever had to select from over 100 choices? It is not easy.
     If you need to give your user a lot of choices, instead of displaying them in one large pick-list or drop-
     down list, provide a way for the user to filter out any options that are not applicable to them. For the
     best performance, perform the filtering at the client, not at SQL Server.

     Ideally, you should use a stored procedure to retrieve the minimum number of rows you need; then, if
     there are still a lot of rows to deal with (from the user's perspective), provide a mechanism for the user
     to filter the list using the various ADO methods available for local filtering. This reduces the number of
     round trips from the client to SQL Server, helping to boost performance. [6.5, 7.0, 2000]
     Added 12-11-2001

Miscellaneous SQL Server Performance Tuning Tips

1.   If you need to delete all the rows in a table, don't use DELETE to delete them all, as the DELETE
     statement is a logged operation and can take time. To perform the same task much faster, use the
     TRUNCATE TABLE command instead, which is not a logged operation. Besides deleting all of the
     records in a table, this command will also reset the seed of any IDENTITY column back to its original
     value. [6.5, 7.0, 2000] Updated 7-3-2001
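     A minimal sketch (staging_import is a hypothetical table you are allowed to empty):

     TRUNCATE TABLE staging_import
     -- Much faster than: DELETE FROM staging_import
     -- and it also reseeds any IDENTITY column on the table.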

2.   Don't run a screensaver on your production SQL Server, it can unnecessarily use CPU cycles that
     should be going to your application. The only exception to this is the "blank screen" screensaver, which
     is OK to use. [6.5, 7.0, 2000]

3.   Use sp_who and sp_who2 (sp_who2 is not documented in the SQL Server Books Online, but offers
     more details than sp_who) to provide locking and performance-related information about current
     connections to SQL Server. [6.5, 7.0, 2000]

4.   By default, you cannot use a UNC (Universal Naming Convention) name to specify the location
     where a SQL Server database or log file is stored. Instead, you must specify a drive letter that refers to a
     local physical drive or array. But what if you want to store your database or log file on another NT
     Server or a Network Appliance, Inc. storage system? You can, but you will have to set Trace Flag 1807
     on your SQL Server to allow the use of UNC names. [7.0]
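     A minimal sketch of enabling the trace flag for the current session before creating the database (you
     can also add -T1807 to SQL Server's startup parameters):

     DBCC TRACEON (1807)
     -- CREATE DATABASE / ALTER DATABASE statements issued from this
     -- session may now reference UNC paths in their FILENAME clauses.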

5.   For a quick and dirty way to check to see if your SQL Server has maxed out its memory (and is
     causing your server to page), try this. Bring up the Task Manager and go to the "Performance" tab.
     Here, check out two numbers: the "Total" under "Commit Charge (k)" and the "Total" under "Physical
     Memory (k)". If the "Total" under "Commit Charge (k)" is greater than the "Total" under "Physical
     Memory (k)", then your server does not have enough physical memory to run efficiently as it is currently
     configured and is most likely causing your server to page unnecessarily. Excess paging will slow down
     your server's performance.
     If you notice this problem, you will probably want to use the Performance Monitor to further
     investigate the cause of this problem. You will also want to check to see how much physical memory
     has been allocated to SQL Server. Most likely, this setting has been set incorrectly, and SQL Server
     has been set to use too much physical memory. Ideally, SQL Server should be set to allocate
     physical RAM dynamically. [6.5, 7.0, 2000]

6.   Internet Information Server (IIS) has the ability to send its log files directly to SQL Server for
     storage. Busy IIS servers can actually get bogged down trying to write log information directly to SQL
     Server, and because of this, it is generally not recommended to write web logging information to SQL
     Server. Instead, logs should be written to text files, and later imported into SQL Server using BCP or
     DTS. [6.5, 7.0, 2000]

7.   SQL Server 7 has a database compatibility mode that allows applications written for previous
     versions of SQL Server to run under SQL Server 7. If you want maximum performance for your
     database, you don't want to run your database in compatibility mode. Instead, it should be running in
     native SQL Server 7 mode. Of course, this may require you to modify your application to make it SQL
     Server 7 compliant, but in most cases, the additional work required to update your application will be
     more than paid for with improved performance. [7.0, 2000]
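     You can check and change the compatibility level with the sp_dbcmptlevel system stored procedure.
     For example (the database name is hypothetical):

     EXEC sp_dbcmptlevel 'Sales'         -- report the current compatibility level
     EXEC sp_dbcmptlevel 'Sales', 70     -- switch to native SQL Server 7.0 behavior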

8.   When experimenting with the tuning of your SQL Server, you may want to run the DBCC
     DROPCLEANBUFFERS command to remove all the test data from SQL Server's data cache (buffer)
     between tests to ensure fair testing. If you want to clear out the stored procedure cache, use this
     command, DBCC FREEPROCCACHE. Both of these commands are for testing purposes and should
     not be run on a production SQL Server. [7.0, 2000]
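     For example, a typical between-tests sequence (for test servers only):

     CHECKPOINT                -- flush dirty pages to disk first
     DBCC DROPCLEANBUFFERS     -- empty the data cache
     DBCC FREEPROCCACHE        -- empty the procedure (plan) cache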

9.   Orphaned SQL Server sessions can negatively affect SQL Server's performance. An orphaned SQL
     Server session can occur when a client improperly disconnects from SQL Server, such as when the
     client loses power. When this happens, the client cannot tell SQL Server to properly close the
     connection, so the SQL Server connection remains open, even though it is not being used. This can
     affect SQL Server's performance in two ways. First, orphaned connections use up SQL Server
     connections, which takes up server resources. Second, the orphaned connections may be holding
     locks that block other users, or may hold open temp tables or cursors that also take up unnecessary
     server resources. NT Server periodically checks for inactive SQL Server sessions, and if it finds any, it
     will notify SQL Server so that the connection can be removed. Unfortunately, NT Server only performs
     this check every 1-2 hours, depending on the protocol used. If orphaned SQL Server sessions become
     a problem, NT Server's registry can be modified so that it checks more often for orphaned connections.
     Identifying an orphaned connection from SQL Server is very difficult, but if you can identify it, it can be
     removed by KILLing it using Enterprise Manager or Query Analyzer. [6.5, 7.0, 2000] Added 10-2-2000
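     If you do manage to identify one, removing it is simple (the SPID shown is illustrative):

     EXEC sp_who2    -- look for an idle SPID belonging to the vanished client
     KILL 57         -- remove that connection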

10. If your application opens a DB-Library connection to SQL Server 7.0 and it uses either the
    TCP/IP or IPX/SPX network libraries, and you are using SQL Server 7.0 with no service pack or
    Service Pack 1, then SQL Server will experience severe memory leaks each time a new connection is
    opened, which can significantly affect its performance. To resolve this problem, the best solution is to
    install Service Pack 2, which corrects the problem. If you cannot install the service pack, then another
    option is to use another network library, such as Named Pipes or Multiprotocol. [7.0] More from
    Microsoft Added 11-15-2000

11. For best performance, don't mix production databases and development (test or staging)
    databases on the same physical server. This not only serves to better separate the two functions
    (production and development), but prevents developers from using up server resources that could be
    better used by production users. [6.5, 7.0, 2000] Added 11-20-2000

12. When we think of performance, we usually think about speeding up our application's
    performance. But another way to look at performance is to look at our own performance as DBAs or
    Transact-SQL developers.
    For example, one of the easiest ways to speed up our Transact-SQL coding, as well as the
    maintenance and troubleshooting of our code once it is written, is to format it in an easy-to-read
    manner.
    While there are many different code formatting guidelines available, here are some basic ones you
    should consider following, if you aren't doing so already:
       Begin each line of your Transact-SQL code with a SQL verb, and capitalize all Transact-SQL
        statements and clauses, such as:
    SELECT customer_number, customer_name
    FROM customer
    WHERE customer_number > 1000
    ORDER BY customer_number
        If a line of Transact-SQL code is too long to fit onto one line, indent the following line(s), such as:
    SELECT customer_number, customer_name, customer_address,
    customer_state, customer_zip, customer_phonenumber
 Separate logical groupings of Transact-SQL code by using appropriate comments and
     documentation explaining what each grouping does.
    These are just a few of the many possible guidelines you can follow when writing your Transact-SQL
    code to make it more readable by you and others. You just need to decide on some standard, and then
    always follow it in your coding. If you do this, you will definitely boost your coding performance. [6.5, 7.0,
    2000] Added 12-5-2000

13. Be wary of allowing users to directly access your databases (especially OLTP databases) with
    third-party database access tools, such as Microsoft Excel or Access. Many of these tools can wreak
    havoc with your database's performance. Here are some reasons why:
         Often these users aren't experienced with these tools, and create overly complex queries that eat
          up server resources. At the other extreme, their queries may not be complex enough (such as
          lacking effective WHERE clauses) and return thousands, if not millions, of unnecessary rows of
          data.
         This reporting activity can often lock rows, pages, or tables, creating user contention for data and
          reducing the database's performance.
         These tools are often file-based. This means that even if an effective query is written, the entire
          table (or multiple tables in the case of joins) has to be returned to the client software, where the
          query is actually performed, not at the server. This can not only lead to excess server activity, it
          can also play havoc with your network.
    If you have no choice but to allow users access to your data, try to keep them from hitting your
    production OLTP databases. Instead, point them to a "reporting" server that has been replicated, or is
    in the form of a datamart or data warehouse. [6.5, 7.0, 2000] Added 4-18-2001




14. SQL Server 2000 offers support for SSL encryption between clients and the server. While
    selecting this option prevents the data from being viewed as it travels over the network, it also adds
    overhead and reduces performance. Only use SSL encryption if absolutely required. If you need to use
    SSL encryption, consider purchasing an SSL encryption processor for the server to speed performance.
    [2000] Added 9-21-2000

15. SQL Server 2000 supports named instances of SQL Server. For example, this feature allows you to
    run SQL Server 6.5 and SQL Server 2000 on the same server; or SQL Server 7.0 and SQL Server
    2000 on the same server; or SQL Server 6.5, SQL Server 7.0, and SQL Server 2000 on the same
    server; and up to 16 concurrent instances of SQL Server 2000 on the same server.
    As you might imagine, each running instance of SQL Server takes up server resources. Although some
    resources are shared by multiple running instances, such as MSDTC and the Microsoft Search
    services, most are not. Because of this, each additional instance of SQL Server running on the same
    server has to fight for available resources, hurting performance. For best performance, run only a
    single instance (usually the default) on a single physical server. The main reasons for using named
    instances are upgrading older versions of SQL Server to SQL Server 2000, transition periods where
    you need to test your applications on multiple versions of SQL Server, and use on developers'
    workstations. [2000] Added 11-14-2000

16. If you run the ALTER TABLE DROP COLUMN statement to drop a variable length or text column,
    SQL Server will not automatically reclaim the space freed by this action. To reclaim this space, which
    will help to reduce unnecessary I/O due to the wasted space, you can run the following command,
    which is new to SQL Server 2000:
    DBCC CLEANTABLE (database_name, table_name)
    Before running this command, you will want to read about it in Books Online to learn about some of its
    options that may be important to you. [2000] Added 2-28-2001
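    For example, a minimal usage sketch (the database and table names are hypothetical):

    DBCC CLEANTABLE ('Sales', 'customers')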

17. Trace flags, which are used to enable and disable some special database functions temporarily,
    can often chew up CPU utilization and other resources on your SQL Server unnecessarily. If you
    just use them for a short time to help diagnose a problem, for example, and then turn them off as soon
    as you are done using them, then the performance hit you experience is small and temporary.
    What happens sometimes is that you, or another DBA, turns on a trace flag but forgets to turn it off.
    This, of course, can negatively affect your SQL Server's performance. If you want to check whether any
    trace flags are turned on for your connection, run this command in Query Analyzer:
    DBCC TRACESTATUS(-1)
    If any trace flags are on, you will see them listed on the screen after running this command. DBCC
    TRACESTATUS only finds trace flags set at the client (connection) level; if a trace flag has been turned
    on for the entire server, it will not show up.
    If you find any, you can turn them off using this command:
    DBCC TRACEOFF(<trace flag number>)
    [7.0, 2000] Added 6-6-2001

18. SQL Server offers a feature called the black box. When enabled, the black box creates a trace file of
    the last 128K worth of queries and exception errors. This can be a great tool for troubleshooting some
    SQL Server problems, such as crashes.
    Unfortunately, this feature uses up SQL Server resources to maintain the trace file, which can
    negatively affect performance. Generally, you will want to turn the black box on only when
    troubleshooting, and turn it off during normal production. This way, your SQL Server will be minimally
    affected. [7.0, 2000] Added 6-6-2001
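    On SQL Server 2000, one way to turn the black box on and off is through the trace stored procedures,
    using the TRACE_PRODUCE_BLACKBOX option. A sketch (consult Books Online for the details):

    DECLARE @TraceID int
    EXEC sp_trace_create @TraceID OUTPUT, 8   -- 8 = TRACE_PRODUCE_BLACKBOX
    EXEC sp_trace_setstatus @TraceID, 1       -- start the black box trace
    -- ... reproduce and troubleshoot the problem ...
    EXEC sp_trace_setstatus @TraceID, 0       -- stop the trace
    EXEC sp_trace_setstatus @TraceID, 2       -- close and delete its definition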

19. If you have ever performed a SELECT COUNT(*) on a very large table, you know how long it can
    take. For example, when I ran the following command on a large table I manage:
    SELECT COUNT(*) FROM <table_name>
    it took 1:09 to count 10,725,948 rows in the table. At the same time, SQL Server had to perform a lot of
    logical and physical I/O in order to perform the count, chewing up important SQL Server resources.
    A much faster, and more efficient, way of counting rows in a table is to run the following query:
    SELECT rows
    FROM sysindexes
    WHERE id = OBJECT_ID('<table_name>') AND indid < 2
    When I ran this query against the same table, it took less than a second to run, and it gave me the
    same results. Not a bad improvement, and it took virtually no I/O. This is because the row count of your
    tables is stored in the sysindexes system table of your database. So instead of counting rows when you
    need to, just look up the row count in the sysindexes table.
    There is one potential downside to using the sysindexes table: this system table is not updated in
    real time, so it might underestimate the number of rows you actually have. Assuming you have the
    "Auto Create Statistics" and "Auto Update Statistics" database options turned on, the value you get
    should be very close to correct, if not exactly correct. If you can live with a very close estimate, then this
    is the best way to count rows in your tables. [7.0, 2000] Added 7-3-2001

20. Looking for some new tools to help performance tune your operating system? Then check out the
    performance tools at Sysinternals. For example, they have tools to defrag your server's swap file,
    among many others. And best of all, most are free. [6.5, 7.0, 2000] Added 7-3-2001

21. Do you use Enterprise Manager to access remote servers, possibly over a slow WAN link? If you
    do, have you ever had any problems getting Enterprise Manager to connect to the remote server?
    If so, the problem may lie in the fact that if Enterprise Manager cannot make a connection within 4
    seconds, the connection attempt fails. To overcome slow network connections, you can change
    the default Enterprise Manager timeout value from 4 seconds to any amount of time you like.
    To change the default timeout value, select Tools|Options from Enterprise Manager, and then select the
    "Connection" tab if you have SQL Server 7.0, or the "Advanced" tab if you have SQL Server 2000.
    Here, change the "Login time-out (seconds)" option to a higher number. [7.0, 2000] Added 7-3-2001

22. SQLDIAG.exe is a command line tool that collects information about SQL Server and writes it
    to a text file. It can be useful for documenting or troubleshooting your SQL Server. When you run this
    utility while SQL Server is running, the following information is collected and stored in a text file
    called sqldiag.txt, which is stored in the \mssql\log folder.
        The text of all error logs
        SQL Server registry information
        SQL Server dll version information
        The output from these system stored procedures:
                  sp_configure
                  sp_who
                  sp_lock
                  sp_helpdb
                  xp_msver
                  sp_helpextendedproc
        sysprocesses
        Input buffer SPIDs/deadlock information
        Microsoft diagnostics report for the server
        The last 100 queries and exceptions (if the query history trace was running)
[7.0, 2000] Added 10-12-2001
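    For example, from a command prompt on the server (the installation path may vary):

    cd \mssql\binn
    sqldiag

    The report then appears in \mssql\log\sqldiag.txt, as described above.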

23. If you are running any service pack of SQL Server 7.0 on a multiprocessor server, the number of
    configured processors displayed by Enterprise Manager may be incorrect. For example, while you
    may have two CPUs in your server, Enterprise Manager may only report one CPU. This is a known bug
    that has yet to be fixed in Service Packs 1 through 3 for SQL Server 7.0.
    The work-around to this problem is not to use the Enterprise Manager to specify the number of CPUs
    for SQL Server to use, but to use the sp_configure "max degree of parallelism" option instead, such as
    in this example:
    USE master
    GO

    sp_configure 'show advanced options', 1
    GO

    RECONFIGURE WITH OVERRIDE
    GO

    sp_configure 'max degree of parallelism', 0
    GO

    RECONFIGURE WITH OVERRIDE
    GO
    The "max degree of parallelism" option is an advanced option, so the first portion of the code above is
    used to turn on the "show advanced option." Once that is done, then you can set the "max degree of



                                                                                                             37
     parallelism" option. By setting this option to "0", you are telling SQL Server to use all available CPUs in
     the       server.       See        this       Microsoft        article    for       more        information:
     http://support.microsoft.com/support/kb/articles/Q273/8/80.ASP [7.0] Added 10-23-2001

24. Memory leaks can steal valuable memory from your SQL Server, reducing performance, and
    perhaps even forcing you to reboot your server. A memory leak occurs when a poorly-written or buggy
    program requests memory from the operating system, but does not release the memory when it is done
    with it. Because of this, the application can use up more and more memory in a server, greatly slowing it
    down, and even perhaps crashing the server.
          Some memory leaks come from the operating system itself, device drivers, MDAC components,
    and even SQL Server. And of course, virtually any application can cause a memory leak, which is
    another good reason to dedicate a single server to SQL Server instead of sharing it among multiple
    applications.
          Memory leaks are often hard to identify, especially if they leak memory slowly. Generally, memory
    leaks become apparent when you notice that your server is running out of available memory and paging
    becomes a big problem. A symptom of this is a SQL Server that runs quickly after being rebooted, but
    begins to run more and more slowly as time passes, and when the system is rebooted again, it speeds
    up again.
          One way to help get rid of many memory leaks is to ensure that you always have the latest service
    packs or updates for your server's software. But a memory leak you find may not have an immediate fix.
    If this is the case, you may be forced to reboot your server periodically in order to free up memory.
    Identifying what is causing a memory leak is often difficult. One method involves using Performance
    Monitor to monitor all of the counters in the Memory object over time, to see what is happening
    internally in your computer. Another method is to use Task Manager to view how much memory is used
    by each process. A process that seems to be using an unusual amount of memory may be the culprit.
    [6.5, 7.0, 2000] Added 12-11-2001




Tips on Optimizing SQL Server Indexes
1.   Indexes should be considered on all columns that are frequently accessed by the WHERE,
     ORDER BY, GROUP BY, TOP, and DISTINCT clauses. Without an index, each of these operations
     will require a table scan of your table, potentially hurting performance.

     Keep in mind the word "considered". An index created to support the speed of a particular query may
     not be the best index for another query on the same table. Sometimes you have to balance indexes to
     attain acceptable performance on all the various queries that are run against a table. [6.5, 7.0, 2000]
     Updated 12-7-2001
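     For example, a sketch using hypothetical table and column names:

     -- Support frequent filtering on customer_id:
     CREATE NONCLUSTERED INDEX IX_orders_customer_id
     ON orders (customer_id)

     -- This query can now seek on the index instead of scanning the table:
     SELECT order_id, order_date
     FROM orders
     WHERE customer_id = 1000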

2.   To help identify which tables in your database may need additional or improved indexes, use the
     SQL Server Profiler Create Trace Wizard to run the "Identify Scans of Large Tables" trace. This trace
     will tell you which tables are being scanned by queries instead of using an index to seek the data. It
     should provide you with data you can use to identify which tables may need additional or better
     indexes. [7.0, 2000]

3.   Don't over-index your OLTP tables, as every index you add increases the time it takes to perform
     INSERTs, UPDATEs, and DELETEs. A fine line must be drawn between having the ideal number of
     indexes (for SELECTs) and the ideal number for data modifications. [6.5, 7.0, 2000]

4.   Don't automatically add indexes on a table because it seems like the right thing to do. Only add
     indexes if you know that they will be used by the queries run against the table. [6.5, 7.0, 2000]

5.   Don't accidentally add the same index twice on a table. This happens more easily than you might
     think. For example, you add a unique or primary key to a column, which of course creates an index to
     enforce what you want to happen. But without thinking about it when evaluating the need for indexes on
     a table, you decide to add a new index, and this new index happens to be on the same column as the
     unique or primary key. As long as you give the indexes different names, SQL Server will allow you to
     create the same index over and over. [7.0, 2000] Added 4-2-2001




6.   If you have a table that is subject to many INSERTS, be sure that you have added a clustered index
     to it (not based on an incrementing key), whether or not it really needs one. A table that does not have a
     clustered index is called a heap. Every time data is INSERTed into a heap, the row is added to the end
     of the table. If there are many INSERTS, this spot could become a "hot spot" which could significantly
     affect performance. By adding a clustered index to the table (not based on an incrementing key), any
     potential hotspots are avoided. [6.5, 7.0, 2000] Added 2-22-2001

7.   Drop indexes that are never used by the Query Optimizer. Unused indexes slow data modifications
     and waste space in your database, increasing the amount of time it takes to back up and restore
     databases. Use the Index Tuning Wizard (7.0 and 2000) to help identify indexes that are not being
     used. [6.5, 7.0, 2000]
8.   Generally, you probably won't want to add an index to a table under these conditions:
        If the index is not used by the query optimizer. Use Query Analyzer's "Show Execution Plan" option
         to see if your queries against a particular table use an index or not. If the table is small, most likely
         indexes will not be used.
        If the column values exhibit low selectivity, often less than 90%-95% for non-clustered indexes.
        If the column(s) to be indexed are very wide.
        If the column(s) are defined as TEXT, NTEXT, or IMAGE data types.
        If the table is rarely queried.

9.   While high index selectivity is generally an important factor that the Query Optimizer uses to determine
     whether or not to use an index, there is one special case where indexes with low selectivity can be
     useful in speeding up SQL Server. This is the case for indexes on foreign keys. Whether an index on a
     foreign key has either high or low selectivity, an index on a foreign key can be used by the
     Query Optimizer to perform a merge join on the tables in question. A merge join occurs when a
     row from each table is taken and then they are compared to see if they match the specified join criteria.
     If the tables being joined have the appropriate indexes (no matter the selectivity), a merge join can be
     performed, which is generally much faster than a join to a table with a foreign key that does not have an
     index. [7.0, 2000] Added 4-9-2001

10. On data warehousing databases, which are essentially read-only, having as many indexes as
    necessary for covering virtually any query is not normally a problem. [6.5, 7.0, 2000]

11. To provide the up-to-date statistics the query optimizer needs to make smart query optimization
    decisions, you will generally want to leave the "Auto Update Statistics" database option on. This
    helps to ensure that the optimizer statistics are valid, helping to ensure that queries are properly
    optimized when they are run.
         But this option is not a panacea. When a SQL Server database is under very heavy load,
    sometimes the auto update statistics feature can update the statistics at inappropriate times, such as
    the busiest time of the day.
         If you find that the auto update statistics feature is running at inappropriate times, you may want to
    turn it off, and then manually update the statistics (using UPDATE STATISTICS or sp_updatestats)
    when the database is under a less heavy load.
         But again, consider what will happen if you do turn off the auto update statistics feature? While
    turning this feature off may reduce some stress on your server by not running at inappropriate times of
    the day, it could also cause some of your queries not to be properly optimized, which could also put
    extra stress on your server during busy times.
    Like many optimization issues, you will probably need to experiment to see if turning this option on
    or off is more effective for your environment. But as a rule of thumb, if your server is not maxed out,
    then leaving this option on is probably the best decision. [7.0, 2000] More info from Microsoft
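    If you do decide to turn the feature off and update statistics manually during quiet periods, a sketch
    (the database and table names are hypothetical):

    EXEC sp_dboption 'Sales', 'auto update statistics', 'false'
    GO
    USE Sales
    GO
    EXEC sp_updatestats          -- refresh statistics for every table in the database
    -- or target a single table:
    UPDATE STATISTICS customers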

12. Keep the "width" of your indexes as narrow as possible, especially when creating composite (multi-
    column) indexes. This reduces the size of the index and reduces the number of reads required to read
    the index, boosting performance. [6.5, 7.0, 2000]

13. If possible, try to create indexes on columns that have integer values instead of characters. Integer
    values have less overhead than character values. [6.5, 7.0, 2000]

14. An index is generally only useful to a query if the WHERE clause of the query matches the
    column(s) that are leftmost in the index. So if you create a composite index, such as "City, State",
    then a query such as "WHERE City = 'Springfield'" will use the index, but the query "WHERE STATE =
    'MO'" will not use the index. [6.5, 7.0, 2000]




15. Even if the WHERE clause in a query does not specify the first column of an available index (which
    normally disqualifies the index from being used), if the index is a composite index and contains all of
    the columns referenced in the query, the query optimizer can still use the index, because the index is
    a covering index. [6.5, 7.0, 2000]
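    For example, building on the City/State index from the previous tip:

    CREATE NONCLUSTERED INDEX IX_customer_city_state
    ON customer (City, State)

    -- City (the leftmost column) is not referenced, but every column the query
    -- touches is in the index, so the index covers the query:
    SELECT City, State
    FROM customer
    WHERE State = 'MO'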

16. When you create an index with a composite key, the order of the columns in the key is important. Try
    to order the columns in the key so as to enhance selectivity, with the most selective columns at the
    leftmost of the key. If you don't do this, and put a non-selective column at the first part of the key, you
    risk having the Query Optimizer not use the index at all. Generally, a column should be at least 95%
    unique in order for it to be considered selective. [6.5, 7.0, 2000] Updated 3-6-2001 More info on
    selectivity

17. As you may know, an index is automatically created for column(s) in your table that you specify
    as a PRIMARY KEY or as UNIQUE. If two or more columns are involved, and a composite index is
    created, you should choose how the columns will be ordered in the composite index, instead of
    accepting the default choices offered by SQL Server. This is because you always want to use the most
    selective columns at the left of the key to ensure that the composite index is selective enough to be
    used by the Query Optimizer. If you don't do this, then the default order of the columns in the composite
    index may not be very selective, and the index may not be used by Query Optimizer. While the order of
    the columns in the index are important to the Query Optimizer, the order is not important to the
    PRIMARY KEY or UNIQUE constraints. [6.5, 7.0, 2000] Added 3-6-2001

18. If you have two or more tables that are frequently joined together, then the columns used for the
    joins should have an appropriate index. If the columns used for the joins are not naturally compact,
    then consider adding surrogate keys to the tables that are compact, in order to reduce the size of the
    keys. This decreases I/O during the join process, increasing overall performance. [6.5, 7.0, 2000]

19. When creating indexes, try to make them unique indexes if at all possible. SQL Server can often
    search through a unique index faster than a non-unique index because in a unique index, each row is
    unique, and once it is found, SQL Server doesn't have to look any further. [6.5, 7.0, 2000] Added 10-19-
    2000

20. If a particular query against a table is run infrequently, and the addition of an index greatly speeds
    the performance of the query, but the performance of INSERTS, UPDATES, and DELETES is
    negatively affected by the addition of the index, consider creating the index for the table for the duration
    of when the query is run, then dropping the index. An example of this is when monthly reports are run at
    the end of the month on an OLTP application. [6.5, 7.0, 2000] Added 9-11-2000
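    A sketch of the pattern (all names are hypothetical):

    -- Just before the month-end reports run:
    CREATE NONCLUSTERED INDEX IX_orders_order_date
    ON orders (order_date)

    -- ... run the reports ...

    -- Then give OLTP modifications their speed back:
    DROP INDEX orders.IX_orders_order_date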

21. If you like to get under the covers of SQL Server to learn more about indexing, take a look at the
    sysindexes system table that is found in every database. Here, you can find a wealth of information on
    the indexes and tables in your database. To view the data in this table, run this query from the database
    you are interested in:
    SELECT * FROM sysindexes
    Here are some of the more interesting fields found in this table:
        dpages: If the indid value is 0 or 1, then dpages is the count of the data pages used for the index. If
         the indid is 255, then dpages equals zero. In all other cases, dpages is the count of the non-
         clustered index pages used in the index.
        id: Refers to the id of the table this index belongs to.
 indid: This column indicates the type of index. For example, 1 is for a clustered index, a value
          greater than 1 is for a non-clustered index, and 255 indicates that the table has text or image
          data.
        OrigFillFactor: This is the original fillfactor used when the index was first created, but it is not
         maintained over time.
        statversion: Tracks the number of times that statistics have been updated.
        status: 2 = unique index, 16 = clustered index, 64 = index allows duplicate rows, 2048 = the index
         is used to enforce the Primary Key constraint, 4096 = the index is used to enforce the Unique
         constraint. These values are additive, and the value you see in this column may be a sum of two or
         more of these options.
        used: If the indid value is 0 or 1, then used is the number of total pages used for all index and table
         data. If indid is 255, used is the number of pages for text or image data. In all other cases, used is
         the number of pages in the index.




22. Don't use FLOAT or REAL data types for primary keys, as they add unnecessary overhead and can
    hurt performance. [6.5, 7.0, 2000] Added 10-4-2000

23. If your WHERE clause includes an AND operator, one way to optimize it is to ensure that at least
    one of the search criteria is highly selective and that the relevant column has an index. [6.5, 7.0,
    2000] Added 10-17-2000

24. The Query Optimizer will always perform a table scan or a clustered index scan on a table if the
    WHERE clause in the query contains an OR operator and if any of the referenced columns in the OR
    clause are not indexed (or do not have a useful index). Because of this, if you use many queries with
    OR clauses, you will want to ensure that each referenced column has an index. [7.0, 2000] Added 10-
    17-2000

25. The Query Optimizer converts the Transact-SQL IN clause to the OR operator when parsing your
    code. Because of this, keep in mind that if the referenced column in your query doesn't include an
    index, then the Query Optimizer will perform a table scan or clustered index scan on the table. [6.5, 7.0,
    2000] Added 10-17-2000

26. If you use the SOUNDEX function against a table column in a WHERE clause, the Query
    Optimizer will ignore any available indexes and perform a table scan. If your table is large, this can
    present a major performance problem. If you need to perform SOUNDEX type searches, one way
    around this problem is to precalculate the SOUNDEX code for the column you are searching and then
    place this value in a column of its own, and then place an index on this column in order to speed
    searches. [6.5, 7.0, 2000] Added 11-9-2000
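    A sketch of the precalculation approach, using the authors table from the pubs sample database (the
    new column and index names are made up, and keeping the column current, for example with a
    trigger, is left out):

    ALTER TABLE authors ADD au_lname_soundex char(4) NULL
    GO
    UPDATE authors SET au_lname_soundex = SOUNDEX(au_lname)
    GO
    CREATE INDEX IX_authors_soundex ON authors (au_lname_soundex)
    GO
    -- Search the small, indexed codes instead of applying SOUNDEX to every row:
    SELECT au_lname, au_fname
    FROM authors
    WHERE au_lname_soundex = SOUNDEX('Smith')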

27. If you need to create indexes on large tables, you may be able to speed up their creation by using
    the new SORT_IN_TEMPDB option available with the CREATE INDEX command. This option tells SQL
    Server to use the tempdb database, instead of the current database, to sort data while creating indexes.
    Assuming your tempdb database is isolated on its own separate disk or disk array, then the process of
    creating the index can be sped up. The only slight downside to using this option is that it takes up
    slightly more disk space than if you didn't use it, but this shouldn't be much of an issue in most cases. If
    your tempdb database is not on its own disk or disk array, then don't use this option, as it can actually
    slow performance. [2000] Added 10-19-2000
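    For example (the table and column names are hypothetical):

    CREATE NONCLUSTERED INDEX IX_orders_customer_id
    ON orders (customer_id)
    WITH SORT_IN_TEMPDB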

28. SQL Server 2000 Enterprise Edition (not the Standard Edition) offers the ability to create indexes in
    parallel, greatly speeding index creation. Assuming your server has multiple CPUs, SQL Server 2000
    uses near-linear scaling to boost index creation speed. For example, using two CPUs instead of one
    CPU almost halves the time it takes to create indexes. [2000] Added 12-19-2000

29. As you probably already know, indexes on narrow columns are preferable to indexes on wide columns.
    The narrower the index, the more entries SQL Server can fit on a data page, which in turn reduces the
    amount of I/O required to access the data. But sometimes the column you want to search on using an
    index is much wider than desirable.
          For example, let's say you have a music database that lists the titles of over 5,000,000 songs, and
    that you want to search by song title. Also assume that the column used to store the music titles is a
    VARCHAR(45). Forty-five characters is a very wide index, and creating an index on such a wide column
    is not wise from a performance perspective. So how do we deal with such a scenario?
          SQL Server 2000 offers a new function called CHECKSUM. The main purpose of this function
    is to create what are called hash indexes. A hash index is an index built on a column that stores the
    checksum of the data found in another column in the table. The CHECKSUM function takes data from
    another column and creates a checksum value. In other words, the CHECKSUM function is used to
    create a mostly unique value that represents other data in your table. In most cases, the CHECKSUM
    value will be much smaller than the actual value. For the most part, checksum values are unique, but
    this is not guaranteed. It is possible that two slightly different values may produce the same
    CHECKSUM value.
          Here's how this works using our music database example. Say we have a song with the title "My
    Best Friend is a Mule from Missouri". As you can see, this is a rather long value, and adding an index to
    the song title column would make for a very wide index. But in this same table, we can add a
    CHECKSUM column that takes the title of the song and creates a checksum based on it. In this case,
    the checksum would be 1866876339. The CHECKSUM function always works the same, so if you
    perform the CHECKSUM function on the same value many different times, you would always get the
    same result.



    So how does the CHECKSUM help us? The advantage of the CHECKSUM function is that instead of
    creating a wide index by using the song title column, we create an index on the CHECKSUM column
    instead. "That's fine and dandy, but I thought you wanted to search by the song's title? How can
    anybody ever hope to remember a checksum value in order to perform a search?"
    Here's how. Take a moment to review this code:

    SELECT title, artist, composer
    FROM songs
    WHERE title = 'My Best Friend is a Mule from Missouri'
    AND checksum_title = CHECKSUM('My Best Friend is a Mule from Missouri')

         In this example, it appears that we are asking the same question twice, and in a sense, we are. The
    reason we have to do this is because there may be checksum values that are identical, even though the
    names of the songs are different. Remember, unique checksum values are not guaranteed.
         Here's how the query works. When the Query Optimizer examines the WHERE clause, it
    determines that there is an index on the checksum_title column. And because the checksum_title
    column is highly selective (minimal duplicate values) the Query Optimizer decides to use the index. In
    addition, the Query Optimizer is able to perform the CHECKSUM function, converting the song's title
    into a checksum value and using it to locate the matching records in the index. Because an index is
    used, SQL Server can very quickly locate the rows that match the second part of the WHERE clause.
    Once the rows have been narrowed down by the index, then all that has to be done is to compare these
    matching rows to the first part of the WHERE clause, which will take very little time.
         This may seem like a lot of work to shorten the width of an index, but in many cases, this extra
    work will pay off in better performance in the long run. Because of the nature of this tip, I suggest you
    experiment using this method and the more conventional method of creating an index on the title
    column itself. Since there are so many variables to consider, it is tough to know which method is better
    in your particular situation unless you give them both a try. [2000] Added 3-6-2001
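    If you want to try this out, here is one way to set up the hash index described above, as a sketch (the
    songs table is the hypothetical one from the example; a computed column is used so the checksum
    maintains itself, and indexing a computed column requires the standard SET options described in
    Books Online):

    ALTER TABLE songs ADD checksum_title AS CHECKSUM(title)
    GO
    CREATE INDEX IX_songs_checksum_title ON songs (checksum_title)
    GO

    Alternatively, you can store the checksum in an ordinary column and populate it yourself; the sample
    query above works the same either way.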

30. There is a bug in SQL Server 7.0 and 2000, which has yet to be corrected, that can negatively affect
    the performance of some queries. Queries that have multiple OR clauses in them and that are based on
    a clustered composite index may ignore the index and perform a table scan instead. This bug only
    appears if the query is within a stored procedure, or if it is executed through an ODBC-based
    application, such as VB, ASP, or Microsoft Access.
    The best way to identify if you are experiencing this bug is to view the Query Plan for slow queries that
    fit the criteria above, and see if a table scan is being performed, or if the index is being used.
    There are five different possible workarounds for this bug, depending on your circumstances:
 Use an appropriate index hint to force the use of the composite index (as sketched after this list).
          This is probably the easiest method to get around this problem, and the one I recommend.
         Changing from a clustered composite index to a non-clustered composite index may help, but this
          is not guaranteed. You will have to test it for yourself.
         Rewrite the query using a UNION clause to combine the results returned from the OR clauses. Of
          course, using a UNION clause may itself degrade performance. You will have to test this option to
          see if it is faster or not.
         If the query is being executed from an ODBC application through the SQLPrepare function with the
          SQL Server ODBC driver version 3.6 or earlier, then you can disable the "Generate Stored
          Procedures for Prepared Statements" option to workaround the bug.
         If the query is being executed from an ODBC application through either the SQLPrepare or
          SQLExecDirect functions with a parameterized query using the SQL Server ODBC driver version
          3.7, you can use the odbccmpt utility to enable the SQL Server 6.5 ODBC compatibility option and
          also disable the "Generate Stored Procedures for Prepared Statements" option to workaround the
          bug.
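    As a sketch of the first workaround (the table, column, and index names are hypothetical; on SQL
    Server 7.0 the hint is written INDEX = index_name instead):

    SELECT order_id, status
    FROM orders WITH (INDEX(IX_orders_region_status))
    WHERE region = 'West' OR region = 'East'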




SQL Server Performance Tuning Tips for Stored Procedures

1.   Whenever a client application needs to send Transact-SQL to SQL Server, send it in the form of a
     stored procedure instead of a script or embedded Transact-SQL. This not only reduces network traffic
     (only the EXECUTE or CALL is issued over the network), but it can speed up the Transact-SQL
     because the code in the stored procedure residing on the server is already pre-compiled. In addition,
     after a stored procedure is run for the first time, it stays cached in SQL Server’s memory where it can
     potentially be reused, further reducing overhead on the SQL Server.
     Keep in mind that just because you use a stored procedure does not mean that it will run fast. The code
     you use within your stored procedure must be well designed for both speed and reuse. [6.5, 7.0, 2000]
     Updated 5-18-2001

2.   To help identify performance problems with stored procedures, use SQL Server Profiler's Create
     Trace Wizard to run the "Profile the Performance of a Stored Procedure" trace. It will provide you with
     the data you need to identify poorly performing stored procedures. [7.0, 2000]

3.   Include in your stored procedures the statement, "SET NOCOUNT ON". If you don't turn this command
     on, then every time a SQL statement is executed, SQL Server will send a response to the client
     indicating the number of rows affected by the statement. It is rare that this information will ever be
     needed by the client. Using this statement will help reduce the traffic between the server and the client.
     [6.5, 7.0, 2000]
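     For example (the procedure name is made up, but the table comes from the pubs sample database):

     CREATE PROCEDURE dbo.usp_GetCaliforniaAuthors
     AS
     SET NOCOUNT ON   -- suppress the "rows affected" message for each statement
     SELECT au_id, au_lname, au_fname
     FROM pubs.dbo.authors
     WHERE state = 'CA'
     GO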

4.   Keep Transact-SQL transactions as short as possible. This helps to reduce the number of locks,
     helping to speed up the overall performance of your SQL Server application. Two ways to help reduce
     the length of a transaction are to: 1) break up the entire job into smaller steps so each step can be
     committed as soon as possible; and 2) take advantage of SQL Server statement batches, which act to
     reduce the number of round-trips between the client and server. [6.5, 7.0, 2000]

5.   When a stored procedure is first executed (and it does not have the WITH RECOMPILE option), it is
     optimized and a query plan is compiled and cached in SQL Server's buffer. If the same stored
     procedure is called again from the same connection, it will use the cached query plan instead of
     creating a new one, saving time and boosting performance. This may or may not be what you want.
     If the query in the stored procedure is exactly the same each time, then this is a good thing. But if the
     query is dynamic (the WHERE clauses changes from one execution of the stored procedure to the
     next), then this is a bad thing, as the query will not be optimized when it is run, and the performance of
     the query can suffer greatly.
          If you know that your query will vary each time it is run from the stored procedure, you will want to
     add the WITH RECOMPILE option when you create the stored procedure. This will force the stored
     procedure to be re-compiled each time it is run, ensuring the query is optimized each time it is run. [6.5,
     7.0, 2000]
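     For example, a sketch of a procedure declared with the option (all names are hypothetical):

     CREATE PROCEDURE dbo.usp_SearchOrders
     @region varchar(20),
     @status int
     WITH RECOMPILE
     AS
     SET NOCOUNT ON
     SELECT order_id, order_date
     FROM orders
     WHERE region = @region AND status = @status
     GO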

6.   Design your application to allow your users to cancel running queries. Not doing so may force the
     user to reboot the client, which can cause unresolvable performance problems. [6.5, 7.0, 2000]

7.   Many stored procedures have the option to accept multiple parameters. This in and of itself is not a bad
     thing. But what can often cause problems is if the parameters are optional, and the number of
     parameters varies greatly each time the stored procedure runs. There are two ways to handle this
     problem: the slow-performance way and the fast-performance way.
     If you want to save development time, but don't care about your application's performance, you
     can write your stored procedure generically so that it doesn't care how many parameters it gets. The
     problem with this method is that you may end up unnecessarily joining tables that don't need to be
     joined based on the parameters submitted for any single execution of the stored procedure.
           Another, much better performing way, although it will take you more time to code, is to include
     IF...ELSE logic in your stored procedure, and create separate queries for each possible combination of
     parameters that may be submitted to the stored procedure. This way, you can be sure your query is as
     efficient as possible each time it runs. [6.5, 7.0, 2000] Added 12-29-2000

8.   Here's another way to handle the problem of not knowing what parameters your stored
     procedure might face. In fact, it will probably perform faster than the tip listed above.



Although the above tip is a good starting point, it's not complete. The problem is the query-plans, the
pre-compilation of stored procedures, that SQL Server does for you. As you know, one of the biggest
reasons to use stored procedures instead of ad-hoc queries is the performance gained by using them.
The problem that arises with the above tip is that SQL Server will only generate a query-plan for the
path taken through your stored procedure when you first call it, not all possible paths.

Let me illustrate this with an example. Consider the following procedure (pre-compilation doesn't really
have a huge effect on the queries used here, but these are just for illustration purposes):

CREATE PROCEDURE dbo.spTest (@query bit) AS
IF @query = 0
SELECT * FROM authors
ELSE
SELECT * FROM publishers
GO

Suppose I make my first call to this procedure with the @query parameter set to 0. The query-plan that
SQL Server will generate will be optimized for the first query ("SELECT * FROM authors"), because the
path followed on the first call will result in that query being executed.
Now, if I next call the stored procedure with @query set to 1, the query plan that SQL Server has in
memory will not be of any use in executing the second query, since the query-plan is optimized for the
authors table, not the publishers table. Result: SQL Server will have to compile a new query plan, the
one needed for the second query. If I next call the procedure with @query set to 0 again, the whole path
will have to be followed from the start again, since only one query-plan will be kept in memory for each
stored procedure. This will result in sub-optimal performance.

As it happens I have a solution, one that I've used a lot with success. It involves the creation of what I
like to call a 'delegator'. Consider again spTest. I propose to rewrite it like this:

CREATE PROCEDURE dbo.spTestDelegator (@query bit) AS
IF @query = 0
EXEC spTestFromAuthors
ELSE
EXEC spTestFromPublishers
GO

CREATE PROCEDURE dbo.spTestFromAuthors AS
SELECT * FROM authors
GO

CREATE PROCEDURE dbo.spTestFromPublishers AS
SELECT * FROM publishers
GO

The result of this restructuring will be that there will always be an optimized query-plan for
spTestFromAuthors and spTestFromPublishers, since they only hold one query. The only one getting
re-compiled over and over again is the delegator, but since this stored procedure doesn't actually hold
any queries, that won't have a noticeable effect on execution time. Of course re-compiling a plan for a
simple 'SELECT *' from a single table will not give you a noticeable delay either (in fact, the overhead of
an extra stored procedure call may be bigger than the re-compilation of "SELECT * FROM AnyTable"),
but as soon as the queries get bigger, this method certainly pays off.

The only downside to this method is that now you have to manage three stored procedures instead of
one. This is not that much of a problem though as the different stored procedures can be considered
one single 'system', so it would be logical to keep all of them together in the same script, which would
be just as easy to edit as a single stored procedure would be. As far as security is concerned, this
method shouldn't give you any extra headaches either: since the delegator is the only stored procedure
directly called by the client, it is the only one you need to manage permissions on. The rest will only be
called by the delegator, which will always work as long as those stored procedures are owned by the
same user as the delegator.




     I've had great success using this technique. Recently I developed a (partial full-text) search engine for
     our reports database, which resulted in a stored procedure that originally took about 20 seconds to run.
     After employing the above technique, the stored procedure only took about 2 seconds to run, a ten-fold
     improvement in performance! [6.5, 7.0, 2000] Contributed by Jeremy van Dijk. Added 8-15-2000

9.   While temporary stored procedures can provide a small performance boost in some circumstances,
     using a lot of temporary stored procedures in your application can actually create contention in
     the system tables and hurt performance. Instead of using temporary stored procedures, you may
     want to consider using the sp_executesql stored procedure instead. It provides the same benefits as
     temporary stored procedures, but it does not store data in the system tables, avoiding the contention
     problems. [7.0, 2000]

10. If you are creating a stored procedure to run in a database other than the Master database, don't
    use the prefix "sp_" in its name. This special prefix is reserved for system stored procedures.
    Although using this prefix will not prevent a user-defined stored procedure from working, it can slow
    down its execution ever so slightly.
    The reason for this is that by default, any stored procedure executed by SQL Server that begins with
    the prefix "sp_" is first looked for in the Master database. Since it is not there, time is wasted looking for
    the stored procedure.
          If SQL Server cannot find the stored procedure in the Master database, then it next tries to resolve
    the stored procedure name as if the owner of the object is "dbo". Assuming the stored procedure is in
    the current database, it will then execute. To avoid this unnecessary delay, don't name any of your
    stored procedures with the prefix "sp_". [6.5, 7.0, 2000] Tip contributed by Joey Allen.

11. Before you are done with your stored procedure code, review it for any unused code that you
    may have forgotten to remove while you were making changes, and remove it. Unused code just adds
    unnecessary bloat to your stored procedures. [6.5, 7.0, 2000] Added 8-15-2000

12. For best performance, all objects that are called within the same stored procedure should all be
    owned by the same owner, preferably dbo. If they are not, then SQL Server must perform name
    resolution on the objects if the object names are the same but the owners are different. When this
    happens, SQL Server cannot reuse the stored procedure's in-memory plan; instead, it must recompile
    the stored procedure, which hinders performance. [7.0, 2000] Added 10-12-2000

13. When you need to execute a string of Transact-SQL, you should use the sp_executesql stored
    procedure instead of the EXECUTE statement. Sp_executesql offers two major advantages over
    EXECUTE. First, it supports parameter substitution, which gives you more options when creating your
    code. Second, it creates query execution plans that are more likely to be reused by SQL Server, which
    in turn reduces overhead on the server, boosting performance.
    Sp_executesql executes a string of Transact-SQL in its own self-contained batch. When it is run, SQL
    Server compiles the code in the string into an execution plan that is separate from the batch that
    contained the sp_executesql and its string.
    Learn more about how to use sp_executesql in the SQL Server Books Online. [7.0, 2000] Added 3-7-
    2001
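    For example, a parameterized string whose execution plan can be reused across different parameter
    values (the query uses the pubs sample database):

    EXEC sp_executesql
        N'SELECT au_lname, au_fname FROM pubs.dbo.authors WHERE state = @state',
        N'@state char(2)',
        @state = 'CA'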

14. SQL Server will automatically recompile a stored procedure if any of the following happens:
      If you include a WITH RECOMPILE clause in a CREATE PROCEDURE or EXECUTE statement.
      If you run sp_recompile for any table referenced by the stored procedure.
      If any schema changes occur to any of the objects referenced in the stored procedure. This
       includes adding or dropping rules, defaults, and constraints.
      New distribution statistics are generated.
      If you restore a database that includes the stored procedure or any of the objects it references.
      If the stored procedure is aged out of SQL Server's cache.
      An index used by the execution plan of the stored procedure is dropped.
 A large number of INSERTs, UPDATEs, or DELETEs are made to a table referenced by the stored
       procedure.
      The stored procedure includes both DDL (Data Definition Language) and DML (Data Manipulation
       Language) statements, and they are interleaved with each other.
      If the stored procedure performs certain actions on temporary tables.




15. One hidden performance problem of using stored procedures is when a stored procedure
    recompiles too often. Normally, you want a stored procedure to compile once and to be stored in SQL
    Server's cache so that it can be re-used without it having to recompile each time it is used. This is one
    of the major benefits of using stored procedures. But in some cases, a stored procedure is recompiled
    much more often than it needs to be recompiled, hurting your server's performance. In fact, it is possible
    for a stored procedure to have to be recompiled while it is executing! Here are three potential problems
    you want to look out for when writing stored procedures.
    Unnecessary Stored Procedure Recompilations Due to Row Modifications and Automated Statistics
    Update
    If your database has the "Auto Update Statistics" database option turned on, SQL Server will
    periodically automatically update the index statistics. On a busy database, this could happen many
    times each hour. Normally, this is a good thing because the Query Optimizer needs current index
    statistics if it is to make good query plan decisions. One side effect of this is that this also causes any
    stored procedures that reference these tables to be recompiled. Again, this is normal, as you don't want
    a stored procedure to be running an outdated query plan. But again, sometimes stored procedures
    recompile more than they have to. Here are some suggestions on how to reduce some of the
    unnecessary recompilations:
 Use sp_executesql instead of EXECUTE to run Transact-SQL strings in your stored procedures.
         Instead of writing one very large stored procedure, break the stored procedure down into two or
          more sub-procedures, and call them from a controlling stored procedure.
         If your stored procedure is using temporary tables, use the KEEP PLAN query hint, which is used
          to stop stored procedure recompilations caused by more than six changes in a temporary table,
          which is the normal behavior. This hint should only be used for stored procedures that access
          temporary tables a lot but don't make many changes to them. If many changes are made, then
          don't use this hint.
    Unnecessary Stored Procedure Recompilations Due to Mixing DDL and DML Statements in the Same
    Stored Procedure
    If you have a DDL (Data Definition Language) statement in your stored procedure, the stored procedure
    will automatically recompile when it runs across a DML (Data Manipulation Language) statement for the
    first time. And if you intermix both DDL and DML many times in your stored procedure, this will force a
    recompilation every time it happens, hurting performance.
    To prevent unnecessary stored procedure recompilations, include all of your DDL statements at the
    beginning of the stored procedure so they are not intermingled with DML statements.
    Unnecessary Stored Procedure Recompilations Due to Specific Temporary Table Operations
    Improper use of temporary tables in a stored procedure can force them to be recompiled every time the
    stored procedure is run. Here's how to prevent this from happening:
         Any references to temporary tables in your stored procedure should only refer to tables created by
          that stored procedure, not to temporary tables created outside your stored procedure, or in a string
          executed using either the sp_executesql or the EXECUTE statement.
         All of the statements in your stored procedure that include the name of a temporary table should
          appear syntactically after the temporary table.
         The stored procedure should not declare any cursors that refer to a temporary table.
         Any statements in a stored procedure that refer to a temporary table should precede any DROP
          TABLE statement found in the stored procedure.
         The stored procedure should not create temporary tables inside a control-of-flow statement.

16. To find out if your SQL Server is experiencing excessive recompilations of stored procedures, a
    common cause of poor performance, create a trace using Profiler and track the SP:Recompile event. A
    large number of recompilations is an indicator that you potentially have a problem. Identify which
    stored procedures are causing the problem, and then take corrective action (if possible) to reduce
    or eliminate these excessive recompilations. [7.0, 2000] Added 9-13-2001
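If you would rather run a server-side trace than the Profiler GUI, something like the following sketch
should work on SQL Server 2000 (the trace file path is hypothetical). Event class 37 is SP:Recompile;
data columns 1 and 12 are TextData and SPID:

-- Create a stopped trace; SQL Server appends .trc to the file name
DECLARE @TraceID int, @on bit
SET @on = 1
EXEC sp_trace_create @TraceID OUTPUT, 0, N'C:\recompile_trace'

-- Capture TextData and SPID for the SP:Recompile event class
EXEC sp_trace_setevent @TraceID, 37, 1, @on
EXEC sp_trace_setevent @TraceID, 37, 12, @on

-- Start the trace
EXEC sp_trace_setstatus @TraceID, 1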

17. Stored procedures can boost performance even more when they are called via Microsoft Transaction
    Server (MTS) instead of directly from your application. A stored procedure's plan can be reused
    from the procedure cache only if the connection settings of the calling connection are the same. If
    different connections call a stored procedure, SQL Server must load a separate copy of the plan for
    each connection, which somewhat defeats the purpose of stored procedures. But if the same connection
    calls a stored procedure repeatedly, the cached plan can be reused over and over. The advantage of
    Transaction Server is that it pools and reuses connections, which means stored procedure plans can
    be reused more often. If you write an application in which every user opens their own connection,
    stored procedure plans cannot be reused as often, reducing performance. [7.0, 2000] Added 10-12-2000



18. Avoid nesting stored procedures, although it is perfectly legal to do so. Nesting not only makes
    debugging more difficult, it makes it much more difficult to identify and resolve performance-related
    problems. [6.5, 7.0, 2000] Added 10-12-2000

19. If you use input parameters in your stored procedures, you should validate all of them at the
    beginning of your stored procedure. This way, if there is a validation problem and the client
    application needs to be notified, it happens before any stored procedure processing takes place,
    preventing wasted effort and boosting performance. [6.5, 7.0, 2000] Added 10-12-2000
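A minimal sketch of this pattern (the procedure, parameters, and error text are hypothetical): every
parameter is checked up front, and the procedure returns before doing any real work if a check fails:

CREATE PROCEDURE usp_GetCustomerOrders
    @CustomerID int,
    @StartDate datetime
AS
-- Validate all input parameters before any processing takes place
IF @CustomerID IS NULL OR @CustomerID < 1
BEGIN
    RAISERROR ('Invalid @CustomerID.', 16, 1)
    RETURN 1
END
IF @StartDate IS NULL OR @StartDate > GETDATE()
BEGIN
    RAISERROR ('Invalid @StartDate.', 16, 1)
    RETURN 1
END

-- Real work begins only after validation passes
SELECT OrderID, OrderDate
FROM Orders
WHERE CustomerID = @CustomerID
AND OrderDate >= @StartDate
GO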

20. When calling a stored procedure from your application, it is important that you call it using its
    fully qualified name, such as:
    exec database_name.dbo.myProcedure
    instead of:
    exec myProcedure
    Why? There are a couple of reasons, one of which relates to performance. First, using fully
    qualified names helps to eliminate any potential confusion about which stored procedure you want to
    run, helping to prevent bugs and other potential problems. But more importantly, doing so allows
    SQL Server to access the stored procedure's execution plan more directly, which in turn speeds up
    its performance. Yes, the performance boost is very small, but if your server is running tens of
    thousands or more stored procedure calls every hour, these little time savings can add up.
    [7.0, 2000] Added 3-7-2001

21. If a stored procedure needs to return only a single value, and not a recordset, consider returning
    the value in an output parameter. While output parameters are commonly used for error checking,
    they can be used for any purpose you like. Returning a single value in an output parameter is
    faster than returning it as part of a recordset. [6.5, 7.0, 2000] Added 8-1-2001
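A minimal sketch of the output-parameter approach (the procedure and table names are hypothetical):

CREATE PROCEDURE usp_GetOrderCount
    @CustomerID int,
    @OrderCount int OUTPUT
AS
-- Return the single value through the output parameter, not a recordset
SELECT @OrderCount = COUNT(*)
FROM Orders
WHERE CustomerID = @CustomerID
GO

-- Calling the procedure and retrieving the value
DECLARE @Count int
EXEC usp_GetOrderCount @CustomerID = 42, @OrderCount = @Count OUTPUT
PRINT 'Order count: ' + CONVERT(varchar(10), @Count)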



