Database Programming Standards for Microsoft SQL Server by lq3426


									Company XYZ Database Programming Standards for Microsoft SQL Server

Version 0.9 Microsoft Confidential

Contributors: Manish Sinha, MCS Rob Oikawa, MCS

Document Revision History
Author Manish Rob Date 4/6/1998 9/10/1998 Status Original Document Created Updated and revised

For further information about this document please contact: Manish Sinha, Principal Consultant Microsoft Consulting Services 10260 SW Greenburg Rd, Suite 600 Portland, OR 97223 Tel: (503) 452-6411 Fax: (503) 452-8314 Internet:


Microsoft Consulting Services

Page 2


4 4 4 4 5 5 7 8 9 9 10 10 11 11 11 12 12 12 12 13 13 13



Microsoft Consulting Services

Page 3


1.1. Purpose of the document

The purpose of this document is to document the database programming standards that will be used in the database design for the XYZ company ABC software development project. The same naming conventions (as documented here) can be used in all subsequent database projects. Hence by following this naming convention a standard can be achieved across the board in the XYZ company This could help in achieving the corporate database goal - one common data dictionary in one common naming convention all across the corporation.


Microsoft SQL Server Programming Standards

The use of programming conventions has multiple benefits for a development organization relating to developer’s efficiency and the code’s readability, reusability, and maintainability of that code. These benefits make it well worth the time and effort developers initially spend to learn an enterprise set of coding standards as well as the time invested in enforcing those standards. This document specifically addresses the problem of naming identifiers and database objects within the Microsoft® SQL Server environment. In order for the client to derive these benefits this document's objectives are:  To help programmers (especially in a multi-programmer project like the one being worked on here), standardize the structure, coding style and logic of SQL code required in constructing a SQL Server relational database. To create precise, readable, and unambiguous source code. To be consistent with other language conventions (most importantly, the languages which may be used on this project). To be efficient in creating/replicating and modifying SQL Server databases. To define the minimal requirements necessary to do the above.

   


General Guidelines- Use of Database Design Tools

Every project that involves significant design (or redesign) of databases should take advantage of a database design tool. Any tool used should be able to create, at a minimum, the core physical database from the design. Subsequent changes to the database should be made by modifying the design in the tool and then generating a SQL script that can be run against the existing database, or if the changes are major, recreating the entire database. Examples of such design tools include S-Designer and Erwin. .


Microsoft Consulting Services

Page 4


Save SQL Scripts as ASCII Text

Save all SQL statements as ASCII text. These include the Data Definition Language (DDL) that creates the database, the Data Control Language (DCL) that administers the database, and the Data Manipulation Language (DML) that allows users to manipulate the database. This has several advantages such as allowing:        Creation or replication of the database automatically by loading the text file and executing the script in a SQL administration tool Use of version control software such as Microsoft Visual SourceSafe will be used. Use of a different editor (if required) Use of automated tools, such as Grep Creation of code generation or CASE tools in Microsoft® Visual Basic or Microsoft® C++ for manipulating the database Performance of external analysis of your SQL code Cutting and pasting SQL statements as embedded SQL in applications


Naming Conventions

The following naming conventions are derived from Hungarian notation. Hungarian notation is as valuable in SQL Server as it is in C or Visual Basic. Although the database design indicates the type of a column, these designs are not always accessible to all the developers who might need to write or maintain code or interact with the database. Here are some examples:    ipkCustomer - Represents an index on the primary key column Customer memoCustomer A memo text column Customer vsCustomer A variable length string column Customer

Each of these column names tells a programmer something very different. This information is lost when the object name is reduced to Customer (these example names should be further qualified with a better base description). These variable prefixes are also valuable when transferring data between databases from different vendors an "integer" from one vendor may not be the same as an "integer" from another vendor. Use of the notation can be evaluated on a project by project basis. For example some DBMS do not allow mixed case database object names. These databases would require the use of all uppercase names using underscores between "words" (VS_CUSTOMER) or there may be a need to conform to a naming standard already in place on an existing system. Although the use of Hungarian notation stands on its own merits, Hungarian notation is also widely used by Windows C programmers, constantly referenced in Microsoft product documentation and in industry programming books, and the basis for the client standards and guidelines for C and Visual Basic. Since many developers will contribute to both front-end interface and back-end database teams and to facilitate ongoing communication between all groups, client's SQL coding standards and guidelines are also based on Hungarian notation.


Microsoft Consulting Services

Page 5

Variable and function names have the following structure (in basic Hungarian notation): <prefix><body><qualifier> Part <prefix> <body> <qualifier> Description Describes the use and type of the variable or object. Describes the variable or object. Denotes a derivative of the variable or object. Example spGetRecordNext spGetNameFirst spGetNameLast

Prefixes The following type abbreviations for variables and database objects should be used in addition to the basic variable prefix types presented in other documents:

Abbreviation l w tw d f c tm s

Description integer - = 4 bytes (long in VB and C++) smallint - small integer = 2 bytes (integer in VB word in C++) tinyint - tiny integer = 1byte float - Floating point number = 8 bytes (double in VB and C++) real - SQL real number = 4 bytes (single in VB and float in C++) money - Decimal currency type = 8 bytes (currency in VB) smallmoney - Decimal currency = 4 bytes char(n) - Fixed length string up to 255 characters (fixed length string in VB null terminated fixed length string in C++) varchar(n) - Variable length string up to 255 characters (variable length string in VB) text - A variable length text field generally used to contain strings > 255 characters in length (often called a memo field) binary - Contains fixed byte sized binary quantities up to 255 bytes in length varbinary - Contains variable byte sized binary quantities up to 255 bytes in length






Microsoft Consulting Services

Page 6


image - A variable length binary field generally used to store images > 255 bytes in length (binary large object) datetime - Contains a date and time quantity = 2(4 byte integers) smalldatetime - Contains a date and time quantity = 2(2 byte small integers) bit - Holds either a 1 or 0 = 1 bit timestamp - SQL Server timestamp value. Does not contain a temporal quantity but instead it contains a sequential integer used to order transactions = varbinary(8). sysname - user defined datatype used in system table = varchar(30) user_type_name - user defined datatype

dt sdt b tmstmp

sysn udt


Database Objects

The basic model allows any typed identifier to be named, but the different types of objects provided in a relational database require that relationships and identity be expressed within the naming convention. The following table specifies supplemental naming rules and guidelines for selected database objects. The Required/Optional field indicates whether the naming rule should be considered a mandatory part of the convention or simply a suggested guideline. Depending on the database tools, using all optional naming prefixes keeps object types together when viewing and sorting by object name (some tools do view and sort by object type automatically).



Required/ Optional Required



Table names should reflect the entity types they represent. It is sometimes useful to prefix a table with tbl. Column names should be well formed Hungarian identifiers which represent the type of data to be stored. For readability, avoid underscores, use mixed case instead The database name is a rather arbitrary identifier, however ideally it should somehow provide a indication of the entity group or application it is associated with. Also qualifiers for production, maintenance, test, and development are useful.

tblCustomer customer table.




wWidgetCount - integer count of widgets.



Accounting - Database for use with all accounting programs. AccountingDev Database for development of accounting programs.


Microsoft Consulting Services

Page 7


The name of the default should be prefixed by def. An index name should be formed by taking the name of the column it is to be placed on and appending the prefix "i(keytype)". (keytypes are: pk = primary key, ak = alternate key, or ix if it is not a key column) Prefix stored procedures with and sp. Use mixed case rather than underscores. This separates coded stored procedures from the stored procedures included with the database (prefixed with sp_). As a guideline, it is effective to use the same name-qualifier for the SQL Server procedure and the corresponding application function call. The name of the rule should be prefixed by rule.


defStatus status. (keytype




Required optional)

ipkCustomerID - Index on the primary key column Customer ID.



spAddUser - Would be called from a function called AddUser in the application.



ruleWidgetCount - Rule governing legal counts of widgets. udtwWidgetSize Datatype used to contain widget sizes. The base type prefix w is used. trgdWidgetRef - Delete trigger to check widget references.

User Defined Types

User types should be named to maintain consistency with the type names used in the client application code. Use the prefix udt with the prefix for the base type for building identifiers. Use a prefix of trg to specify a trigger. In order to be able to distinguish the type of trigger, add the type of the trigger to the trg prefix. The following codes should be used: i – insert u – update d – delete






Database views assemble information from multiple tables in a specified format. Views should be used to present information to the users in the appropriate way. For example, when an underlying database uses the naming conventions described above, column/field names such as "wWidgetCount" and "vsCustName" can be aliased into "Widget Count" and "Customer Name". Views can also serve as a security device for the database. Users can be granted access to views appropriate to their user class while the underlying database table is protected or hidden.


Microsoft Consulting Services

Page 8


Coding Style - General SQL Statements

Your SQL statements, whether they are included in scripts or in code, should be formatted so that they are easy to read. The most aesthetically pleasing way to do this is to put your database identifiers in mixed case, but leave your SQL keywords in upper case. The use of mixed case is required in Hungarian notation for visually separating the prefix, body and qualifier in names. Avoid the use of underscores as "word" separators and use mixed case instead. For example use: trgdWidgetRef instead of: trgd_Widget_REF This saves character space allowing your names to be more descriptive and is also easier to read. It is also advantageous to justify your statements and split them across lines to highlight each phrase of the statement. For example: SELECT FROM WHERE ORDER BY Person.vsFirstName, Person.vsLastName, Person.wID Person Person.dtBirthDate - SYSDATETIME > 75 YEARS Person.dtBirthDate DESC (Delete trigger to check widget references.)

This clearly offsets the statement and its individual parts.


DDL Statements

As another example, it is best to put a CREATE TABLE statement on a line by itself, and list the table's columns and their data types on subsequent lines. By carefully formatting the statement, you can make it easy to tell when an identifier exceeds the maximum length imposed by the database management system, and also lay out column constraints. A well-formatted CREATE TABLE statement might look something like this: CREATE TABLE Player ( wPlayerID INTEGER wJerseyNumber INTEGER vsFirstName VARCHAR(30) vsLastName VARCHAR(30) cSalary MONEY dtBirthdate DATETIME )


The use of spaces between the column names and data types, as well as the data types and column constraints, keeps the script very easy to read. It also highlights the features of each script. The datatype keyword can be set a defined number of positions away from the beginning of the column name, this can show any identifier which is too long to fit in the normal column structure for the target database. You can greatly speed this work by using an editor that can handle irregular tab stops. For this example, we could set tabs at the 4, 34, and 43 positions. By keeping only one column definition to each line, we make it easy to reorder existing columns, or delete or insert columns. The placement of the parenthesis also helps this cause. Notice that we always put usersupplied identifiers in mixed case and database keywords in uppercase.

Microsoft Consulting Services

Page 9


Stored Procedure Style

Format CREATE PROCEDURE statements so the declaration of the procedure is visually isolated from the procedure definition. The most convenient way to do this is to simply include a blank line. Carefully indent conditionals and flow-of-control statements, just as in any other programming language. It is a good idea to refrain from using parameter names and local variable names that are not unique to other database object names. For example, if there is a "PartNumber" column in a "Parts" table, don't call the stored procedure's parameters "PartNumber" or "Part". This is an example of a well-formatted CREATE PROCEDURE command for a SQL Server stored procedure: Create Procedure spEnumMajorCode As /* * * * * * * * * */

Name: Inputs: Selection Set: Output Params: Return Value:

spEnumMajorCode None wMajorCode, vsMajorCodeName None None

Retrieve all the Major Codes

SELECT wMajorCode, vsMajorCodeName FROM MajorCode ORDER BY wMajorCode GO Grant Execute on spEnumMajorCode to salesupgroup GO Here, the "AS" keyword is on a line by itself to break the declaration of the procedure from its definition. The use of indenting makes the statement more palatable (parentheses are also useful in more complex operations). This code has a comment to describe the function of the procedure. Additional comments might indicate who calls it and who is in charge of maintaining it. Just as you would for functions in a procedural language, try to make stored procedures as general-purpose as possible without compromising execution efficiency.


Data Manipulation Language

Most of the development work required by an application is done in the data manipulation constructs that SQL provides. This portion of the language allows applications to retrieve, change, or remove data from the database. Because the bulk of a program's execution time is spent with these statements, it is most important to assure that they effective and efficient. The guidelines here are generally aimed at clarity. The

Microsoft Consulting Services

Page 10

few performance suggestions offered pertain more to using the correct statement rather than correctly optimizing a statement.



When coding a join, it is typically most important to assure the logic of the statement is correct. Beware of border-line conditions which can invalidate the conditions or the statement . As joins become more and more complicated, it becomes less clear where columns come from. As a result, the most important part of writing a readable join is to qualify column references with a table name. There is no need to require table qualification for every column, but sparing use can make statements more readily understandable without referring back to the database design models or the database schema. For example: SELECT Customer.vsName, Customer.vsAddress FROM Customer, Salesman WHERE Customer.wRepresentative = Salesman.wSalesmanID AND Customer.vsContact = 'Gottfried, Gilbert' ORDER BY Customer.vsName



SQL Server often performs better with subqueries than with joins (this may not be true for other DBMSs). Subqueries can also be much easier to follow understand than joins. However, if performance is poor on a subquery consider trying to code a join instead. If the subquery limiting clause includes "IN" or "NOT IN", it is possible that a join would be better. Many subqueries can also be implemented by joining a table against itself. When a subquery is used, be sure you're using the appropriate type; correlated subqueries (: repeating subqueries that depend on the outer query for values) and uncorrelated subqueries are extremely different, both in result set and in performance. Remember that a subquery may cause records to be locked, just like any other data manipulation statement. When coding a subquery, carefully indicate which columns the subquery will reference. It is best to do this by using aliases in the major select, because this can be practiced with both correlated and uncorrelated subquery.


The WHERE Clause

The WHERE clause of a SELECT statement contains much of a SQL command's power. Crafting a logically correct and efficient WHERE clause should be a developer's first priority. When using complex boolean expressions, for example, it is important to indent subordinate expressions and use parenthesis to force a given order of evaluation. For example: WHERE (Status = 'CANCELED' OR Status = 'INVALID') AND VALUE > 35000


Microsoft Consulting Services

Page 11

Never assume that a database may evaluate operators with a certain precedence because this is not defined in the ANSI SQL standard.



When using the ORDER BY or GROUP BY operators, refrain from using column numbers to indicate the operative columns (although this is necessary for UNION operations). This feature is not implemented by ANSI Level One database systems. SQL syntax requires that you place the GROUP BY and ORDER BY clauses at the end of your statement, after the WHERE clause. The HAVING keyword should appear after a GROUP BY. Note it is sometimes possible to use HAVING without a GROUP BY, so any code which generates dynamic SQL should be coded to handle this condition, if it is possible for the user to request it. Because this situation can produce inconsistent results it should be avoided.


INSERT Statements

The SQL INSERT statement puts new data into a table. You should always provide a column list for your INSERT statement, and never rely on the proper ordering of columns by the database. Be careful when performing an insert into a table from which you're doing a subselect in the same statement; not all database systems support this in the same way.


Working with Database Scripts

Database scripts must be carefully maintained and planned so that they are available for the greatest number of uses. All sites that have an instance of the database need the scripts to install the database, generate the tables and indexes, and possibly load data (it is possible to do this remotely, but scripts still help to automate the process). They are also useful for generating test and maintenance databases in parallel with the development or production databases. Always clearly comment scripts to indicate revision levels and change notes. You should only use scripts created by database management tools when it is inappropriate or impossible to create your own.


Types of Scripts

Always include the commands to create database objects, such as tables, indexes, and triggers, in a script separate from your data scripts. This gives you the greatest flexibility in deciding which scripts and data to install at any given point in time. It is also useful to separate index creates from table creates. This enables the creation of the tables, and inserting of data without built indexes, which is faster (especially when doing bulk loads). Keeping index creation scripts separate, also allows indexes to be tuned during development and as the database changes. Triggers, Rules, and Defaults can also be separated to allow easy changes independent of creating the database tables.


Microsoft Consulting Services

Page 12

Insertion scripts should do exactly that; insert data. They should not create other database objects, except when it is necessary to have a temporary table for the preparation of insert data. If the script does this, it should delete the table and its indexes before terminating. Insertion scripts also should not delete existing data. If this capability is needed, develop a deletion script to compliment each insertion script as required. When required a database installation script can be created. These should be logically laid out and carefully commented. Enhance error checking and handling as much as possible to make the database installation as smooth as possible. Consider how storage and partitioning parameters which may appear throughout the scripts might need to be changed. Ensure that any separated Index, Trigger, Rule, or Default scripts are included. It is often convenient to provide scripts for the database which customize the database or provide different runtime options in the system. Each of these functions should be accommodated by separate "option" or "patch" scripts, which can optionally run after executing the complete suite of stock scripts.


Other Issues - Embedded SQL and Precompilers

Use of embedded SQL is not recommended at the client level. The recommended approach is to use call level APIs such as ODBC or DBLIB. However, if it is necessary to use embedded SQL in a language like C, follow the guidelines laid forth by the designer of the precompiler. This assures your code will be efficiently and accurately processed by the precompiler, resulting in a minimum of bugs to resolve. Precompiler directives should be used in a clear format. Remember that most third-generation languages are not sensitive to white space, so concentrate on formatting code for the humans that will read it, not the machines.


Dynamic SQL

Dynamic SQL is an important asset of most modern databases. Its use offers a great deal of power and flexibility. However, because the SQL statement is not clearly visible before run time, code readability is compromised. As a result, it is more important than ever to clearly comment code. Include samples of the SQL that can be generated by different functions in comments, and explain why each one is used. What parts of the statement can change? Where do they come from? Answers to these questions are extremely important to maintenance and support teams, and should be readily available from source code comments or technical program documentation.


Use of a Data Access Layer

Many database applications at client will use high level languages like Microsoft Visual Basic as the front end to the SQL engine. It is important to carefully design the application so that the dividing line between SQL/database code and the application code is very well defined. This again aids maintenance and support teams, who will need to identify the affected sections of code when a database change is required.


Microsoft Consulting Services

Page 13

To top