Deployment
Document Sample


Deploying and Maintaining Microsoft SQL Server 7.0
in an In-Store Environment
Contents
Introduction ............................................................................................................................................. 4
Infrastructure .......................................................................................................................................... 5
Network Infrastructure ....................................................................................................................... 5
Network Architectures and Protocols ......................................................................................... 5
Windows NT Domain Models .................................................................................................... 6
Single Domain Model .......................................................................................................... 6
Master Domain Model ......................................................................................................... 6
Multiple Master Domain Model .......................................................................................... 6
Complete Trust Domain Model ........................................................................................... 6
No Domain (Workgroup) Model ......................................................................................... 7
Domain Models with a BDC in Each Store ......................................................................... 7
Recommended Domain Model ............................................................................................ 7
Hardware Infrastructure ..................................................................................................................... 8
Configuration .......................................................................................................................................... 9
Character Sets and Sort Order ..................................................................................................... 9
master and tempdb Databases ..................................................................................................... 9
Database Growth Parameters .................................................................................................... 10
Logging ..................................................................................................................................... 10
SQL Server Configuration Parameters ...................................................................................... 11
Implementation ...................................................................................................................................... 12
Deployment ............................................................................................................................................ 13
Remote Access Service ............................................................................................................. 13
Remote Control Software .......................................................................................................... 14
Software Distribution and Management Software..................................................................... 14
Recommendations ..................................................................................................................... 14
Operations.............................................................................................................................................. 16
Remote Administration .................................................................................................................... 16
Jobs .................................................................................................................................... 16
Operators ........................................................................................................................... 16
Alerts ................................................................................................................................. 16
Recommended Centralized Administration Scenario ......................................................... 17
Backup and Recovery ...................................................................................................................... 17
Full Database Backup ........................................................................................................ 17
Differential Backup ............................................................................................................ 17
Transaction Log Backup .................................................................................................... 18
Recommended Backup Scenario........................................................................................ 18
Recommended Recovery Scenario..................................................................................... 18
Database Maintenance ..................................................................................................................... 18
2
Polling ..................................................................................................................................................... 20
Snapshot Replication................................................................................................................. 20
Merge Replication ..................................................................................................................... 20
Transactional Replication ......................................................................................................... 21
Replication Implementation Details .......................................................................................... 21
Recommended Replication Techniques .................................................................................... 22
Connected Stores ............................................................................................................... 22
Disconnected or Occasionally Connected Stores ............................................................... 22
Pros and Cons for Replication as a Polling Solution ................................................................. 22
Pros .................................................................................................................................... 22
Cons ................................................................................................................................... 23
bcp Utility ................................................................................................................................. 23
Data Transformation Services (DTS) ........................................................................................ 24
Two-Phase Commit ................................................................................................................... 24
Troubleshooting..................................................................................................................................... 25
Support Infrastructure ............................................................................................................... 25
Automated Alerts ...................................................................................................................... 25
Remote Monitoring ................................................................................................................... 26
Licensing ................................................................................................................................................ 27
Conclusion.............................................................................................................................................. 29
Finding More Information ................................................................................................................... 31
3
Introduction
Traditionally, in-store systems have been developed using proprietary, nonrelational database
platforms. With the advent of strong competition in the retail industry, companies are forced to
focus on increasing efficiencies within the store. These efficiencies translate directly into better
customer service, greater throughput of customers, greater revenue, and increased availability
of data for the corporate merchandising departments.
The database platform plays a critical role in accomplishing efficiencies through the use of
technology within the in-store environment. The database needs to provide strong online
transaction processing (OLTP) performance in a concurrent environment to handle the
transactions during the day, and also strong decision support performance for end-of-day and
reporting activities. In addition, the ability to automate and customize the transfer of data to
corporate warehouses is mandatory.
Price/Performance ratios are also important factors in selecting an in-store database platform.
Organizations need to consider the number of licenses required for supporting all of their
stores and the performance required for achieving all of their business goals. Perhaps the most
critical characteristic of an ideal in-store database platform is its ability to support automated
and remote administration because each outlet in most organizations will not be staffed with
technically oriented employees. Microsoft SQL Server version 7.0 delivers in all of these
areas. It is a compelling choice for an in-store database platform.
This document is designed to provide the development and implementation teams of a retail
organization with the information needed to deploy and maintain SQL Server 7.0 successfully
in an in-store environment. General recommendations for the optimal solution are provided
wherever applicable and various options are discussed in each section, along with the
respective pros and cons. If any of the recommendations conflict with the specific business
needs of an organization, they should be overridden for a more appropriate choice.
4
Infrastructure
Before SQL Server 7.0 is deployed, the network and hardware infrastructure within the store
should be examined. This topic discusses the various options available within each
infrastructure and provides some general recommendations. For the purposes of this topic and
the rest of the document, it is assumed that the Microsoft Windows NT version 4.0 operating
system is the operating system used on both the server and the client.
Network Infrastructure
The issues discussed in this topic include network architecture and protocols, and the
Windows NT domain model.
Network Architectures and Protocols
Ethernet is the most popular LAN architecture in use today and is a good choice for an in-
store LAN because it provides good performance with low startup and administrative
costs.
Fast Ethernet implemented on Unshielded Twisted Pair Category 5 cabling is a good
solution for increasing in-store performance significantly (to 100 Mbps) at a reasonable
cost.
Analyze the network for points of failure. Fault tolerance should be built into the
architecture to eliminate single failure points whenever accommodated by the budget.
All in-store communication, as well as Internet/WAN access, should use the TCP/IP
protocol. This protocol is supported on most operating systems and is considered a
universal standard for internetwork connectivity. It also allows the in-store computers to
support Microsoft Internet Information Services (IIS), HTTP, FTP, SMTP, and streaming
media applications.
Static IP addressing is preferred over Dynamic Host Configuration Protocol (or DHCP)
for the distribution of IP addresses to devices in a small in-store environment because this
reduces the administration workload significantly. However, in-store environments with a
larger number of in-store devices would prefer using DHCP for distribution of IP
addresses to static IP addressing due to ease of administration. Windows Internet Name
Service (WINS) and Domain Name Service (DNS) should be implemented to provide
name resolution services.
In general, standardize on as few network protocols as possible within the store in order to
maximize communication efficiency.
5
Windows NT Domain Models
Selecting the proper domain model is critical to the overall network performance of the
organization. This topic discusses the issues involved in implementing Windows NT Server
version 4.x domains. Microsoft Windows 2000 Server should scale much higher than the
limits specified in supporting larger domains. Furthermore, the limits specified in this topic
represent recommended limits as opposed to hard and fast rules for implementation.
Single Domain Model
Consists of one domain for the entire organization. While this centralizes account and
resource management, thus maximizing ease of administration, it limits the number of
accounts to approximately 40,000 (the recommended Windows NT 4.x domain limit).
Requires the domain controllers to have additional processing power due to the increased
workload.
Requires the least amount of administration among all of the domain models because there
are no trusts to be maintained.
Master Domain Model
User accounts are managed centrally in a master domain while decentralized resource
domains manage the resources. The number of accounts is limited to approximately
40,000.
Domain controller in the master domain will need additional processing power.
Each resource trusts the master domain (one way trust); therefore, administration cost is
more than the single domain model, but less than the other domain models.
Multiple Master Domain Model
Similar to the master domain model except that multiple master domains are allowed.
Therefore, this model supports a virtually unlimited number of accounts.
Domain controllers in the multiple master domains share the workload of managing the
accounts.
More trusts to manage than the master domain model; therefore, higher centralized
administration costs are incurred.
Complete Trust Domain Model
Decentralized account and resource management. Supports an unlimited number of users.
Unlike the other three models, domain specific account policies are supported.
Highest administration costs due to the large number of trusts involved. This is the least
desirable model in an enterprise environment and is suited more for a departmental
implementation.
No performance penalties are involved due to localized authentication for accounts and
resources.
6
No Domain (Workgroup) Model
There are no domains; each store has its own workgroup instead.
Decentralized account and resource management. Supports an unlimited number of users.
Corporate administrators have to log on to each workgroup individually to apply changes;
therefore, the administration costs can be high if all the workgroups need the same
changes.
No performance penalties are involved due to localized authentication for accounts and
resources.
Domain Models with a BDC in Each Store
In order to avoid the performance penalty of incurring a WAN round trip for each
authentication request, the domain models listed earlier require the configuration of the in-
store server as a Backup Domain Controller (BDC).
Configuring the in-store server as a BDC adds some overhead (based on the size of the
account database) because it needs to synchronize the security accounts database with the
Primary Domain Controller (PDC) on a regular basis across a WAN or dial-up link.
Configuring each in-store server as a BDC does not support disconnected or occasionally
connected stores well. Upon restoration of the link to corporate, the workload on the in-
store servers can be large because the accounts database has to be synchronized in addition
to sending transactions and receiving lookup table information. Performance could be
affected on slow dial-up links.
Recommended Domain Model
The workgroup model (or the no domain model) is the preferred choice for in-store
environments that involve only a small number of client devices, primarily due to its low
cost and decreased maintenance complexity.
This model does incur high administration costs when changes have to be distributed
throughout all the stores. It is imperative that the account and resource setup be planned
prior to deploying workgroups to the stores so that the number of future changes is
minimized. Generic logins should be used to anticipate future in-store usage because the
primary user-based security will be handled by the in-store front-end application.
Except for the cases where changes need to be distributed to all the stores, the no domain
model incurs extremely low administration costs because there are no trusts to manage and
each store operates within its own workgroup.
In organizations with a larger number of in-store devices or a dynamic account list, a
workgroup model may increase administration costs; hence, a domain model is preferred.
In these cases, the single/multi master domain model provides the centralized account
management and decentralized resource management best suited for such an in-store
environment. The master domain would reside in the corporate location, and each store
would contain a server configured as a backup domain controller.
7
Hardware Infrastructure
Memory and processor speed on the server should be maximized based on the budget.
They also depend on the number and nature of applications running on the in-store server.
If the budget accommodates it, implement hardware (or software) RAID Level 1 (disk
mirroring) or RAID Level 5 (disk striping with parity) to prevent single point of failure in
the disk drives, and to maximize the performance of the system. Mirroring provides better
write performance (which is critical for OLTP applications) whereas disk striping provides
better read performance (which is critical for Decision Support Systems (DSS)
applications). However, striping requires a minimum of three drives to implement versus
two for mirroring. In general, mirroring is the preferred fault tolerance solution for an in-
store system. Although implementing RAID is highly recommended, it is not required for
SQL Server and it may not be practical for many retailers because it adds to the per-store
costs significantly.
Invest in the fastest disk drives supported by the server because they are always the
bottleneck in any computer configuration. The different versions of the SCSI interface
standard (SCSI, SCSI-2, FAST SCSI-2, WIDE SCSI-2) provide superior data transfer
performance compared to the disk drives using the IDE interface. FAST SCSI-2 is the
most popular choice on modern servers, but may have a higher cost associated with it
compared to the IDE drives.
For more information about hardware configurations for SQL Server computers, see the
Compaq white papers listed in “Finding More Information.”
8
Configuration
There are several configuration options that need to be set in SQL Server 7.0 during and after
the installation process in order to maximize the database performance. This section details the
configuration options and their respective settings that maximize performance and minimize
ongoing administration.
Character Sets and Sort Order
Character sets, also known as code pages, control the way character-based data is stored
internally in SQL Server. Each character set consists of 256 characters (128 standard printable
characters followed by 128 language-specific letters and symbols). Sort orders control the
collation and presentation of data in response to database queries. A sort order determines how
character data is compared, and in what sequence it is returned, from a query. Both of these
configurations are determined during the installation process.
Alternately, the Unicode standard for character representation handles 65,536 characters and
covers virtually every character in use today in the different alphabet sets. This representation
takes up twice the storage as a character, but provides support for storage of different alphabets
such as the European alphabet or Japanese Kanji. Generally, language independence can be
provided by the front-end application and the database should continue to store the data in a
single character set representation. The Unicode standard should be used if the application
involves storage of internationalized data. If the Unicode standard is selected, a Unicode
Collation method must also be provided during the installation process.
Character set should be used, instead of Unicode standard, as the data representation
method to optimize storage and performance, provided there are no requirements to store
internationalized data. The Unicode standard should be used if such requirements exist.
Standardize on the same character set and sort order throughout all of the SQL Server
installations in the organization in order to facilitate direct data transfer between servers.
The default character set (1252 ISO or ANSI) and sort order (case-insensitive dictionary
sort order) work well and should be used unless they conflict with specific business needs.
Do not use character sets to adhere to the language requirements; that should be controlled
by the front-end applications and not by the database.
master and tempdb Databases
The master database is a system database that stores information about all of the other
databases, as well as system configurations and all login accounts. The tempdb database is a
system database that provides temporary storage for user transactions.
During installation, the size of the master database defaults to 7.5 MB and the size of the
tempdb database defaults to 8 MB. If the number of user databases and login accounts is
limited, the default size for the master database should be sufficient. It is a good idea,
however, to increase the size of the master database to account for future growth and avoid
the expansion penalty (explained in the next topic).
9
If the application uses temporary tables, either by explicitly creating them or by using the
ORDER BY and GROUP BY clauses on large result sets, the initial size of tempdb should
be increased appropriately to avoid the expansion penalty. The average amount of space
used by each tempdb operation, multiplied by the average number of concurrent users, is a
good approximation of the ideal size of tempdb. Calculate the space used in a tempdb
operation by multiplying the number of rows being loaded into temporary tables by the
record size.
Given the limited amount of RAM available on the in-store servers, and the limited
number of users in an in-store environment, it is best to avoid creating tempdb in RAM.
RAM is best used by assigning it to SQL Server and using it primarily for data and
procedure caching.
Database Growth Parameters
SQL Server 7.0 automatically expands the size of a database as required during operation.
Either a percentage of the existing size or a fixed number of megabytes, as specified by the
administrator, controls the amount of expansion. While this feature significantly reduces
ongoing maintenance, it creates a performance penalty if the database is expanded frequently
Specify a larger initial database size instead of accepting the default (1 MB) when creating the
user database. Approximate the size based on minimum anticipated amount of data storage.
After the initial size of the database is filled, specify a number of megabytes (or a
percentage) by which to expand. Do not accept the defaults (10 percent of the existing
size, or 10 MB). The reasons for the initial size filling up must be analyzed prior to
making this decision.
Logging
In-store systems require up-to-the-minute recovery, for example, until the last committed
transaction. This functionality is provided in SQL Server 7.0 through transaction logging.
The database configuration option truncate log on checkpoint should be set to False so
that the transactions can be retained in the log file, which can then be backed up
periodically.
The transaction log should be backed up periodically to facilitate better recovery options.
Furthermore, if any nonlogged operations such as SELECT INTO or bcp are performed
on the database, differential backups must be done immediately to protect those
transactions. For information about the preferred backup strategy, see “Operations.”
The initial size of the transaction log file should be approximated using the number of
transactions estimated between each time the log file is backed up. This avoids the penalty
associated with constantly expanding the log file when the default of 1 MB is accepted.
10
SQL Server Configuration Parameters
SQL Server 7.0 allows users to determine the values for a number of configuration parameters
that drive its overall performance. Although this level of tuning may be required for
installations involving Very Large Databases (VLDs) running on high-end servers with large
disk subsystems, most in-store installations do not require it. The default parameter values
work well for in-store databases and should be maintained on all of the store servers. For more
information about setting configuration parameters, see the white paper titled “Microsoft
SQL Server 7.0 Performance Tuning Guide” at
http://msdn.microsoft.com/developer/sqlserver/sql7perftune.htm.
max async IO: The default value of 32 is sufficient for in-store systems. The overall goal
is to control the speed of “checkpointing” the database, for example, writing modified data
or log pages from the buffer cache to the disk subsystem and also truncating the inactive
portion of the log.
max worker threads: The default value of 255 is sufficient for in-store systems. This
controls the number of Windows threads used by SQL Server to service the commands
being submitted to the database at a given time.
Boost SQL Server priority on Windows NT: This allows SQL Server to run at a higher
priority than other Windows NT processes and should be used only if the computer is
dedicated to SQL Server. The default value is zero, which forces SQL Server to run at the
same priority as the other processes.
In general, in-store servers should take advantage of the autotuning and autoconfiguration
features built into SQL Server 7.0 and not change the default configuration parameter
values.
11
Implementation
This topic outlines the database design and implementation issues that affect performance,
administration, and the ability to troubleshoot from a remote location. These are general
recommendations and should not serve to override any conflicting business requirements.
Implement all data integrity checks on the database in addition to performing the same
checks in the front-end code.
Reduce the size of transactions as well as the amount of I/O whenever possible. This
increases performance and concurrency support.
Transactions that affect a large number of rows on a table that contains a complex trigger
could perform poorly in a concurrent environment due to the length and number of locks
held. Isolating the logic contained in the trigger into a stored procedure could improve
performance. In general, transactions should be made as compact as possible in order to
facilitate good performance in a concurrent environment.
Use constraints and Null/Not Null column properties to perform data validation instead of
using triggers.
SQL Server 7.0 has a new index storage mechanism, which does not significantly add to
the overhead involved in maintaining additional indexes. When in doubt, index the
column.
Use the power of stored procedures to reduce the number of client requests submitted to
the server.
Balance performance with scalability in the analysis for selecting the appropriate cursor.
Server-side cursors will generally provide better performance for a given query, but they
are not as scalable as client-side cursors for supporting a large number of clients. The
appropriate cursor should be selected based on the application needs and scalability
requirements.
For more information about performance tuning for SQL Server 7.0 applications, see:
http://msdn.microsoft.com/developer/sqlserver/sql7perftune.htm
12
Deployment
This topic discusses the issues involved in the deployment of SQL Server 7.0 on the in-store
computers. This includes the RDBMS installation on the in-store server and the client utilities
on the in-store clients. It is assumed that the initial installation and configuration of all
computers will occur in a centralized site using “ghosting” or other disk-copying methods. This
topic details the options for distributing updates (such as service packs, or bug fixes) to the in-
store devices for connected, as well as disconnected (or occasionally connected) stores.
Costs and ease of administration are the two overriding factors taken into consideration for the
recommendations provided in this section. In addition, the solutions must be applicable for
both permanent WAN links as well as standard dial-up links. The fact that the distribution
method selected for SQL Server will also be applicable for other software updates to the same
devices is also central to these recommendations. These are general recommendations and
should not serve to override any conflicting business requirements.
Remote Access Service
Remote Access Service (RAS) is built-in software provided with Windows NT that
facilitates dial-in access to the network on which the target Windows NT device resides.
Assuming that the bandwidth of the dial-in connection is sufficient, this is a solid, no-cost
solution for performing file transfers.
Besides support for dial-in access, RAS also provides support for using the Point-to-Point
Tunneling Protocol (PPTP) on top of a Point-to-Point Protocol (PPP) connection to the
Internet. This allows for a secure simulation of a LAN connection over an encrypted
public link, and also provides significant cost benefits. The costs are now reduced to a
fixed monthly fee paid to an ISP for Internet access. This can be significantly less
expensive than dialing long-distance to and from the store on a regular basis.
RAS provides access in the event of a loss of network connectivity and can be used to
provide connectivity in a cost-effective manner to stores located in remote areas where
network connectivity may not be feasible. However, this solution involves dialing in to
each store server individually, thereby increasing administration costs if updates need to
be distributed throughout a large retail chain.
Another option is to develop a custom application that incorporates RAS for disconnected
stores, and automates the task of connecting to each store and performing the necessary
transfers. The application can be designed to provide detailed session logs for each
connection, which can be monitored from a centralized location. The custom application
can also be designed to execute applications, update registry settings, and perform any
other programmable actions on the in-store server. This option can be more efficient than
packaged software solutions because the custom application will be designed to satisfy a
narrow, specific set of requirements.
13
Remote Control Software
Third-party tools, such as PCAnywhere, provide the ability to take control of target
computers and can be used to obtain access to an in-store computer. Basically, each
keystroke or mouse click on the caller’s computer is transferred and executed on the target
computer. Each resulting screen on the target computer is then transferred back to the
caller.
Although both network and dial-in connections are supported, this is a slow-speed solution
that requires significant bandwidth in order to be used as a deployment mechanism. This
solution is better suited for on-demand scenarios for one computer at a time, such as
troubleshooting or helpdesk monitoring scenarios.
Software Distribution and Management Software
This category of tools provides powerful software distribution and management
capabilities for the enterprise. These tools allow for centralized and automated
administration of tasks such as file transfers, remote program execution, and service pack
and patch updates; however, they cost more than both remote control and remote access
solutions.
While administration is easy and automated, it requires staff who are knowledgeable in the
chosen tool to be available at the centralized location.
Microsoft Systems Management Server is a good candidate for desktop management and
software distribution in a Windows NT environment. It can also be used for automated file
transfers, but it requires the implementation of a domain model instead of the simpler
workgroup model.
Sterling Commerce’s Connect:Remote is one of the leading software tools in use by large
retailers to solve the software distribution and management problem. It performs well over
a WAN connection and includes many sophisticated features, such as session logging and
remote program execution, along with the speed required for supporting an enterprise-
wide deployment effort.
Using a combination of SMS for desktop management, Connect:Remote for other tasks
such as polling, and remote program execution is also an option chosen by larger retail
organizations.
Recommendations
Packaged software such as Connect:Remote (or SMS in combination with
Connect:Remote) provides a scalable and reliable solution for tasks such as software
distribution, desktop management, polling, and automating bi-directional file transfers.
These third-party solutions have an advantage over custom RAS solutions in the areas of
scalability and reliability across a large enterprise-wide deployment. Many of
Connect:Remote’s sophisticated logging and administration features can be used, in
addition to its scalability and reliability, to justify the additional per-store costs.
Third-party solutions involve higher total costs, but are better suited for scaling in size and
complexity. Besides the software purchasing costs, they also involve additional
operational training and administration costs. Adding simple scripts to perform additional
tasks can also customize these solutions.
These third-party solutions will support both permanent WAN links as well as standard
14
dial-up links.
If the deployment needs are simplistic in size and complexity, and if the appropriate
development skill sets are available, the custom application solution of incorporating RAS
for disconnected stores is a flexible solution for the lowest cost. The application must
include features such as session logging and the ability to execute programs, updating
registry settings, and performing multithreaded or simultaneous operations on the in-store
server. In addition, the application must have a comprehensive user interface for the
centralized support personnel to monitor the status of all activities.
This solution involves minimal administration and training costs and can be customized to
the organization’s specific business needs. It adds significantly to the software
development and maintenance costs, and is limited in terms of scalability.
Using a PPTP connection on top of a PPP connection to the Internet should be explored as
an alternative to dialing in directly to or from the store. Based on usage patterns, this can
provide significant cost savings.
Remote control tools should be deployed at each store and used strictly for
troubleshooting and monitoring purposes. This greatly increases the ability to provide
remote administration and support, which is one of the critical requirements for any in-
store system.
15
Operations
This topic discusses the operational requirements for SQL Server 7.0 in an in-store
environment, including issues such as remote administration, backup and recovery, and
database maintenance.
Remote Administration
SQL Server 7.0 provides remote administration capabilities by using a master server to store
job definitions and target servers to execute them. The master server is usually defined at the
central administration site, while the target servers are located at each of the remote sites. The
target servers periodically connect to the master server to receive new task lists, and then
disconnect until after execution, when the job status information is relayed back to the master
server. In this manner, jobs are defined once and executed at multiple remote sites with the
status of each execution being captured back at the master server.
Jobs, Operators, and Alerts are the primary components of the administration function in
SQL Server 7.0. There are wizards available to define each of these components and they all
require the SQLServerAgent service to be running in order to execute.
Jobs
Consist of steps, which can include executable programs, Transact-SQL scripts,
Windows NT commands, Active Scripts, or replication agents.
Can be categorized to provide for easy filtering and grouping.
Allow control of sequence of steps with full error handling between each step.
Can be executed locally (standalone jobs) or on multiple remote servers multiserver
administration jobs).
Are executed by schedules and alerts.
Operators
One or more individuals responsible for maintaining the SQL Server databases.
Are notified of job status using e-mail, a pager, or a net send command.
Alerts
Consist of one or more SQL Server events and the response to each of those events.
Responses to events can occur in the form of operator notifications, job executions, raising
an SNMP trap, or forwarding an event to another server.
16
Recommended Centralized Administration Scenario
Configure a server at the corporate site to serve as the master server. Enlist the server at
each store site to serve as target servers.
Define primary and secondary operators at the centralized site who will be responsible for
responding to alerts.
Define the multiserver administration jobs (such as backup jobs or database maintenance
jobs) on the master server. Categorize each job appropriately.
Define On Success and On Failure actions for each step in the job. If an operator needs to
be alerted on failure, execute a RAISERROR WITH LOG statement to write the error
message and the severity to the Windows NT application log.
Define alerts to notify operators, using e-mail or a pager, based on the errors written to the
Windows NT application log. Alternatively, alerts can be defined to process jobs based on
certain errors.
Schedule the jobs on the respective target servers. All maintenance jobs can then be
automated and managed from a centralized location.
Support other tasks such as polling and software distribution from corporate to the stores.
In general, all of the SQL Server tools can be run from a remote location. Therefore, with
the correct security privilege, almost any task performed on the SQL Server host can be
performed from a remote location.
Backup and Recovery
In most situations, providing the ability to recover until the last committed transaction should
be a mandatory requirement for in-store systems. This can be accomplished in SQL Server 7.0
by using a combination of full database backups, differential backups, and transaction log
backups. SQL Server 7.0 backups can be performed while the database is live, and most
Transact-SQL statements are allowed during the backup operation.
It is good practice to store backups on an external media such as high-speed tape drives, and
also to use off-site storage for disaster recovery purposes. A more cost-effective alternative to
tape drives is to use the hard drives of other in-store devices to store the backup files. This can
provide significant cost savings across all stores and still provide acceptable performance and
recovery options
Full Database Backup
As the name implies, a snapshot of the entire database is saved. This can take a long time,
based on the database size. Every recovery must begin with restoring a full database backup.
Therefore, performing this backup periodically is recommended.
Differential Backup
Only the data changes made since the last full database backup are saved. This operation takes
a lot less time than the full database backup, and is generally performed more frequently.
17
Transaction Log Backup
All of the committed transactions since the last backup (full, differential, or transaction log
backup) are saved and then cleared from the log file. This operation takes very little time and
its frequency depends on the tolerance for data loss as dictated by business needs.
Recommended Backup Scenario
1. Perform a full database backup every night to a fresh tape (or to a new file on a hard drive).
2. Perform differential database backups every 2 hours, and immediately after any nonlogged
operation (such as SELECT INTO or bcp).
3. Perform transaction log backups every 15 minutes.
4. Backup jobs can be automated and managed from a centralized location. For more
information about automating and managing backup and other maintenance jobs, see
“Recommended Centralized Administration Scenario.”
Recommended Recovery Scenario
1. Back up the current active transaction log from the server’s hard drive to achieve up-to-the-
minute recovery. If it is not accessible, the database can be restored only to the state
represented by the last saved transaction log (which should be, at most, 15 minutes ago).
2. Restore the last full database backup.
3. Restore the last differential backup created since the last full database backup.
4. Restore each transaction log backup created since the last differential backup.
5. If available, restore the active transaction log backup created in Step 1.
6. These recovery steps can be implemented as a script and executed by an operator on
demand. For more information about managing jobs, see “Recommended Centralized
Administration Scenario.”
Database Maintenance
SQL Server 7.0 significantly reduces the need for frequent database maintenance routines and
provides wizards that automate the definition, execution, and monitoring of these tasks.
Although not required, these tasks can be scheduled periodically to provide peace of mind and
increase the comfort level of the corporate administrators. In addition, integrity checks should
be performed:
When integrity errors appear in the error log.
Prior to a major operating system or SQL Server upgrade.
Prior to a major hardware upgrade or repair operation.
18
In general, the following maintenance tasks can be scheduled using the Database Maintenance
Plan Wizard:
Update statistics used by the query optimizer (with a sampling size of 10 percent). This is
optional because it is executed automatically by default in SQL Server 7.0. However, if
desired, it can be performed on a biweekly basis or as frequently as necessary depending
on the number of transactions processed each day. This operation should be very fast for
most in-store databases and can be performed even when the database is online.
Perform internal data integrity checks. This can be scheduled to execute once a month
during off-peak hours.
Perform database integrity checks. This can be scheduled to execute once a month during
off-peak hours.
Using the wizard, you can also create text or HTML reports from the execution of each task. It
is then possible to store the historical job status information on a remote server, for example,
the master server at corporate.
19
Polling
This topic discusses the implementation issues involved in a common in-store task, namely,
polling. This is the process of feeding selected data, termed an article, from the stores to the
corporate databases periodically, as well as receiving articles from corporate that need to be
loaded in the in-store database. Reliability and the efficient use of bandwidth are the most
critical factors in designing a polling solution. In addition, the ability of the solution to
integrate into the existing software distribution solution (such as SMS or Connect:Remote) is
also explored in this section. This provides the advantage of simplifying the deployment and
support effort and reducing administrative costs by using a single tool to perform both software
distribution and polling.
Replication, the bcp utility, Data Transformation Services (DTS), and the two-phase commit
(2PC) protocol are explored in this section as possible solutions for polling.
Replication uses a publish-and-subscribe method for performing the data transfer. A
distribution database is used for persisting the data to be replicated. SQL Server 7.0 supports
bidirectional replication to both homogeneous as well as heterogeneous database platforms.
Replication allows for graceful handling of network failures because the data will be held at the
distribution database, and transferred once the connection is restored. Thus, replication
provides guaranteed delivery in an automated manner. Furthermore, the SQL Server 7.0
replication architecture is scalable up to hundreds of publishers (replication data sources) and
thousands of subscribers (replication data receivers). There are three major types of replication
provided in SQL Server 7.0.
Snapshot Replication
A current snapshot of the selected data, or article, is transferred from the Publisher to each
Subscriber.
This should be used primarily to synchronize publishers and subscribers if a large number
of changes have occurred since the last synchronization. This replication method should
also be limited to synchronizing a few tables at a time, not the entire database. It is
applicable only for read-only subscribers.
This method can require much bandwidth depending on the size of the article being
replicated.
Snapshot replication provides latent transactional consistency, and provides a good degree
of site autonomy if the subscriber does not update the data.
Merge Replication
Both the publisher and the subscriber are allowed to change the data contained in the
article and the two versions are synchronized using merge replication. Administrators
control the criteria for resolving conflicts between the two updates. The winner of the
conflict can be based on a user-specified priority scheme, a “first one wins” solution, or a
custom resolution using COM and stored procedures.
20
This replication method is best suited for scenarios where both the subscribers and the
publishers of data need to update the articles.
The number of conflicts to be resolved and the complexity of the resolution mechanism
are the principal factors in determining the performance of this type of replication.
Merge replication does not guarantee transaction integrity and provides a high level of site
autonomy because all sites can perform updates to the database.
Transactional Replication
INSERT, UPDATE, and DELETE statements are captured from the transaction logs and
stored in the distribution database to be applied to all the subscribers in the same order.
Transactional replication can also be used to replicate stored procedures whereby each
execution, along with its parameters, is forwarded to the subscribers and executed there.
This reduces the volume of commands and improves performance in some cases.
This method of replication is best suited for near real-time distribution scenarios and for
efficiently replicating data from multiple publishers to a single subscriber.
Transactional replication guarantees transactional consistency and requires that the
subscribers do not update the database.
Replication Implementation Details
The number of network round trips involved in transactional and merge replication is
controlled by the -CommitBatchSize (default = 100), -CommitBatchThreshold (default
= 1000), -TransactionsPerHistory (default = 100), and -PollingInterval (default = 3)
parameters.
The Distribution agent issues a query to get all transactions pending distribution. It then
sends these to the subscriber, committing the commands as determined by the -
CommitBatch parameters. When all available commands have been processed, it sleeps
for the specified PollingInterval, and then reissues the query to get commands pending
distribution.
If the subscriber is running SQL Server, commands are applied to the subscriber in batches
based on command type and memory consumption. For most data changes to tables, the
distribution agent batches up the commands and sends them to the subscriber when the
size of the batched commands exceeds 10 times the packet size (by default about 40 KB)
or when it needs to commit the transaction. If the subscriber is an ODBC data source,
commands are applied individually to the subscriber.
The distribution agent uses ODBC to query the publisher and then uses either ODBC or
OLE DB to transfer the data to the subscribers. The data is transferred in batches as
described earlier.
For snapshot replication, the distribution agent uses the bcp API to transfer the data to
each subscriber. The amount of data being transferred in this method will affect network
usage and performance directly.
21
Recommended Replication Techniques
Use the parameters described in “Replication Implementation Details” to control the number of
network round trips involved, and thus the overall performance. The default values for
determining the batch size may need to be increased based on the number of records being
transferred at a given time. The -PollingInterval parameter can also be increased if real-time
replication is not a requirement.
Connected Stores
Use transactional replication to transfer articles containing individual transactions from each
store to the corporate database periodically throughout the day (for example, once a shift). The
store databases are the publishers of these articles and the corporate database is the sole
subscriber.
Snapshot replication can be used to initialize the price updates and other lookup information at
each store. Thereafter, in general, transactional replication can be used to distribute the
changes to these types of information on a daily basis. If the number of changes is anticipated
to be unusually large at any given point, snapshot replication could be used to reinitialize each
store database, and transactional replication could be restored thereafter. The corporate
database is the publisher of these articles, and each store database is a subscriber.
Disconnected or Occasionally Connected Stores
Use merge replication to transfer all the changes since the last synchronization from the stores
(the publishers) to the corporate database (the subscriber). The frequency of the
synchronization will depend on the frequency of the connection to the corporate network.
Snapshot replication can be used to initialize the price updates and other lookup information at
each store. Thereafter, in general, transactional replication can be used to distribute the
changes to these types of information each time that a connection is established. If the number
of changes is anticipated to be unusually large at any given point, snapshot replication could be
used to reinitialize each store database, and transactional replication could be restored
thereafter. The corporate database is the publisher of these articles, and each store database is a
subscriber.
Pros and Cons for Replication as a Polling Solution
Pros
Replication provides a scalable and reliable solution for guaranteed delivery of data from
one or more sources to one or more destination databases. It also supports the transfer of
data to databases that are not running SQL Server (although it does not provide the ability
to transform the data).
The three replication methods detailed earlier use network resources efficiently (as
controlled by user-defined parameters) and provide centralized administration of all the
data publishers and subscribers in the enterprise through a graphical interface in SQL
Server Enterprise Manager. Replication can be customized easily (for each store if
necessary) to control the frequency of the data transfer as well as the type of replication
performed.
Replication does not require any custom application development because the ability to
ensure that the same data is not transferred more than once, as well as the ability to handle
22
network and hardware errors gracefully, is built into the solution. Detailed error handling
is also provided without any custom development to facilitate troubleshooting and remote
administration.
Cons
Replication does not integrate into other existing software distribution solutions (such as
SMS or Connect:Remote) and requires the use of SQL Server Enterprise Manager for
administrative purposes. This adds complexity to the overall deployment and support
effort because two solutions and tools have to be implemented to handle software
distribution and polling.
Replication provides limited flexibility in the selection of the data to be transferred
because only horizontal or vertical partitions of database tables can be selected for
replication. The ability to gather data from multiple tables into one rowset, which can then
be transferred to the destination, is not provided by replication. This could increase the
volume of data being transferred depending upon the database structure and the replication
requirements.
Replication is unproven as a polling solution in the retail enterprise environment and
hence provides no reference installations in terms of polling.
bcp Utility
An alternative solution for polling can be developed using the bcp utility provided in
SQL Server 7.0. This utility allows for the data in specified SQL Server tables to be
exported to operating system files as well as for the data in the specified files to be
imported into SQL Server tables. The bcp utility can be executed at the command prompt
and scheduled through a batch file.
In addition, the bcp API can be used to develop a custom application that performs the
necessary data transfers, and that can be executed by any standard scheduling system.
A data transfer solution built using the bcp utility can be integrated into the existing
software distribution solution (such as SMS or Connect:Remote). The custom application
will use the bcp API calls to export the data from the source database tables into
individual files, which can then be transported through the existing software distribution
solution to the target server, where another custom application will be executed to import
the data.
bcp does not provide much flexibility in data selection and forces the creation of an
individual file for each database table participating in the transfer. Furthermore, custom
logic will have to be included in the application to ensure that the same data is not
transferred more than once, and to handle network or hardware failures gracefully during
the transfer process.
Administration will be simplified because it will be integrated into the existing software
distribution solution, and will not require an additional monitoring tool. The custom
application should provide specific error handling capabilities to facilitate troubleshooting
and remote administration for the operations staff.
File compression techniques can be used to maximize the available bandwidth.
This solution involves additional software development and maintenance costs for the
custom applications, but provides the benefit of integrating into a proven and reliable
existing software distribution solution.
23
Data Transformation Services (DTS)
Data Transformation Services (DTS) is another utility provided in SQL Server, which can
be used to develop custom polling solutions that integrate into the existing software
distribution system. It provides the ability to create a package of tasks that can be
scheduled and executed to run in sequence.
DTS also provides an object model consisting of 32-bit COM objects that can be used to
develop custom applications in Microsoft Visual Basic®, Microsoft Visual J++®, or
Microsoft Visual C++®.
DTS provides greater flexibility in the selection of the data by allowing the transfer of the
results of a custom Transact-SQL query or stored procedure. In this manner, all of the
necessary data can be gathered into a single result set, which can then be exported and
transferred to the target server using the infrastructure of the existing software distribution
solution.
DTS uses a high-speed in-process COM server to transfer OLE DB rowsets directly from
the source to the destination database. It supports data transfer between heterogeneous
databases. Its extendable COM-based architecture facilitates complex data validations and
transformations as the data moves from the source to the destination.
The custom application should be designed to integrate into the existing software
distribution solution so that an additional monitoring tool is not required. The custom
application will also require additional logic to prevent the same data from being
transferred more than once, as well as to handle network and hardware failures gracefully.
Error handling should be included to specify the nature and location of the errors to
facilitate troubleshooting and remote administration.
This solution involves additional software development and maintenance costs for the
custom applications, but provides the benefit of integrating into a proven and reliable
existing software distribution solution.
Two-Phase Commit
The two-phase commit (2PC) involves the real-time update of both the source and
destination databases bound by a transaction, for example, either both databases are
updated or neither is updated.
This solution for polling provides no support for disconnected stores and will incur a
heavy performance penalty at the POS devices if the link to corporate is a slow WAN or
dial-up link.
It also prevents the POS device from processing transactions if the WAN or dial-up link is
down, which is usually unacceptable for a retail environment.
24
Troubleshooting
This topic discusses the issues involved in troubleshooting and supporting SQL Server 7.0
installations at the stores and the in-store systems in general. The ability to provide remote
troubleshooting and monitoring support is part of the priority for remote administration in an
in-store environment. This topic details issues such as support infrastructure, automated alerts,
and remote monitoring support in SQL Server 7.0.
Support Infrastructure
As mentioned in the deployment section, RAS can be used to provide the primary dial-in
support for the SQL Server database. A custom application that incorporates RAS is a
solid, low-cost solution for performing file transfers, database operations, and remote
program execution.
Remote control software installed on an in-store computer provides an additional level of
support capability to the corporate helpdesk staff. However, it does add to the per-store
cost for the organization.
In-store databases can be monitored from a central location using SQL Server Enterprise
Manager. This utility can also be used to monitor replication tasks, database maintenance
routines, and polling tasks.
Automated Alerts
SQL Server 7.0 provides alerts to monitor the messages appearing in the Windows NT
event log. The alerts can be triggered based on severity levels, specific error numbers, or
even error messages containing a specific word or phrase.
By default, severity levels 19 through 25 trigger an alert because they represent
SQL Server error messages. In addition, alerts must be set up for handling errors occurring
in any of the custom jobs created on the server. Using the RAISERROR WITH LOG
statement within the jobs can trigger custom alerts.
Alerts should be set up on the master server, which also receives the status of the jobs
from each target server. In this manner, jobs and alerts can be defined centrally and
executed in a distributed manner.
In response to alerts, administrators can page an operator, send an e-mail, forward the
event to another server, or even execute a specific program. Any combination of these
actions must be used to provide real-time notification to the administrators of the problems
at one or more stores.
SQL Server provides support for Simple Network Management Protocol (SNMP), which
provides the ability to monitor the status of SQL Server installations over multiple
platforms (such as UNIX). This also allows for integration with network management tools
that are SNMP-compliant. Besides alerts, you can also monitor performance information,
access databases, and view server and database configuration parameters on these
platforms.
25
Remote Monitoring
SQL Server provides utilities to monitor SQL statements, including stored procedures, and
current user and hardware activity.
SQL Server Profiler can be used to monitor all activity related to SQL Server, such as the
execution of stored procedures or batch statements.
Windows NT Performance Monitor can be used to monitor hardware and other system
activity on the server.
These utilities are generally used on an as-needed basis to troubleshoot specific problems
at a particular store.
SQL Server provides support for remote monitoring and troubleshooting, in addition to all the
remote administration capabilities. Automated alerts should be used to identify problems on a
real-time basis, and RAS can be used as the primary infrastructure for providing connectivity
to the store databases.
26
Licensing
Microsoft provides a cost-effective and flexible licensing model for SQL Server 7.0 in an in-
store environment. This includes the ability to license limited-use and full-use versions of the
product as well as the ability for Independent Software Vendors (ISVs) to embed SQL Server
as part of their solution. Furthermore, the Microsoft Data Engine (MSDE) technology, which
provides local data storage that is compatible with SQL Server 7.0, will be included with the
release of Office 2000 and will be licensed in a manner similar to Microsoft Jet. This allows all
development in Office 2000 to be SQL Server-compliant, and creates a seamless upgrade path
to the Enterprise version of SQL Server 7.0.
SQL Server 7.0 is currently available in three different editions: Standard, Enterprise, and
Small Business Server (SBS). In addition, users covered by a per-seat license from any of these
three editions can install a Desktop edition installation on their client computer.
The performance of the Small Business Server Edition is limited to the throughput typical
of 50 concurrent users, although individual SBS installations may be licensed for fewer
than 50 users. The performance of the Standard and Enterprise editions is limited only by
the hardware and operating systems on which they run.
The Desktop edition is a fully featured relational database management system (RDBMS),
targeted for personal and shared use, that runs on the Microsoft Windows 95 or Windows
98 operating systems, or on Windows NT Workstation. The Desktop edition shares the
same codebase as SQL Server 7.0 and is available on the SQL Server 7.0 and SQL Server
7.0, Enterprise Edition compact discs. The Desktop edition may be deployed and used on
each device to which a per-seat SQL Server Client Access License has been dedicated.
The Desktop edition scales effectively over two Symmetrical Multi-Processing (SMP)
CPUs and has optimizations to minimize memory usage. Furthermore, the Desktop edition
supports full merge and snapshot replication. It supports only subscriptions to transaction
replications. It cannot publish transaction replications.
In addition, the Desktop edition does not include the SQL Server OLAP Services software
or the English Query capabilities present in the Standard and Enterprise versions. Features
such as parallel queries, failover clustering, and extended memory addressing are not
supported in this edition.
The SBS edition runs on the Microsoft BackOffice Small Business Server and has a 10
GB size limitation on a per-database basis. It scales over four SMP CPUs and provides
full-text search capabilities. However, features such as OLAP Services, failover clustering,
and extended memory addressing are not supported in this edition.
The Standard edition includes SQL Server OLAP Services as well as English Query and
contains no performance or load controlling mechanism. It supports up to four SMP CPUs
and 2 GB of RAM. The Standard edition does not support failover clustering or extended
memory addressing.
27
The Enterprise edition includes all the features contained in the Standard edition and adds
failover clustering and support for up to 32 SMP CPUs and 3 GB of RAM. It runs only on
the Windows NT, Enterprise Edition operating system.
All editions of SQL Server 7.0 contain fixes for known year 2000 issues. For the most
current year 2000 compliance information on SQL Server 7.0, see
www.microsoft.com/year2000/.
For more information about these editions, see SQL Server Books Online. For the latest
licensing and pricing information, see www.microsoft.com/sql/.
28
Conclusion
Microsoft SQL Server 7.0 is a compelling choice for an in-store database platform primarily
due to its built-in capabilities for remote administration, scalable and reliable transaction
processing, polling solutions for different scenarios, and its market-leading price/performance
ratio. This document detailed the keys to a successful implementation of SQL Server 7.0 in an
in-store environment, which can be summarized as follows:
Infrastructure issues including choice of network architecture, protocols, and Windows NT
domain models will affect general network performance. Fast Ethernet running TCP/IP
only is the preferred network architecture. The workgroup, or no domain, model is a good
low-administration solution for an in-store environment.
Automated database expansion carries a penalty if overused, and should be avoided by
sizing the databases appropriately during installation. Initial sizing is also important for
system databases such as master, tempdb, and the transaction log files.
Avoid the use of complex triggers for multi-row updates because it degrades performance.
Exploit the power of stored procedures and the new index storage mechanism during
application development.
Analyze the size and complexity of the deployment requirements along with the costs
involved in the development and maintenance of custom software in selecting the proper
software distribution and management solution. For small organizations with minimal
software distribution needs, custom applications, which incorporate RAS for disconnected
stores, provide a flexible, low-cost solution. However, for most organizations, third-party
solutions such as SMS or Connect:Remote, along with some customized scripts, provide
the most scalable and reliable solution for software distribution and desktop management.
Remote control software should be used to provide monitoring capabilities for the in-store
devices.
Set up a dedicated master server at the corporate site and enlist each in-store server as a
target server. Jobs and alerts can then be defined and monitored centrally while being
executed on each target server.
Full, differential, and transaction log backups should be used to provide up-to-the-minute
recovery for the in-store database.
The Database Maintenance Plan Wizard should be used to schedule periodic integrity
checks and update the statistics used by the query optimizer.
In selecting the appropriate solution for polling, examine the software development and
maintenance costs involved, along with the need to integrate the administrative tools for
software distribution and polling.
SQL Server 7.0 provides three major types of replication for use as a solution to the
common in-store task of polling if integration of administration tools is considered a lower
priority than reducing software development and maintenance costs. Transactional
replication works best for transferring data on an ongoing basis between the in-store
database and the corporate database. In addition, occasionally connected stores can be
supported using merge replication.
If custom development is not a restricting factor, the bcp and DTS utilities in SQL Server
7.0 provide the ability to integrate the polling solution with a software distribution
29
solution. This reduces the administrative costs as well as the deployment and support
complexity.
The in-store infrastructure along with SQL Server utilities such as SQL Server Profiler,
Windows NT Performance Monitor, and SQL Server Enterprise Manager should be used
to provide full support for centralized helpdesk and troubleshooting operations.
30
Finding More Information
Microsoft SQL Server Books Online provides information about SQL Server architecture and
database tuning along with complete documentation on command syntax and administration.
SQL Server Books Online can be installed from the SQL Server installation media on any
SQL Server client or server installation. It is recommended that any computer with someone
actively working on SQL Server have SQL Server Books Online installed on the hard disk for
easy access.
For the latest information about Microsoft SQL Server, including other white papers on
SQL Server 7.0, see the Microsoft SQL Server Web site at http://www.microsoft.com/sql/.
For the most current year 2000 compliance information on SQL Server 7.0, see
www.microsoft.com/year2000/.
For detailed performance tuning information about SQL Server 7.0, see the Microsoft white
paper “Microsoft SQL Server 7.0 Performance Tuning Guide” at
http://msdn.microsoft.com/developer/sqlserver/sql7perftune.htm/.
Compaq has updated its RAID white paper, which provides information about database server
performance. The Microsoft SQL Server-specific information in this white paper pertains to
version 6.5 and is not applicable to SQL Server 7.0. The white paper, “Configuring Compaq
RAID Technology for Database Servers” is located at
http://www.compaq.com/support/techpubs/whitepapers/ecg0110598.html.
A white paper from Compaq’s Windows NT integration team, “Disk Subsystem Performance
and Scalability” is located at
http://www.compaq.com/support/techpubs/whitepapers/ecg0250997.html. It details hardware
performance characteristics of Compaq hard drives and physical drive behavior. The
information contained in this paper will be applicable to SCSI hard drives available from
Compaq or other vendors.
31
Get documents about "