SQL Agent Best Practices

Document Sample
SQL Agent Best Practices Powered By Docstoc
					SQL Agent Best Practices

      Gert E.R. Drapers
   SQL Server Development
   Customer Advisory Team
• SQL Agent Architecture Overview
• Best practices:
  ●   Security
  ●   Configuration
  ●   Jobs
• Black Belt SQL Agent:
  ●   Async Execution: Self Destructing Jobs
  ●   Dealing with Job State
  ●   Cluster support: Job Failover with Restart
  ●   None sysadmin job execution: Job Dispatcher
  ●   Monitoring: Detecting Runaway Jobs
SQL Agent Architecture                                                                 T-SQL      SQL-DMO
                                                                                       client      client
                                          Job cache                               sp_*                JobServer
                                         Shared Memory

 NT Service

            Base Threads                            SQLAGENT.OUT                         SQL Server
                                                Worker connection
             SQL Agent                                                                   msdb
            Worker Thread                                xp_sqlagent_notify
                                                Alert                                      post
             Sub Systems                        Engine               ERRORLOG



                                                                     NT EventLog

                                                                   Ad-hoc connection
SQL Agent Security Best Practices

• Use separate security contexts
• Remove security info that is not used
• Use Windows Authentication between
  SQL Agent and SQL Server
• Registry and File System ACL‟s
SQL Agent Security Best Practices
• Use machine accounts instead of domain
  ●   Reduces risk in case SQL Agent gets compromised
• Use three different accounts:
  ●   Machine\SQLAgentSvcAccount
       • Use low privileged account, unless you are using: Proxy
         and/or MSX account, or the SQL Server authentication;
         which require local administrator level privileges (to access
         the LSA)
  ●   Machine\SQLAgentProxyAccount
       • Use low privileged account
  ●   Machine\SQLAgentMSXAccount
       • Use low privileged account
SQL Agent Security Best Practices…

• Remove proxy account, when not used!
  ●   xp_sqlagent_proxy_account N‟DEL‟
       • Note: UI does not remove it!
  ●   Make sure that @sysadmin_only = 1
       • sp_get_sqlagent_properties @sysadmin_only = 1
       • Note: This also disable non-sysadmin usage of
• Remove MSX account, when not used!
  ●   xp_sqlagent_msx_account N‟DEL‟
  ●   Note: SQL Agent needs to be running
SQL Agent Security Best Practices…

• Don‟t use standard SQL Server
  authentication between SQL Agent and
  SQL Server
  ●   sp_set_sqlagent_properties
      @regular_connections = 0
• Use SQL EnterpriseManager to change
  service account
  ●   ACL‟s: Registry and File System
  ●   Note: Can use xp_SetSQLSecurity instead
Configuration Best Practices

• Handling MSDB
• Best configuration (Registry) settings
• Managing the worker thread pool
Configuration Best Practices

• Treat MSDB as a regular production database
  ●   Include MSDB in your recovery plans
  ●   Make frequent full database backups
       • Note: msdb always runs uses simple recovery model!
  ●   Script out all jobs and put then in a SCC system
• Do not use attach-detach between servers
  and/or instances to move around jobs.
  ●   Script jobs instead (SQL-EM or SQL-DMO)
Configuration Best Practices…
• MSDB Disaster Recovery
  ●   Option 1 - Restore Database
       • Restore backup, no additional work required
  ●   Option 2 - Recreate
       • Recreate MSDB using INSTALL\INSTMSDB.SQL, assures the
         correct version of SPs, codepage and sort order matching you
         server version.
       • Run backup scripts from SCC to recreate jobs
       • Note: Lost all job history
  ●   Option 3 - Attach or Restore from a different server
       • Make 100% sure to use compatible versions, codepage and sort
       • Note: Have to rename server references to the correct name!
Configuration Best Practices…

• Use verbose logging
  ●   exec msdb.dbo.sp_set_sqlagent_properties
      @errorlogging_level = 7
• Use ANSI SQLAgent.out file
  ●   exec msdb.dbo.sp_set_sqlagent_properties
      @oem_errorlog = 1
• Use NET SEND to machine for servicing up
  ●   exec msdb.dbo.sp_set_sqlagent_properties
      @error_recipient = „machine name‟
Configuration Best Practices…
• Configure your Worker Thread Pool
   ●   Windows 9x max. 8, cannot be changed
   ●   Windows NT default 0 (unlimited)
        • REG_DWORD MaxWorkerThreads = 0
        • If MaxWorkerThreads > 200 effectively use 0
   ●   Max. Threads per Subsystem
        • Defined in Registry
            – CmdExec = 10, Active/X Script = 10, TSQL = 20
   ●   Sub system worker threads exceeded
        • [251] Step %ld of job %s is being queued for the %s subsystem
   ●   Total worker threads exceeded
        • [398] The job (%s) cannot be run because the maximum number of
          jobs (%ld) are already running. Please try this job again later."
Job Best Practices

•   Scheduling
•   T-SQL
•   CmdExec
•   Active/X Script
•   Machine agnostic jobs
SQL Agent Jobs Basics

• Jobs contain 1 or N Job Steps
  ●   A Job Step is associated with a sub-system
       • CmdExec, T-SQL (ODBC), Active/X Script
  ●   In a multi-step job, steps are dependent based on
      "success" or "failure" of previous job step
  ●   Job steps properties: retry count, retry interval, output
      file (T-SQL and CmdExec)
• Only one instance of a job can be executing at
  all times
• Max job step size is NVARCHAR(3200)
Job Execution Model
Job flow
• Job Steps dependencies stored in:
   ●   sysjobsteps.on_success_action
        •   1 = Quit with success (default)
        •   2 = Quit with failure
        •   3 = Go to next step
        •   4 = Go to step on_success_step_id
   ●   sysjobsteps.on_success_step_id
   ●   sysjobsteps.on_fail_action
        •   1 = Quit with success
        •   2 = Quit with failure (default)
        •   3 = Go to next step
        •   4 = Go to step on_fail_step_id
   ●   sysjobsteps.on_fail_step_id
• Retries and Retry timeouts
   ●   sysjobsteps.retry_attempts
   ●   sysjobsteps.retry_interval
Job Schedules
• Jobs can refer to 0 to N schedules
   ●   Schedules can invoke a job:
        •   On Start of SQL Agent
        •   On CPU idle
        •   On Time X
        •   Recurring (daily, weekly, monthly etc.)
   ●   Keep it simply!
   ●   Use job steps to start serial jobs, do not try to time schedule them.
• Overlapping schedules
   ●   There can only be one instance of a given job at a certain time
   ●   The job “next time to run”, is only determined at startup and after each
       time the job is finished. Or if the schedule of a job is changed (signaled
       by sp_sqlagent_notify)
• Smallest schedule recurrence is 60 seconds
• Schedule can be disabled
Schedule Best Practices
• Don‟t use overlapping schedules
• Recurrence smaller < 60 seconds
  ●   Use sleep in job step with looping
  ●   Or use self creating and destructing jobs
• Schedules can get disabled!
  ●   [191] Warning [%ld]: Possible date calculation spin for
      Schedule %ld
  ●   [192] Date calculation spin detected for Schedule %ld
  ●   List jobs with disabled schedules:
       • select name, job_id from msdb.dbo.sysjobs
         where job_id in (select job_id from
         msdb.dbo.sysjobschedules where enabled = 0)
T-SQL Job Steps
• Use ODBC connection
• LoginTimeout default = 30s
• Current database set by Job Step
   ●   sysjobsteps.database_name
• RaisError fails the job step
• Default SET options
  SET LANGUAGE us_english **
  (** language specific, based on server)
T-SQL Best Practices

• Wrap the payload inside stored procedures if
  ●   Allows larger scripts (> 3200 char)
  ●   Allows for better, easier and more controlled error
      handling of the T-SQL and the success and failure of
      the job step
• Do not change job / job steps /step
  dependencies at runtime
• Prevent dynamic job building
  ●   Job steps that add job steps to the job at runtime
• T-SQL in job steps does require error handling!
CmdExec Job Steps

• Executes using cmd.exe /c <command>
  ●   Win9x /c <command>
• sysjobsteps.cmdexec_success_code
  ●   Determines success or failure errorlevel
• GetExitCodeProcess() used for step exit code
• Force success or failure
  ●   Add Exit 0 or Exit 1 (assume default
      cmdexec_success_code = 0) after last command
CmdExec Best Practices
• Do NOT (as in NEVER) launch Windows applications,
  only command line tools.
   ●   Use EXETYPE to determine
   ●   Will cause a hang of your job and cause resource problems for
       the service
• Know your command line tools return codes
   ●   Create a batch file .BAT or .CMD which executes the
       commandline and prints the return code using @echo errorlevel,
       0 (zero) is default for success
• Test security context
   ●   When accessing network resources like file shares, pipes etc.
   ●   Know your environment variables
        • Schedule a SET command
Active/X Script Job Steps

• Return success or failure by setting
  ●   Main = 0 or Main = 1
• Script host executes on own thread
• Global object SQLActiveScriptHost
  ●   Writing to the job history
  ●    SQLActiveScriptHost.Print "Hello World"
• Supported script engines
  ●   xp_enum_activescriptengines
  ●   Tested with VB Script, ECMAScript (JScript) and Perl
      Script 5.6
Active/X Script Practices

• Always destroy created objects
  ●   Set obj = Nothing
• Turn off JIT debugging on production machines
  ●   HKCU\Software\Microsoft\Windows
      REG_DWORD = 0
  ●   If running as LocalSystem update the Default User
• Do not invoke UI components
  ●   SQL Agent does not have a Desktop attached
Job Step Tokens
• Tokens allow for making job steps independent
  of a server and/or job specific information
• Works like a pre-processor
  ●   Tokens are replace at runtime, before the job step is
      executed by the associated sub-system.
  ●   Tokens are case-sensitive
• Common problems/mistakes
  ●   Value based replacement, so you still need to add
      quotes for example or convert data types
  ●   Tokens can cause naming conflicts after the search
      and replace
• To troubleshoot, use print statements
Job Step Tokens…
[A-DBN]    Alert database name    [SQLDIR]   SQLServer root
[A-SVR]    Alert server name                 directory (eg. c:\mssql7)
[A-ERR]    Alert error number     [STRTDT]   Job start time
[A-SEV]    Alert error severity   [STRTTM]   Job start date
[A-MSG]    Alert message text     [LOGIN]    SQL login ID (if
                                             connecting non-trusted)
[JOBID]    Job id as GUID         [PSWD]     SQL password (if
                                             connecting non-trusted)
[STEPID]   Step id as tinyint     [OSCMD]    Command interpreter
[DATE]     Current date                      prefix (ie.
[TIME]     Current time                      ' /c' or
[MACH]     Machine name                      'cmd.exe /c')
           (computer name)        [SRVR]     Server name (include
[MSSA]     Master                            instances name if
           SQLServerAgent name               server is an instance)
[STEPCT]   Number of times this   [INST]     Instance name, returns
           step has executed                 empty string if default
           (excluding retries)               instance
  Tokens Example (Active/X Script)
Set SS = CreateObject("SQLDMO.SQLServer")
Set BB = CreateObject("SQLDMO.Backup")
Set DB = CreateObject("SQLDMO.Database")
SS.LoginSecure = True
SS.Connect "[SRVR]"
For Each DB In SS.databases
If DB.Name <> "tempdb" and DB.Name <> "model" _
       And DB.Status <> 992 Then
       BB.Database = DB.Name
       BB.Files = "[SQLDIR]\BACKUP\" + BB.Database + _
       BB.SQLBackup SS
End If


set DB = Nothing
set BB = Nothing
set SS = Nothing
Tokens Examples

  ●   Retrieve the name of the current executing
       • select name from msdb.dbo.sysjobs where
         job_id = convert(uniqueidentifier, [JOBID])
• CmdExec
  ●   Step 1
  ●   Step 2
Best of the best practices

• Async Execution: Self Destructing Jobs
• Dealing with Job State
• Cluster support: Job Failover with Restart
• None sysadmin job execution: Job
• Monitoring: Detecting Runaway Jobs
Async Execution:
Self Destructing Jobs
• Goal: Allow SQL statements to be executed
• sp_async_execute @stmt nvarchar(4000)
  ●   Add record to sysjobqueue table
• sp_async_executor
  ●   Stored Procedure that runs inside recurring loop,
      reading sysjobqueue and creates self destructing jobs
      for each entry
  ●   Removes entry from sysjobqueue when job is created
      and started
  ●   Optionally job does not destruct on failure to help
Dealing with Job State
• Create state table with job_id as key
  ●   Store: Key with Name – Value pairs
  ●   Note: System guarantees that there is only one
      instance of job X at all times!
  ●   Use Global temp table ##sysjobstate or permanent
      table in msdb
       • Note: global temp table needs to be created in autostart proc
         to guarantee lifetime
       • Note: everybody has access to global temp table, security
• Use helper procedures to:
  ●   Clean out state at beginning of Job
  ●   Get & Set state inside Job Step
Cluster support:
Job Failover with Restart
• Use job state table (see Dealing with Job State)
  ●   Add startup job step which stores: started state, start
      time and node name to state table
       • Using tokes:
          – [STRTDT] Job start time
          – [STRTTM] Job start date
          – [SRVR]   Server name
  ●   Add cleanup job step which stores: finished state, end
      time and node name to state table
• Add autostart job, which reads from state table,
  to detect jobs in started state and restarts them
• Note: Can also use a permanent table with
  restart logic information
None sysadmin Jobs Execution:
Job Dispatcher
• Allows non-sysadmin role members to run jobs
• sysjobinvocationrights
   ●   mapping between SID‟s and job_id‟s
• sysjobinvocations
   ●   records which jobs are invoked by who
• sp_invoke_job
   ●   checks if user is allowed to invoke, if true adds record to
• sp_job_invoker
   ●   Stored procedure run inside a recurring job, which does the
       actual sp_job_start on behalf of the invoker
• sp_job_trigger (optional)
   ●   Stored procedure that “triggers” sp_job_invoker using
Detecting Runaway Jobs
• Compare against average execution time
  recorded in sysjobhistory
  ●   Can be done for jobs and job steps
• For scheduled jobs use:
  ●   xp_sqlagent_enum_jobs
       • If (Running == 1)
          – [Next Run Date] and [Next Run] represents the time the
            job was started
• For non scheduled jobs use:
  ●   select

• SQL Agent Best Practices:
  ●   Security
  ●   Configuration
  ●   Jobs
• SQL Agent Enhanced Usage:
  ●   Enable async T-SQL execution
  ●   How to deal with Job State
  ●   How to restart jobs on failover
  ●   Allowing none sysadmin job execution
  ●   How to detect runaway jobs
      ? Questions ?

 or e-mail GertD@SQLDev.Net

Samples and slides are posted on

Shared By: