SharePoint 2010 Disaster Recovery

This document covers backup and recovery scenarios in a typical SharePoint 2010 environment.




Document Information

Status                    Release
Release Date              27.05.2011
Version                   1.0
Filename                  TVD SharePoint 2010 Disaster Recovery.docx
Category                  Trivadis Best Practices
Product                   SharePoint 2010
Author                    Stephan.Hurni@trivadis.com;Stephan.Jola@trivadis.com
Customer                  Trivadis AG
Document Versioning

Version               Date        Author                        Comments
1.0                   27.5.2011   Stephan Hurni, Stephan Jola
Contents
Conventions and Features used in this document
Text conventions
Responsible
Referenced Documents
1         Introduction
2         Key concepts and terms
2.1       Disaster Recovery (DR)
2.2       Recovery time objective (RTO)
2.3       Recovery point objective (RPO)
2.4       Architectural overview
2.5       Disaster Recovery Manager
3         Backup objective
3.1       Resources
3.1.1     Documentation
3.2       Backup Targets
3.2.1     Database Backup and Maintenance
3.2.2     SharePoint Databases
3.2.3     Service Applications
4         Restore objective
4.1       Restore Targets general
4.2       Priority List
4.2.1     Windows Servers
4.2.2     SQL Server databases
4.2.3     SharePoint Servers
4.3       Restore Targets SharePoint
4.3.1     SharePoint configuration databases
4.3.2     SharePoint content databases
4.3.3     SharePoint Configuration
4.3.4     Service Applications
5         Testing objective
5.1       System Tests
5.2       Recovery complete
Appendix

Conventions and Features used in this document
This document uses special text and design conventions to make it easier for you to find the
information you need.

Text conventions



 Convention                        Feature

 Italicized type                   Indicates a reference to an object in this document (figure, chapter)
 Italicized type with underline    Indicates a reference to an external source (document)
 “Quotation marks”                 Indicate references to companies, organizations, clubs and institutions




Responsible

 Name                     Job Title/ Company                                   Responsibilities

 Stephan Hurni            Principal Consultant, Trivadis                       SQL Server

 Stephan Jola             Consultant, Trivadis                                 SharePoint




Referenced Documents

 ID       Document Title                                                     Owner                Status




1        Introduction
When a SharePoint 2010 farm is installed, configured and operated, the IT operations team is
typically responsible for the SharePoint servers deployed in the corporate IT infrastructure.
But the mission is much more than monitoring and optimizing the platform. What happens if a
disaster strikes a SharePoint server or, even worse, the entire farm? Do you have a rock-solid
disaster recovery plan? Are the servers, services, applications and configurations well
documented, and is the documentation up to date?
 This document covers the essential points of a solid backup and recovery plan for an
unexpected failure or data corruption that requires the intervention of the responsible IT
administrators.
  If you follow the recommendations in this guide, you are prepared. But this also requires
serious testing in your own environment! If you are uncertain about any of these
recommendations, do not hesitate to contact the authors.








2            Key concepts and terms

2.1          Disaster Recovery (DR)

When we speak of disaster recovery, we mean that the SharePoint farm is in a failure state
and cannot be brought back online within the expected amount of time. As part of the business
continuity specifications, the business owners must define in the DR plan how much unplanned
downtime the organization tolerates before it experiences significant negative business
impact.

2.2          Recovery time objective (RTO)

The RTO defines the maximum acceptable time to bring the system or data back to operational
status after a data corruption or disaster. Within the RTO, all required restore or recovery
steps, including the corresponding actions, are performed by the responsible persons.

2.3          Recovery point objective (RPO)

If a disaster happens (data corruption, including unintentional data deletion or manipulation),
what is the maximum acceptable amount of data loss? The RPO defines the maximum acceptable
time between the last data backup and the disaster.
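
[Figure: timeline marking Last Backup, the disaster (KABOOM!!!), Identified and declared
disaster, and Full Recovery done, back online. RPO is marked on the span before the disaster,
RTO on the span after it up to full recovery.]

 Figure 1 RPO and RTO for a SharePoint farm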

Implementing disaster recovery as illustrated in Figure 1 requires that the RTO and RPO are
specified as part of the SharePoint business continuity plan (BCP).
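
As a minimal sketch of how the two objectives from Figure 1 can be checked after a recovery
test, the following Python snippet derives the data loss window and the downtime from three
timestamps and compares them with the agreed objectives. The function names, the timestamps
and the objectives of 15 minutes and 4 hours are illustrative assumptions, not values
prescribed by this document.

```python
from datetime import datetime, timedelta


def measure_recovery(last_backup, disaster, back_online):
    """Return (data loss window, downtime) for one incident."""
    if not (last_backup <= disaster <= back_online):
        raise ValueError("timestamps must be in chronological order")
    return disaster - last_backup, back_online - disaster


def meets_objectives(data_loss, downtime, rpo, rto):
    """True if the measured values stay within the agreed RPO and RTO."""
    return data_loss <= rpo and downtime <= rto


# Hypothetical incident, measured during a restore test.
last_backup = datetime(2011, 5, 27, 9, 45)
disaster = datetime(2011, 5, 27, 9, 55)
back_online = datetime(2011, 5, 27, 13, 25)

data_loss, downtime = measure_recovery(last_backup, disaster, back_online)
within_plan = meets_objectives(
    data_loss,
    downtime,
    rpo=timedelta(minutes=15),  # assumed objective
    rto=timedelta(hours=4),     # assumed objective
)
```

Recording these two values for every restore test gives the business owners measured numbers
to compare with the objectives in the BCP.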







2.4       Architectural overview

Professional server environments are deployed in different stages, typically Development,
Integration/Testing and Production. For SharePoint environments, these stages are well suited
to regular restore operations from production data to the Integration/Testing stage. This
helps to identify how much time the individual recovery steps take and gives a better
understanding of the achievable RTO in a disaster recovery case. A further benefit is that
people are trained and have the knowledge required to successfully restore the SharePoint farm
after a system incident.
  If a disaster strikes the SharePoint environment, it is crucial to identify which parts of
the whole system are affected. This paper assumes that all tiers of the SharePoint system are
affected. If only parts of the system are in a nonfunctional state, an exact analysis is
essential to bring the system back to a flawless state. The different layers are presentation,
application, service and data.

2.5       Disaster Recovery Manager

When disaster recovery is initiated, we recommend identifying a person who takes over the
Recovery Manager role during the recovery process. If the SharePoint farm is damaged or
offline, management puts high pressure on the team to bring the farm back online as soon as
possible. Things can go wrong, especially when the process is not professionally coordinated.
  The Disaster Recovery Manager must be defined before a disaster strikes. He is the owner of
the process. Depending on the farm size and the number of parties involved in successfully
recovering the farm, we do not recommend assigning this role to a technical person.


He is responsible for the following tasks:

     Regularly checks that the technically responsible persons in the company are available
      (they may be on holiday or may have left the company).

     Checks that the system documentation is up to date.

     Specifies the communication matrix (to whom, how and at what interval).

     Coordinates the resources in the recovery process.

     Collects the issues and monitors the ongoing steps to estimate the progress.

     Acts as the single point of contact and protects the technical persons from disruptions
      of their tasks.







3             Backup objective

3.1           Resources

It is in the nature of a failure event that things can go “head over heels” if they are not
properly planned and defined. Therefore, all resources have to be named and periodically
confirmed. These are essentially:

     System documentation (logical and physical), including product versions

     Software and product keys

     Certificates and passphrases to build the farm connectivity

     Decision makers: Domain Administrators, Server Administrators, Backup Administrators,
      Database Administrators, SharePoint Farm Administrators, Site Collection owners for the
      business-critical sites, Firewall and Network Administrators

     Well-documented test scenarios

3.1.1         Documentation

Logical Architecture
Ensure that the logical architecture of the farm is well documented and up to date. The logical
architecture model of a system describes the logical components of the system, the role of
each component and how they interact with each other. At least the following architectural
aspects should be included:

     IIS application pools

     SharePoint web applications

     SharePoint service applications

     Zones and associated alternate access mappings

     Web application policies

     Content databases

     Site Collections (incl. host-named site collections)

     Sites

     My Sites







Physical Deployment
Like the logical architecture documentation, the documentation of the physical deployment is
another important part of the game. This documentation primarily focuses on the hardware
used in the farm. The most commonly documented elements are:

   Physical servers that both SharePoint and SQL Server use

   Storage equipment

   Networking equipment

   Firewalls that sit between SharePoint servers

   Hardware load balancers or similar specialty equipment

   Active Directory domain controllers







3.2        Backup Targets

To be able to restore a SharePoint system effectively, reliable procedures to back up all
items in the system are essential!

3.2.1      Database Backup and Maintenance
To be able to restore SQL Server databases to a point in time (this includes all SharePoint
databases as well as all SQL Server system databases), database backups and transaction log
backups are essential. A properly deployed maintenance plan contains all the steps and tasks
necessary to achieve secure database operation for all databases in the specific instance.
 Trivadis has developed maintenance scripts to achieve trouble-free and automated SQL
Server maintenance containing all necessary tasks.

Table 1 SQL Server Job List
Task/Job                      Description                                          Schedule and interval

Full backup                   Full database backup                                 Daily

Transaction log backup        Transaction log backup                               Every 5-15 min

Index maintenance             Reorganize or rebuild indexes based on different     Daily
                              parameters

Statistics update             Update existing statistics (auto create statistics   Daily
                              should be off for SharePoint databases)

Clean up                      Clean up history logs and old backup files           Daily



To fulfill the point-in-time restore requirements, the complete transaction log backup chain of
every database must be available. If any transaction log (TX log) backup file is missing, the
chain is broken and a restore is only possible up to the last TX log backup before the gap.
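
The effect of a gap in the chain can be illustrated with a small Python sketch. It assumes a
simplified model in which every TX log backup carries a first and a last log sequence number
(LSN), and two backups are contiguous when the first LSN of one equals the last LSN of its
predecessor. The class, the function, the file names and the integer LSN values are
hypothetical; the first backup in the list is assumed to connect to the restored full backup.

```python
from typing import List, NamedTuple, Sequence


class LogBackup(NamedTuple):
    name: str
    first_lsn: int  # simplified integer LSN (assumption)
    last_lsn: int


def restorable_chain(backups: Sequence[LogBackup]) -> List[LogBackup]:
    """Return the leading run of log backups that has no gap."""
    ordered = sorted(backups, key=lambda b: b.first_lsn)
    usable: List[LogBackup] = []
    for backup in ordered:
        if usable and backup.first_lsn != usable[-1].last_lsn:
            break  # chain is broken: later backups cannot be applied
        usable.append(backup)
    return usable


# Hypothetical example: the log backup covering LSN 300 to 400 is missing.
available = [
    LogBackup("log_0905.trn", 100, 200),
    LogBackup("log_0910.trn", 200, 300),
    LogBackup("log_0920.trn", 400, 500),
]
usable = restorable_chain(available)
last_restorable = usable[-1].name
```

In this example the restore can only proceed up to the second file, even though a newer log
backup exists.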




 Figure 2 Transaction log chain (a subsequent differential or full backup may be missing, but no TX log backup)







3.2.2        SharePoint Databases
SharePoint uses several SQL Server databases, containing configuration data and user content
from SharePoint web applications, site collections, sites, libraries and lists. The backup
approach differs between configuration databases, service application databases and content
databases.

Configuration Databases
Each SharePoint farm has one configuration database and one Central Administration content
database. Back up these databases regularly. This ensures that the configuration settings can
be retrieved in case of a disaster.
 Create configuration database backups within SharePoint Central Administration or, better,
with PowerShell scripts. PowerShell scripts can be integrated into the SQL Server backup jobs
mentioned above. Doing so ensures that the backed-up items reflect the same point in time.
 But never perform a SQL Server based restore of the configuration databases. This will
damage your SharePoint farm for sure!

Content Databases
SharePoint content databases are backed up with regular SQL Server database backups as
described before. From a SQL Server point of view, these databases are user databases like any
other user database.
  Therefore, the backup plan and duration depend heavily on the SQL Server backup and restore
environment. Typically, a well-architected SharePoint farm has content databases with a size
limit of up to 500 GB. This ensures a well-performing backup/restore procedure. A larger
database inevitably requires longer backup/restore time windows.
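
The relationship between database size and the backup/restore window can be approximated
with simple arithmetic. The Python sketch below divides the database size by a sustained
throughput; the function name and the throughput of 100 MB/s are assumptions chosen for
illustration and must be replaced with values measured in your own environment.

```python
from datetime import timedelta


def estimate_duration(size_gb: float, throughput_mb_per_s: float) -> timedelta:
    """Rough backup or restore time for one database at a sustained throughput."""
    if size_gb < 0 or throughput_mb_per_s <= 0:
        raise ValueError("size must be >= 0 and throughput must be > 0")
    seconds = size_gb * 1024 / throughput_mb_per_s
    return timedelta(seconds=seconds)


# Assumed throughput of 100 MB/s (hypothetical, measure your own value).
window_500_gb = estimate_duration(500, 100)
window_1000_gb = estimate_duration(1000, 100)
```

Because the estimate is linear, doubling the content database size doubles the window, which
is why the size limit has a direct effect on the achievable RTO.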

Service Application Databases
Some, but not all, of the service applications depend on their own databases. Because service
applications do not change dynamically, we recommend backing up the service application
databases within the regular SQL Server backup job, and using PowerShell scripts for service
applications that do not have a database in place.
Important note: Some of the service applications require passwords. The Secure Store Service
application, for example, requires a password during configuration. This password must be
available to recover the service application.

Table 2 Recommended SharePoint database backup schedule from the SQL Server perspective
Database                                   Backup Type            Schedule and interval       Tools

SharePoint configuration                   Full                   Daily                       SQL / PowerShell

SharePoint Central Administration          Full                   Daily                       SQL / PowerShell
Content

SharePoint Service Application Databases   Full                   Daily                       SQL / PowerShell

SharePoint Content Databases               Full                   Daily                       SQL

For every database                         Transaction log        Every 5 to 15 min (1)       SQL

(1) Must reflect the required RPO
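
Whether a backup schedule reflects the required RPO can be checked mechanically. The Python
sketch below assumes that the worst-case data loss of a database equals the interval of its
most frequent backup type, that is, that nothing written after the last backup can be
recovered. The database names, the intervals and the RPO of 15 minutes are hypothetical.

```python
from datetime import timedelta
from typing import Dict, List

Schedule = Dict[str, Dict[str, timedelta]]


def worst_case_loss(intervals: Dict[str, timedelta]) -> timedelta:
    """Worst-case data loss: the interval of the most frequent backup type."""
    return min(intervals.values())


def rpo_violations(schedule: Schedule, rpo: timedelta) -> List[str]:
    """Names of databases whose schedule cannot meet the RPO."""
    return sorted(
        name
        for name, intervals in schedule.items()
        if worst_case_loss(intervals) > rpo
    )


# Hypothetical schedule for two content databases.
schedule = {
    "Content_Intranet": {"full": timedelta(days=1), "log": timedelta(minutes=15)},
    "Content_Archive": {"full": timedelta(days=1), "log": timedelta(minutes=30)},
}
violations = rpo_violations(schedule, rpo=timedelta(minutes=15))
```

Here the second database is reported, because a 30 minute transaction log interval can lose
more data than a 15 minute RPO allows.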






3.2.3         Service Applications

User Profile Service Application
The User Profile Service Application is one of the trickiest service applications in
SharePoint. One of the most important steps is to document the initial configuration steps
used to get the service application successfully up and running.
  Next, the backup of this service application differs from that of the other service
applications. We recommend using a PowerShell script to back up the two required components
of the User Profile Service Application:

       Service Application Name

       Service Application Proxy Name

The associated databases of the service application can also be backed up with SQL Server
Management Studio. This requires additional steps to ensure that the backup is restore
aware (1).

Search Service Application
The Search Service Application can be backed up with either PowerShell or SharePoint Central
Administration.
  SharePoint Server 2010 starts a SQL Server backup of the Search administration database,
crawl databases, and property databases, and also backs up the index partition files in parallel.

Secure Store Service
When the Secure Store Service is configured for the first time, a passphrase is entered. It is
important to keep the passphrase in a secure location.
 Every time you change or manipulate the Secure Store Service, the Secure Store Service
application database is automatically re-encrypted. Therefore, back up the Secure Store
Service application after such changes so that the master key and the database in the backup
stay synchronized.




(1) http://technet.microsoft.com/en-us/library/gg576965.aspx





4         Restore objective

4.1       Restore Targets general

The restore of the backed-up targets is listed below in chronological order. Tasks can be
parallelized where appropriate.

4.2       Priority List

Each SharePoint web application (WebApp) or site collection has different business continuity
requirements. For the disaster task force, it is important to have a list of WebApps and site
collections in which the priority of each item is specified. This ensures the right, or rather
the optimized, order of the recovery steps: business-critical items are brought back online
before non-critical items.
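
Such a priority list can be kept in a form that is easy to sort. The following Python sketch
orders recovery items by priority, using the URL as a tie-breaker so that the result is
stable. The record layout, the URLs and the priority scale (1 is most critical) are
hypothetical.

```python
from typing import Iterable, List, NamedTuple


class RecoveryItem(NamedTuple):
    url: str
    kind: str      # "WebApp" or "Site Collection"
    priority: int  # 1 = most business critical (assumed scale)


def recovery_order(items: Iterable[RecoveryItem]) -> List[RecoveryItem]:
    """Sort items so that the most critical ones are recovered first."""
    return sorted(items, key=lambda item: (item.priority, item.url))


# Hypothetical priority list maintained by the business owners.
items = [
    RecoveryItem("http://teams.example.com", "WebApp", 3),
    RecoveryItem("http://intranet.example.com/sites/finance", "Site Collection", 1),
    RecoveryItem("http://intranet.example.com", "WebApp", 2),
]
ordered_urls = [item.url for item in recovery_order(items)]
```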

4.2.1     Windows Servers
To restore the different server roles, every server in the SharePoint system should be built
or deployed as a new, clean setup. Using imaging or bare-metal solutions can lead to
inconsistent situations and server states. Therefore, we recommend using proper server
deployment scenarios, adding the needed roles and features and installing the desired software
on top. Then the specific configuration of the services and applications can be applied based
on the requirements and documentation.

4.2.2     SQL Server databases
After the Windows server that holds the database role is back online and joined to the Active
Directory domain, the SQL Server instance has to be installed. Set up and configure the SQL
Server instance with exactly the same service pack and cumulative update level as the former
server.
 This is followed by restoring the SQL Server system databases to bring all system objects,
principals and permissions back to the state of the selected point in time.

4.2.3     SharePoint Servers
Each SharePoint server, independent of its SharePoint role, requires a complete new
installation of the SharePoint binaries in case of a catastrophic failure of the SharePoint
farm. Follow these rules:

     Keep the SharePoint source binaries in a central location (and not on the SharePoint server
      itself).

     Keep the used product key in a central location.

     Ensure the identical product level is installed.

     Keep the Installed SharePoint Solutions in a central location.







4.3       Restore Targets SharePoint

The restore of the backed-up SharePoint targets is listed below in chronological order.

4.3.1     SharePoint configuration databases
Lost or undocumented settings of the SharePoint farm can be retrieved from a restored copy of
the configuration database using SQL Server Management Studio.
 Never restore a SharePoint configuration database into the farm with SQL Server restore
operations! This will fail and damage the farm with unknown side effects!
 A SharePoint farm configuration must be recovered from the native SharePoint farm backup,
either in Central Administration or in the SharePoint Management Shell.

4.3.2     SharePoint content databases
If a content database is corrupted or otherwise damaged, the database can be restored. There
are situations where only a part of the content is affected. In that case, it is not
acceptable to restore the whole database instead of only the affected part.
Simply speaking, we distinguish two types of restoring SharePoint content:

     Catastrophic failure. The content database is damaged to the point of being unusable.
      This requires a full SQL Server restore of the content database.

     Data failure. The content database is operational, but the content has been modified in
      such a way that a part of it is useless or manipulated. In this case, the content
      database is restored to SQL Server to the specified recovery point (point-in-time
      recovery) under a different database name. Next, the SharePoint farm administrator
      restores the content from this database with the unattached database recovery feature
      in SharePoint. When finished, the restored content database can be removed from SQL
      Server.

4.3.3     SharePoint Configuration
The SharePoint farm configuration is restored only within Central Administration or with
PowerShell. Do not recover the SharePoint configuration database with native SQL Server
restore jobs.

4.3.4     Service Applications
Restore the Service Applications only with PowerShell or in the Central Administration of
SharePoint.
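
The chronological order of this chapter, together with the remark that tasks can be
parallelized where appropriate, can be modelled as a dependency graph. The Python sketch
below uses graphlib.TopologicalSorter from the standard library (Python 3.9 or later) to
group restore tasks into waves whose members can run in parallel. The dependency map is an
assumption derived from the order of the sections in this chapter, not a definitive rule for
every farm.

```python
from graphlib import TopologicalSorter
from typing import Dict, List, Set


def restore_waves(depends_on: Dict[str, Set[str]]) -> List[List[str]]:
    """Group tasks into waves; all tasks of a wave can run in parallel."""
    sorter = TopologicalSorter(depends_on)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    waves: List[List[str]] = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted for a stable order
        waves.append(ready)
        sorter.done(*ready)
    return waves


# Assumed dependencies: task -> tasks that must be finished first.
depends_on = {
    "Windows servers": set(),
    "SQL Server instance": {"Windows servers"},
    "SharePoint binaries": {"Windows servers"},
    "SQL Server system databases": {"SQL Server instance"},
    "Farm configuration": {"SharePoint binaries", "SQL Server system databases"},
    "Content databases": {"Farm configuration"},
    "Service applications": {"Farm configuration"},
}
waves = restore_waves(depends_on)
```

With this map, the SQL Server instance and the SharePoint binaries end up in the same wave,
so both installations can proceed at the same time once the Windows servers are available.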








5        Testing objective

5.1      System Tests

Before opening the SharePoint farm to end users, ensure that every required task is finished.
This is important to prevent server reboots after the green light has been given to the end
users.
Typically, the system test is divided into several tasks with different responsible persons.
The IT administrator ensures that all systems are set up with the correct operating system.
The SharePoint administrator is responsible for finishing all required tasks, including checks
of the SharePoint log files.
Each WebApp or site collection owner verifies that everything works and approves the
functionality of their site collection.

5.2      Recovery complete

The SharePoint Disaster Recovery Manager collects the responses of the disaster recovery
team. He is responsible for coordinating the tasks during the recovery phase and intervenes
when problems arise. Additionally, he documents the issues identified during the recovery
process, delegates the resources and communicates with the business.
He is responsible for declaring the Recovery complete state.








Appendix
Plan for disaster recovery (SharePoint Server 2010):
http://technet.microsoft.com/en-us/library/ff628971.aspx
Backup a service application:
http://technet.microsoft.com/en-us/library/ee428318.aspx
Restore a service application (SharePoint Server 2010):
http://technet.microsoft.com/en-us/library/ee428305.aspx



