Disaster recovery

WHAT IS DISASTER?

DISASTER RECOVERY PLAN:

TOOLS FOR RECOVERY:
    BACKUP
    BACKUP MODES
    OFFLINE
    ONLINE
OFFLINE VS ONLINE
    OFFLINE
    ONLINE
    OFFLINE BACKUP USING OPERATING SYSTEM BACKUP UTILITY
    DATA TO BACKUP
    SAP BACKUP UTILITIES
    BACKUP METHODS AND BACKUP FREQUENCY
    BACKING UP COMPLETE DATABASE
    BACKING UP A TABLESPACE
    BACKING UP A CONTROL FILE
    BACKING UP THE ARCHIVE FILES
BACKING UP TO DISK, THEN TO TAPE
    EXPORT AND IMPORT
ERD’S:

RAID

RECOVERY MANAGER (RMAN)
    WHY USE RMAN?
    RMAN MEDIA RECOVERY: BASIC STEPS:
DISASTERS AND RECOVERIES:

OPERATING SYSTEM FAILURE:

DATABASE FAILURE:

STANDBY DATABASE:

HARDWARE FAILURE

What is Disaster?
If you lose your entire system (possibly including hardware) and have not taken any
special security precautions for such an occurrence, you have to recover the
system as far as possible, step by step. A disaster can be anything that keeps you
from accessing your data: theft, a flood, an earthquake, a virus, and so on.

Disaster recovery plan:
When creating a disaster recovery plan (DRP), the most important part of the plan lies in
identifying what “disaster” means to you and your organization. Obviously, permanently
losing all of your organization's data would be a disaster, but what else would? How
about your installation becoming inaccessible for a week or longer? When planning for
disaster, think about all the conditions that could render your data or your workplace
unreachable, and plan accordingly.
A disaster recovery plan, sometimes referred to as a business continuity plan (BCP) or
business process contingency plan (BPCP), describes how an organization is to deal
with potential disasters. Just as a disaster is an event that makes the continuation of
normal functions impossible, a disaster recovery plan consists of the precautions taken
so that the effects of a disaster will be minimized, and the organization will be able to
either maintain or quickly resume mission-critical functions. Typically, disaster recovery
planning involves an analysis of business processes and continuity needs; it may also
include a significant focus on disaster prevention.
Disaster recovery is becoming an increasingly important aspect of enterprise computing.
As devices, systems, and networks become ever more complex, there are simply more
things that can go wrong. As a consequence, recovery plans have also become more
complex. Current enterprise systems tend to be too complicated for simple,
hands-on approaches, and interruption of service or loss of data can have
serious financial impact, whether directly or through loss of customer confidence.

Appropriate plans vary a great deal from one enterprise to another, depending on variables
such as the type of business, the processes involved, and the level of security needed.
Disaster recovery planning may be developed within an organization or purchased as a
software application or a service. It is not unusual for an enterprise to spend 25% of its
information technology budget on disaster recovery.

Risk Analysis

The first step in drafting a disaster recovery plan is conducting a thorough risk analysis
of your R/3 systems. List all the possible risks that threaten system uptime and evaluate
how imminent they are in your organization. Anything that can cause a system outage is
a threat, from relatively common man-made threats like virus attacks and accidental data
deletions to rarer natural threats like floods and fires. Determine which of your
threats are the most likely to occur and prioritize them using a simple system: rank each
threat in two important categories, probability and impact. In each category, rate the
risks as low, medium, or high.
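
The ranking described above can be sketched in a few lines of Python. This is only an illustration: the numeric weights for low/medium/high, the choice to multiply the two ratings, and the example ratings are all assumptions, not part of the original procedure.

```python
from dataclasses import dataclass

# Assumed numeric weights for the low/medium/high ratings.
LEVELS = {"low": 1, "medium": 2, "high": 3}


@dataclass
class Threat:
    name: str
    probability: str  # "low", "medium", or "high"
    impact: str       # "low", "medium", or "high"

    @property
    def score(self) -> int:
        # Assumption: priority is the product of the two ratings.
        return LEVELS[self.probability] * LEVELS[self.impact]


def prioritize(threats):
    """Return the threats sorted from highest to lowest priority score."""
    return sorted(threats, key=lambda t: t.score, reverse=True)


if __name__ == "__main__":
    # Illustrative ratings only; every organization has to rate its own threats.
    threats = [
        Threat("Flood", "low", "high"),
        Threat("Virus attack", "high", "medium"),
        Threat("Accidental data deletion", "medium", "medium"),
    ]
    for threat in prioritize(threats):
        print(threat.score, threat.name)
```

Other scoring schemes (for example, adding the ratings or weighting impact more heavily) fit the same structure; only the `score` property would change.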

Establish the Budget:

Once you've figured out your risks, ask: What can we do to suppress them, and how
much will it cost? Can we detect a threat before it hits? How do we reduce the likelihood
of it occurring? How do we minimize its impact on the business?
The results of Step 1 should be a comprehensive list of possible threats, each with its
corresponding solution and cost. It is imperative that IT presents all of these threats to
the business operations units, so they can make an informed decision regarding the size
of the disaster recovery budget.

Develop the Plan:

The feedback from the business units will begin to shape your Disaster Recovery Plan
procedures. If, for example, they determine that the company must be up within 48 hours
of an incident to stay viable, then you can calculate the amount of time it would take to
execute the recovery plan and have the business back up in that timeframe.
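
As a rough illustration of that calculation, the sketch below adds up estimated durations for the recovery steps and compares the total with the target set by the business units. The step names and hour values are hypothetical placeholders.

```python
def fits_recovery_window(step_hours, target_hours):
    """Return (total, slack): total planned hours and hours left before the target.

    A negative slack means the plan cannot meet the target as written.
    """
    total = sum(step_hours.values())
    return total, target_hours - total


if __name__ == "__main__":
    # Hypothetical estimates, in hours, for each step of the recovery script.
    steps = {
        "replace hardware": 12,
        "reinstall operating system": 4,
        "restore database": 16,
        "apply redo logs": 6,
        "verify system": 4,
    }
    total, slack = fits_recovery_window(steps, target_hours=48)
    print(f"planned: {total} h, slack: {slack} h")
```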
The recovery procedure should be written in a detailed plan or "script." Establish a
Recovery Team from among the IT staff and assign specific recovery duties to each
member.
Define how to deal with the loss of various aspects of the network (databases, servers,
bridges/routers, communications links, etc.) and specify who arranges for repairs or
reconstruction and how the data recovery process occurs. The script will also outline
priorities for the recovery: What needs to be recovered first? What is the communication
procedure for the initial respondents? To complement the script, create a checklist or
test procedure to verify that everything is back to normal once repairs and data recovery
have taken place.

Test, Test, Test:

Once your Disaster Recovery Plan is set, test it frequently. Eventually you'll need to
perform a component-level restoration of your largest databases to get a realistic
assessment of your recovery procedure, but a periodic walk-through of the procedure
with the Recovery Team will assure that everyone knows their roles. Test the systems
you're going to use in recovery regularly to validate that all the pieces work. Always
record your test results and update the Disaster Recovery Plan to address any
shortcomings.


TOOLS FOR RECOVERY:
Several facilities are available for carrying out the recovery procedure:

   • BACKUP
     o Offline Backup (Local and remote location)
     o Online Backup (Local and remote location)
   • ERD's
   • RAID
   • RMAN


Backup
A backup is a copy of files made from one location to another. The source and the
target can be on the same storage device or on different storage devices. You can
restore your data from the backup if the original files are damaged or lost.


Backup Modes
There are two types of backup:
         • Offline
         • Online

Offline

Offline backup is taken with the database stopped - that is, the users cannot work.

An offline backup of the complete database is consistent. If you continue to work with
the database after the backup, the backup remains consistent but is no longer up to
date. In this case, you have to recover the database after you restore the backup.

Online
Online backup is taken with the database running - that is, the users can continue to
work normally. The management of database changes by the corresponding Oracle
background processes is not affected either.

Offline Vs Online

Offline
An offline backup is taken with the Oracle database and SAP R/3 System down (that is,
not running).

Benefits:

         • An offline backup is faster than an online backup.
         • There is no issue with data changing in the database during the backup.
         • Related operating system level files will be in sync with the R/3 database, if they
           are backed up at the same time.

Disadvantages:

         • R/3 is normally not available for use during an offline backup. If the “split mirror”
           technique is used, the database will be shut down for a short time, while R/3
           continues to run.
         • Buffers for R/3 and the database are flushed. This will impact performance until the
           buffers are populated again.

Online
An online backup is taken with both the Oracle database and SAP R/3 running.

Benefits:

         • SAP R/3 is available to users during backup. Having a system that is available
           24 hours a day may be critical in manufacturing or other areas where the system
           provides information to global users.
         • Buffers are not flushed. Since buffers are not flushed, there is no impact on
           performance once the backup is complete.

Disadvantages:

         • An online backup is somewhat slower than an offline backup, resulting in a
           longer backup time. Backup time is increased because processes such as SAP
           R/3 are running and competing for system resources.
         • Online performance is degraded while the backup is running.
         • Data may change in the database while it is being backed up. Because of this,
           the redo logs become critical to a successful recovery.
         • Related operating system level files may be out of sync with the R/3 database.



Offline backup using Operating System Backup Utility
The offline backup is done when SAP R/3 and the database are down. Here, we also
use the offline backup to back up other files that are needed to restore SAP R/3.
Since high-capacity tape drives are now more common, it is simple and safe to back up
the entire server. This full server backup eliminates the possibility of not backing up an
important file.

The data in the database does not change while the backup is being made, which
means that you have a static “picture” of the database and do not have to deal with the
issue of data changing while the backup is being run. With some third party applications,
you cannot back up the files unless they are closed, and this is not possible unless R/3
and the application are shut down. Therefore, an offline backup needs to be done. A “full
server” offline backup also gives you the most complete backup in the event of a
catastrophic disaster. On one tape, you have everything on the server.


Data to backup
Offline backup:
     • Operating system executable files.
     • Program files.
     • Oracle executable files.
     • Oracle data files.
     • Original and mirror log files.
     • Control files.
     • SAP executable files.

Online backup:
     • Oracle data files.
     • Original and mirror log files.
     • Control files.
     • Offline redo log files.


SAP Backup Utilities
SAP offers the utility programs BRBACKUP, BRARCHIVE and BRRESTORE. Use these
programs to implement the recommended backup concepts. Each of these programs has its
own range of functions (backup, archiving the redo log files, restore). In addition, the BR
programs have their own utility programs, BRCONNECT and BRTOOLS. These are
not called by the database user; the BR programs call them themselves for a particular
task.




Backup methods and Backup frequency
Backing up complete database

      • The frequency of image backups of the entire database should depend on the
        degree of activity in your database system. High activity in the database
        increases the number of redo log files written between complete backups. This
        increases the time required for any necessary recovery.

      • The more often you run backups, the higher the safety level of your database
        operations. If a redo log file is lost, complete recovery of the database after an
        error is often not possible. Instead, you will only be able to recover up to the gap
        in the redo log file sequence.

      • Performing frequent complete backups reduces the number of redo log files
        which must exist in order to make a complete recovery. If one of these files is
        lost, the resulting data loss can thus be kept to a minimum.

      • To perform an online backup, the database must be in ARCHIVELOG mode;
        production operations can then be continued without restrictions, unlike with an
        offline backup.

      • SAP recommends keeping several generations of complete backups and the
        corresponding redo log files. This ensures that you can still recover the database,
        even if the last complete backup is lost.

      • To enable fast, simple recovery of the database, back up at least the changed
        tablespaces and the control file after every structure change (new, changed, or
        deleted tablespaces; new data files). After reorganization with data files, always
        back up the affected tablespace if you want to use the recovery functions of
        SAPDBA. Follow the instructions for the tablespace backup below.

      • Use the SAP utility BRBACKUP to back up the database and BRARCHIVE to
        archive the offline redo log files.
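
The effect of a gap in the redo log sequence can be illustrated with a short sketch. It is not part of BRBACKUP or BRARCHIVE; the function name and the use of plain integer sequence numbers are assumptions made for the example.

```python
def last_recoverable_sequence(first_needed, available_logs):
    """Return the highest redo log sequence that can be applied without a gap.

    first_needed: sequence number of the first redo log required after the backup.
    available_logs: iterable of sequence numbers of the redo logs still on hand.
    Returns first_needed - 1 if even the first required log is missing.
    """
    have = set(available_logs)
    seq = first_needed
    while seq in have:
        seq += 1
    return seq - 1


if __name__ == "__main__":
    # Log 104 is lost, so recovery stops after 103 even though 105 and 106 exist.
    print(last_recoverable_sequence(101, [101, 102, 103, 105, 106]))
```

The shorter the interval between complete backups, the fewer logs fall between `first_needed` and the current sequence, and the less work is lost when one of them is missing.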

Backing up a Tablespace

Backing up tablespaces that are changed frequently can reduce the time required for
any necessary recovery. When a more recent backup of an intensively used tablespace
is available, fewer redo log entries will have to be processed in order to update the
tablespace. If you are able to back up the entire database on a daily basis, tablespace
backups are not necessary.

Note that tablespace backups are no replacement for frequent backups of the entire
database.

      • If you only perform tablespace backups for a long period of time, this increases
        your dependence on the archived redo log files, and therefore the risk of data
        loss if one of the redo log files is lost.

      • If tablespace backups are used, the database administrator decides what has to
        be backed up. The SAP utility BRBACKUP supports the backup operation itself,
        but does not help you decide which tablespaces to back up (exceptions:
        operations such as a tablespace extension, or SAPDBA reorganization; in these
        situations, SAPDBA recommends that you back up the tablespace immediately).


Backing up a Control file

Another type of partial backup is to back up the control file. The control file records the
physical file structure of the corresponding database. You should therefore back up the
control file after every structure change.

Mirrored control files protect you against the loss of a single control file. If data files are
damaged, an older control file that reflects the corresponding structure of the database
may be necessary for recovery. For this reason, mirroring the control files is by no
means a replacement for backing up the control file after every change in the structure of
the database.

When the SAP utility BRBACKUP is used to back up the database files, the control file is
always saved along with them. The control file is saved before and after the operation for
various administration measures with SAPDBA (for example, tablespace extension, and
reorganization of a tablespace).

Backing up the Archive files
The archive files should be backed up regularly so that, in case of any instance or
media failure, the database can be restored and recovered with the help of the archive
backup set.

In case of point-in-time recovery or incomplete recovery, the archives are the only files
available which can be used to bring the database back to a consistent state.

The archive files are used in different types of incomplete recoveries.

Backing up to disk, then to Tape

Advantages:

      • For the database, this option is the fastest. Under most situations, you
        can back up to disk faster than to tape.
      • This option allows you to make several identical backup copies (for
        example, one for onsite storage and one for offsite storage).
      • Once the backup has been made to disk, R/3 System performance is
        minimally affected. Because the tape backup is made from the disk copy,
        and not the live database, the backup to tape is not competing with
        database activity for significant system resources.
      • During an onsite disaster recovery to the same equipment, the recovery
        can be done from the on-disk backup.

Disadvantages:

      • Significant additional disk space, up to the same amount of space as the
        database, is required. This additional space makes this option the most
        expensive, especially for a large database.
      • Until the backup to tape is completed, you are vulnerable to a data center
        disaster.


Export and Import
The dump files created by the export can also be used for recovery. The advantage of
this method is that a single object or a client can also be restored through the import
process.

ERD’s:

The Emergency Repair Disk (ERD) creation procedure is integrated into Microsoft
server operating systems to guard against registry corruption. The registry is the main
database of the operating system, which holds all the information related to the hardware
and software installed on the machine. It is recommended that the ERD be updated
fortnightly.


RAID

In this section I take a look at the "single" RAID levels, meaning the "regular" RAID
levels, as opposed to multiple or nested RAID levels. Single RAID levels are by far the
most commonly used since they are simpler and less expensive to implement, and
satisfy the needs of most RAID users. Generally, only very high-end or specialty
applications require the use of multiple RAID levels.

There are eight "regular" RAID levels, which are used to varying degrees in the "real
world" today. A few levels, especially RAID 0, RAID 1 and RAID 5, are extremely
popular, while a couple are rarely if ever seen in modern systems.

RAID Level 0
The simplest RAID level, RAID 0 should really be called "AID", since it involves no
redundancy. Files are broken into stripes of a size dictated by the user-defined stripe
size of the array, and stripes are sent to each disk in the array. Giving up redundancy
gives this RAID level the best overall performance characteristics of the single RAID
levels, especially for its cost. For this reason, it is becoming increasingly popular with
performance-seekers, especially in the lower end of the marketplace.
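
To make the striping idea concrete, the sketch below maps a logical block number to a disk and an offset on that disk. It assumes a simple round-robin layout; `locate_block` and its parameters are illustrative names, and real controllers may lay out stripes differently.

```python
def locate_block(logical_block, num_disks, stripe_blocks=1):
    """Map a logical block number to (disk index, block offset on that disk).

    stripe_blocks is the stripe size expressed in blocks.
    """
    stripe = logical_block // stripe_blocks   # index of the stripe unit overall
    within = logical_block % stripe_blocks    # position inside that stripe unit
    disk = stripe % num_disks                 # stripe units rotate across the disks
    row = stripe // num_disks                 # full rows of stripe units before it
    return disk, row * stripe_blocks + within


if __name__ == "__main__":
    # Three disks, one block per stripe unit: consecutive blocks hit different disks.
    for block in range(6):
        print(block, locate_block(block, num_disks=3))
```

Because consecutive stripe units land on different disks, large transfers can be serviced by several drives in parallel, which is where the performance gain comes from.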




RAID Level 1

RAID 1 is usually implemented as mirroring; a drive has its data duplicated on two
different drives using either a hardware RAID controller or software (generally via the
operating system). If either drive fails, the other continues to function as a single drive
until the failed drive is replaced. Conceptually simple, RAID 1 is popular for those who
require fault tolerance and don't need top-notch read performance. A variant of RAID 1 is
duplexing, which duplicates the controller card as well as the drive, providing tolerance
against failures of either a drive or a controller. It is much less commonly seen than
straight mirroring.




RAID Level 2

Level 2 is the "black sheep" of the RAID family, because it is the only RAID level that
does not use one or more of the "standard" techniques of mirroring, striping and/or
parity. RAID 2 uses something similar to striping with parity, but not the same as what is
used by RAID levels 3 to 7. It is implemented by splitting data at the bit level and
spreading it over a number of data disks and a number of redundancy disks. The
redundant bits are calculated using Hamming codes, a form of error correcting code
(ECC). Each time something is to be written to the array these codes are calculated and
written alongside the data to dedicated ECC disks; when the data is read back these
ECC codes are read as well to confirm that no errors have occurred since the data was
written. If a single-bit error occurs, it can be corrected "on the fly". If this sounds similar
to the way that ECC is used within hard disks today, that's for a good reason: it's pretty
much exactly the same. It's also the same concept used for ECC protection of system
memory.
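
A minimal sketch of single-bit error correction with a Hamming code is shown below. It uses the small Hamming(7,4) code (4 data bits, 3 check bits) purely for illustration; the text above does not say which code size a RAID 2 array would use, and the function names are made up for the example.

```python
def hamming74_encode(data_bits):
    """Encode 4 data bits [d1, d2, d3, d4] as a 7-bit codeword (positions 1..7)."""
    d1, d2, d3, d4 = data_bits
    p1 = d1 ^ d2 ^ d4  # check bit for positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # check bit for positions 2, 3, 6, 7
    p4 = d2 ^ d3 ^ d4  # check bit for positions 4, 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]


def hamming74_correct(codeword):
    """Return (corrected codeword, error position); position 0 means no error."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s4 = c[3] ^ c[4] ^ c[5] ^ c[6]
    position = s1 + 2 * s2 + 4 * s4  # the syndrome is the 1-based error position
    if position:
        c[position - 1] ^= 1         # flip the faulty bit back
    return c, position


def hamming74_data(codeword):
    """Extract the 4 data bits from a 7-bit codeword."""
    return [codeword[2], codeword[4], codeword[5], codeword[6]]


if __name__ == "__main__":
    word = hamming74_encode([1, 0, 1, 1])
    damaged = list(word)
    damaged[4] ^= 1  # simulate a single-bit error at position 5
    fixed, position = hamming74_correct(damaged)
    print(position, hamming74_data(fixed))
```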

Level 2 is the only RAID level of the ones defined by the original Berkeley document that
is not used today, for a variety of reasons. It is expensive and often requires many
drives. The controller required was complex, specialized and expensive. The
performance of RAID 2 is also rather substandard in transactional environments due to
the bit-level striping.

RAID Level 3

Under RAID 3, data is striped across multiple disks at a byte level; the exact number of
bytes sent in each stripe varies but is typically under 1024. The parity information is sent
to a dedicated parity disk, but the failure of any disk in the array can be tolerated (i.e.,
the dedicated parity disk doesn't represent a single point of failure in the array.) The
dedicated parity disk does generally serve as a performance bottleneck, especially for
random writes, because it must be accessed any time anything is sent to the array; this
is in contrast to distributed-parity levels such as RAID 5, which improve write
performance by using distributed parity (though they still suffer from large overheads on
writes). RAID 3 differs from RAID 4 only in the size of the stripes sent
to the various disks.
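
Why any single disk can fail without data loss is easiest to see with XOR parity, the usual parity scheme in these levels. The sketch below is a simplified model with one byte string per disk; the function names are illustrative.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte strings together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild_missing(surviving_blocks):
    """Rebuild the one missing block (data or parity) from all surviving blocks."""
    return xor_blocks(surviving_blocks)


if __name__ == "__main__":
    data = [b"\x01\x02", b"\x10\x20", b"\xff\x00"]  # contents of three data disks
    parity = xor_blocks(data)                       # contents of the parity disk
    # Simulate losing the second data disk and rebuilding it from the rest.
    rebuilt = rebuild_missing([data[0], data[2], parity])
    print(rebuilt == data[1])
```

The same function rebuilds the parity disk itself if that is the drive that failed, which is why the dedicated parity disk is not a single point of failure.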




RAID Level 4

RAID 4 improves performance by striping data across many disks in blocks, and
provides fault tolerance through a dedicated parity disk. This makes it in some ways the
"middle sibling" in a family of close relatives, RAID levels 3, 4 and 5. It is like RAID 3
except that it uses blocks instead of bytes for striping, and like RAID 5 except that it uses
dedicated parity instead of distributed parity. Going from byte to block striping improves
random access performance compared to RAID 3, but the dedicated parity disk remains
a bottleneck, especially for random write performance. Fault tolerance, format efficiency
and many other attributes are the same as for RAID 3 and RAID 5.




RAID Level 5

One of the most popular RAID levels, RAID 5 stripes both data and parity information
across three or more drives. It is similar to RAID 4 except that it exchanges the
dedicated parity drive for a distributed parity algorithm, writing data and parity blocks
across all the drives in the array. This removes the "bottleneck" that the dedicated parity
drive represents, improving write performance slightly and allowing somewhat better
parallelism in a multiple-transaction environment, though the overhead necessary in
dealing with the parity continues to bog down writes. Fault tolerance is maintained by
ensuring that the parity information for any given block of data is placed on a drive
separate from those used to store the data itself. The performance of a RAID 5 array can
be "adjusted" by trying different stripe sizes until one is found that is well-matched to the
application being used.
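
The placement rule can be sketched as a small layout generator. The rotation order used here (parity starting on the last disk and moving one disk to the left per row) is an assumption for illustration; actual implementations use several different rotation schemes.

```python
def raid5_layout(num_disks, num_rows):
    """Return rows of labels showing where data (D) and parity (P) blocks land."""
    layout = []
    block = 0
    for row in range(num_rows):
        # Assumed rotation: the parity block shifts one disk to the left each row.
        parity_disk = (num_disks - 1 - row) % num_disks
        labels = []
        for disk in range(num_disks):
            if disk == parity_disk:
                labels.append(f"P{row}")
            else:
                labels.append(f"D{block}")
                block += 1
        layout.append(labels)
    return layout


if __name__ == "__main__":
    for row in raid5_layout(num_disks=3, num_rows=3):
        print("  ".join(row))
```

In every row the parity block sits on a different disk from the data blocks it protects, and no single disk holds all the parity.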




Recovery Manager (RMAN)

Recovery Manager (RMAN) is an Oracle utility that can back up, restore, and recover
database files. The product is a feature of the Oracle database server and does not
require separate installation.
Recovery Manager is a client/server application that uses database server sessions to
perform backup and recovery. It stores metadata about its operations in the control file of
the target database and, optionally, in a recovery catalog schema in an Oracle database.

You can invoke RMAN as a command-line executable from the operating system prompt
or use some RMAN features through the Enterprise Manager GUI.

Note that the hardware cost can increase when RMAN is used with a recovery catalog,
because the catalog needs an additional database, typically on another server.

Why Use RMAN?

Most production database systems impose stringent requirements on backup and
recovery. As a DBA in charge of backup and recovery, you must:

      • Manage the complexity of backup and recovery operations

      • Minimize the possibility of human error

      • Make backups scalable and reliable

      • Utilize all available media hardware

      • Make backups proportional to the size of transactional changes, not to the size of
        the database

      • Make recovery time proportional to the amount of data recovered

You have two basic methods for performing these backup and recovery tasks on an
Oracle release 8.0 or higher database:

      • Using operating system commands to perform backup and restore operations,
        and SQL or SQL*Plus statements to perform recovery

      • Using Recovery Manager for backup, restore, and recovery

Why use one method rather than the other?

As illustrated in the following figure, RMAN uses server sessions to perform backup and
recovery operations and stores metadata in a repository. RMAN automates backup and
recovery, whereas the user-managed method requires you to keep track of all database
files and backups. For example, instead of requiring you to locate backups for each
datafile, copy them to the correct place using operating system commands, and choose
which logs to apply, RMAN manages these tasks automatically.

RMAN Media Recovery: Basic Steps:
If possible, make the recovery catalog available to perform the media recovery. If it is not
available, then RMAN uses metadata from the target database control file. If both the
control file and recovery catalog are lost, then you can still recover the database,
assuming that you have backups of the datafiles and at least one autobackup of the
control file.

The generic steps for media recovery using RMAN are as follows:

   1. Place the database in the appropriate state: mounted or open. For example,
      mount the database when performing whole database recovery, or open the
      database when performing online tablespace recovery.

   2. To perform incomplete recovery, use the SET UNTIL command to specify the
      time, SCN, or log sequence number at which recovery terminates. Alternatively,
      specify the UNTIL clause on the RESTORE and RECOVER commands.

   3. Restore the necessary files using the RESTORE command.

   4. Recover the datafiles using the RECOVER command.

   5. Place the database in its normal state. For example, open it or bring recovered
      tablespaces online.
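
The five steps can be strung together as an RMAN command script. The Python sketch below only assembles the command text for whole-database recovery; it does not run RMAN. The function name is hypothetical, and the exact command syntax should be checked against the RMAN reference for the Oracle release in use.

```python
def rman_recovery_script(until=None):
    """Build RMAN command text for whole-database media recovery.

    until: optional (kind, value) pair for incomplete recovery, where kind is
    "TIME", "SCN" or "SEQUENCE" and value is already formatted for RMAN.
    None means complete recovery.
    """
    lines = ["STARTUP MOUNT;", "RUN {"]            # step 1: mount the database
    if until is not None:
        kind, value = until
        if kind not in ("TIME", "SCN", "SEQUENCE"):
            raise ValueError(f"unsupported UNTIL kind: {kind}")
        lines.append(f"  SET UNTIL {kind} {value};")  # step 2: stopping point
    lines.append("  RESTORE DATABASE;")            # step 3: restore the files
    lines.append("  RECOVER DATABASE;")            # step 4: apply the logs
    lines.append("}")
    # Step 5: open the database; incomplete recovery requires RESETLOGS.
    if until is not None:
        lines.append("ALTER DATABASE OPEN RESETLOGS;")
    else:
        lines.append("ALTER DATABASE OPEN;")
    return "\n".join(lines)


if __name__ == "__main__":
    print(rman_recovery_script(("SCN", 1000)))
```

Generating the script from a function keeps the command order fixed, which is one small way to reduce the risk of human error during a real recovery.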

Performing RMAN Media Recovery:

The following figure illustrates an example of RMAN media recovery. The DBA runs the
following commands:
RESTORE DATABASE;
RECOVER DATABASE;

RMAN then queries the repository, which in this example is a recovery catalog. The
recovery catalog obtains its metadata from the target database control file. RMAN then
decides which backup sets to restore, and which incremental backups and archived logs
to use for recovery. A server session on the target database instance performs the
actual work of restore and recovery.

How RMAN Searches for Archived Redo Logs During Recovery:
If RMAN cannot find an incremental backup, then it looks in the repository for the names
of archived redo logs to use for recovery. Oracle records an archived log in the control
file whenever one of the following occurs:

      • The archiver process archives a redo log

      • RMAN restores an archived log

      • The RMAN COPY command copies a log

      • The RMAN CATALOG command catalogs a user-managed backup of an
        archived log

RMAN propagates archived log data into the recovery catalog during resynchronization,
classifying archived logs as image copies. You can view the log information through:

      • The LIST command

      • The V$ARCHIVED_LOG control file view

      • The RC_ARCHIVED_LOG recovery catalog view

During recovery, RMAN looks for the needed logs using the filenames specified in the
V$ARCHIVED_LOG view. If the logs were created in multiple destinations or were
generated by the COPY, CATALOG, or RESTORE commands, then multiple, identical
copies of each log sequence number exist on disk. RMAN does not have a preference
for one copy over another during recovery: all copies of a log sequence number listed as
AVAILABLE are candidates. In a sense, RMAN is blind to the fact that the logs were
generated in different destinations or in different ways.
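
The selection rule described above can be modelled in a few lines. This is a simplified model, not RMAN's implementation: the tuple layout is a hypothetical stand-in for rows of V$ARCHIVED_LOG, and the status strings follow the wording used in the text.

```python
from collections import defaultdict


def candidate_logs(records, needed_sequences):
    """Group AVAILABLE archived log copies by log sequence number.

    records: iterable of (sequence, filename, status) tuples (hypothetical layout).
    Returns (candidates, missing): a dict mapping each needed sequence to its
    candidate filenames, and a list of needed sequences with no usable copy.
    """
    by_sequence = defaultdict(list)
    for sequence, filename, status in records:
        if status == "AVAILABLE":
            by_sequence[sequence].append(filename)
    candidates = {s: by_sequence[s] for s in needed_sequences if s in by_sequence}
    missing = [s for s in needed_sequences if s not in by_sequence]
    return candidates, missing


if __name__ == "__main__":
    # Hypothetical records: sequence 10 exists in two destinations.
    records = [
        (10, "/arch1/log_10.arc", "AVAILABLE"),
        (10, "/arch2/log_10.arc", "AVAILABLE"),
        (11, "/arch1/log_11.arc", "EXPIRED"),
        (12, "/arch1/log_12.arc", "AVAILABLE"),
    ]
    print(candidate_logs(records, [10, 11, 12]))
```

Every copy of sequence 10 is an equally valid candidate, regardless of which destination or command produced it.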
DISASTERS AND RECOVERIES:

Operating system failure:
Strategies to Recover if Operating System Fails

An operating system can fail or crash in various situations.
Possible reasons for operating system failure are:

      • ELECTRIC SHOCK
      • UPS FAILURE
      • HARD DISK FAILURE
      • PROBLEM IN ANY OTHER HARDWARE, e.g. RAM, MOTHERBOARD, VGA,
        DATA BUSES, etc.
      • ABNORMAL SHUTDOWN OF SERVER
      • HAPHAZARD INSTALLATION OF UNNECESSARY SOFTWARE
      • VIRUS ATTACK
      • PASSWORD POLICY

Recommendations/Suggestions

To protect the system from a possible crash, a number of options must be taken
into consideration:

      • To protect the system from electric shock, provide a power system dedicated
        solely to the LAN setup; a separate electric power system backed by a generator
        is a necessary prerequisite.
      • If the UPS fails to support the servers, the operating system may survive, but in
        an extreme case it can crash. In that case, use the Repair option of the
        operating system installation from the operating system CD.
      • If a hard disk failure occurs, recover the hardware first.
      • Do not install unnecessary software on the servers, and view the error logs daily
        so that problems can be rectified.
      • Avoid shutting down servers abnormally. If it happens, recover the system from
        the Last Known Good Configuration settings.
      • To recover a crashed operating system, ERDs must be prepared weekly, along
        with the full backup. ERDs also store the registry information needed for
        operating system recovery.
      • If the operating system crashes and a blue screen appears, a fresh installation of
        the operating system is recommended in the same location where the previous
        operating system resides, but with a new folder name.
      • If the operating system crashes due to a virus attack, use virus removal tools.
        To keep the system secure, update the virus definitions weekly. First of all,
        though, arrange for the installation of original anti-virus software.
      • To secure the system and protect it from any hazard, change the operating
        system's password frequently and make it difficult for others to guess. There
        must be a clear difference between user policy and administrative policy.

Database failure:
Full Restore and Recovery
The function Full restore and recovery lets you reset the database to a particular Point in
Time. You can use this function after:

   Failures caused by user errors (such as logical errors)
   Failures during maintenance (such as upgrade errors or data transfer errors)
The function Full restore and recovery supports you in the following scenarios:
Scenario 1
There is an error during an upgrade. You want to recover the database to the point in
time before the upgrade.

In the first step, SAPDBA restores the last complete backup without control files and
online redo log files, and in the second step recovers the database.
Scenario 2
A logical error occurred during normal database operations that was only recognized
later. You want to recover the database to the point in time before the error.

In the first step, SAPDBA restores the last complete backup without control files and
online redo log files, and in the second step imports the redo log files and recovers the
database.
Scenario 3
A logical error occurred during normal database operations that was only recognized
later. The structure of the database was changed between the error and the last
complete backup. You want to recover the database to the point in time before the error.

In the first step, SAPDBA restores the last complete backup without control files and
online redo log files, and in the second step recovers the structural changes ( CREATE
DATAFILE <filename> AS <filespec> ). In the third step, the redo log files are imported and the
database recovered.
Scenario 4
A logical error occurred during normal database operations that was only recognized
later. The structure of the database was changed several times between the error and
the last complete backup. You want to recover the database to the point in time before
the error.

In the first step, SAPDBA restores the last complete backup without control files and
online redo log files, and in the second step restores the last incremental backup before
the error. The changes to the structure are included in the incremental backup. In the
third step, the structural changes made between the incremental backup and the logical
error are recovered ( CREATE DATAFILE <filename> AS <filespec> ). In the fourth step, the
redo log files are imported and the database recovered.
Scenario 5
A logical error occurred during normal database operations that was only recognized
later. The database was reorganized between the error occurring and its discovery. You
want to recover the database to the point in time before the error.

In the first step, SAPDBA restores the last complete backup without control files and
online redo log files, in the second step restores the control files as they were before the
reorganization. During the reorganization the control files were backed up in the
directory <ORACLE_HOME>/sapreorg/ . In the third step, the redo log files are imported
and the database recovered. (Recovery with the option and USING BACKUP
CONTROLFILE).

Procedure
Start the function Full restore and recovery from the SAPDBA menu entries:

    Full restore and recovery
    DATABASE STATE                        NOMOUNT | MOUNT | OPEN
    RESTORE/RECOVER                       allowed | allowed
                                          Current setting
a   - Select a backup of type             Full backup | whole backup
b   - Select incremental backup run       (Only for selected full backup)
c   - Recover until                       now | point in time
d   - Show status
e   - Restore and recover
q   - Return

Choose Select a backup of type to display a list of all possible whole and full backups
(online and offline). Choose the appropriate backup for your needs. If you choose a full
backup you can choose the accompanying incremental backup with Select incremental
backup run.
Choose Recover until to enter the point in time to which you want to recover the
database. You can choose between Recover until now and a point in time of your choice
(Recover until YYYY-MM-DD HH.MM.SS ).
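
The point-in-time format shown above uses periods, not colons, between the time fields. A minimal Python sketch for validating such an entry before typing it into the menu follows. The function name and the sample timestamp are hypothetical, and the sketch says nothing about how SAPDBA itself parses the value.

```python
from datetime import datetime

# Format taken from the menu text: YYYY-MM-DD HH.MM.SS
RECOVER_UNTIL_FORMAT = "%Y-%m-%d %H.%M.%S"


def parse_recover_until(text):
    """Parse a 'Recover until' timestamp; raises ValueError if malformed."""
    return datetime.strptime(text.strip(), RECOVER_UNTIL_FORMAT)


# Hypothetical recovery target.
point = parse_recover_until("2011-03-31 14.05.00")
```

A value written with colons, such as 14:05:00, is rejected with a ValueError, which catches the most likely typing mistake.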
Choose Show status to display:
     the backup you want to use.
     the point in time to which you want to recover the database and which redo log
        files are needed for this.
     whether changes to the structure have been made between the chosen backup
        and the recovery point in time (for example, after a reorganization with data files
        or a tablespace extension).
     whether an operation with the chosen settings is allowed.

After the Full restore and recovery, the command ALTER DATABASE OPEN
RESETLOGS is always executed. For security reasons, make a backup before you open
the database again. After you open the database, the current LOG SEQUENCE
NUMBER = 1 and its operations overwrite the old redo log files. Back up the offline redo
log files before the function is executed.
Standby Database:

The standby database is supported officially by ORACLE as of Version 7.3. The
ORACLE documentation contains detailed information on this database configuration.
The following sections provide an overview of the main features of the disaster recovery
configuration.

Standby Database Configuration:

When the primary (productive) database is duplicated on a standby database, this is
referred to as a standby database configuration. The aim of this configuration is to
minimize downtime if the primary database suffers an error, since the standby database
can assume the role of the productive database in a very short time. The following
diagram illustrates the standby database concept.
Two identically configured databases operate on two identically configured hosts.

      The primary (productive) ORACLE instance is located on the first host, the
       database is open and fully available for all SQL prompts of the R/3 System. The
       primary database system is also the system which directly executes all database
       requests.

      The standby database is a copy of the primary database and is only intended as
       a recovery system.

The standby ORACLE instance on the second host is in a mounted standby state (not
opened!) and is recovered constantly. This means that the standby instance
incorporates all changes to the data of the primary instance either immediately, or with a
slight delay. To do this, the offline redo log files created in the primary database system
are imported (only the redo entries already archived by ORACLE can be imported). The
standby instance therefore 'follows' the state of the primary instance.

If it is necessary to recover the primary database system (for example, after a media
error), the standby instance can assume the functions of the primary instance in a very
short time ('takeover'). The recovery mode of the standby instance is then ended,
and the standby database is opened for online operation.

Since all data files are already located on the standby host, costly reloading of the files is
avoided. Some redo entries may still need to be applied to the files to enable all
transactions to be incorporated in the standby instance. This means that you must first
import the missing offline redo log files from the primary instance. You can then try to
archive the current online redo log file of the primary instance with the ORACLE
command ALTER SYSTEM ARCHIVE LOG CURRENT and also to import these redo
entries in the standby instance. If this command fails, the current online redo log file can
be copied to the standby host. It may be possible to directly import the redo entries from
the online redo log file.

After the takeover, a standby database needs to be set up again (usually on what was
the primary host).
NOTE:

Changes to the physical structure of the primary database (creating new files, renaming
files, changes to online redo log and control files) are not automatically incorporated in
the standby database in every case. The DBA may need to intervene depending on the
type of change.

If it is not possible to incorporate the changes automatically, the recovery process is
stopped, and the DBA needs to intervene manually to incorporate the structural change
in the standby database. After that, the recovery process needs to be started again.

Renaming of database files in the standby database is not supported by BRBACKUP.
The original names of the primary database need to be retained.

If commands are executed in the primary database with the UNRECOVERABLE option,
these changes do not appear in the redo log files. It is therefore not possible for the
standby instance to receive any information about such changes. In this case, no error
messages appear during the recovery process. They are, however, recorded in the
standby database ALERT file. You should therefore check the ALERT file regularly.

You will find more detailed information in the ORACLE documentation. The new and/or
changed SQL and SVRMGR commands are also described there as well as the
necessary init.ora parameters, which are required for working with a standby database.

Standby Database: Support by BRARCHIVE:

In the standby database scenario, transfer of the offline redo log files from the primary to
the standby instance can be controlled by the SAP utility BRARCHIVE. This is possible,
since BRARCHIVE is able to copy offline redo log files to a hard disk.
A BRARCHIVE process runs on the primary host. It copies the offline redo log files to a
mounted directory, which represents the archive directory (usually saparch ) on the
standby host. Because the copy process runs over the network, BRARCHIVE must
be used with the verify option ( -w ).

A BRARCHIVE process also runs on the standby host. This process waits for the offline
redo log files created in the mounted archive directory. If a redo log file was copied
completely, BRARCHIVE assumes the task of importing these redo entries into the
standby instance (option -m|-modify ), backing up the redo log file and deleting it if
necessary.
BRARCHIVE therefore initiates the recovery process of the standby database, in which
the offline redo log files are processed individually.

Importing the redo entries can be delayed by a few minutes (the delay is specified in the
option -m <Delay(Minutes)> ). If a logical error occurs in the primary instance (for
example, accidental deletion of a table), it is possible to prevent this error from being
imported in the standby instance.
The offline redo log files are imported with the following ORACLE command: RECOVER
STANDBY DATABASE ;

-m|-modify

Input syntax: -m [<delay>]

Default value: No delay

Delay: The offline redo log files which are created are sent to the standby database
before they are processed. This could happen with a delay time of <delay> minutes after
creating the ORACLE offline redo log file.
If there is a standby database, brarchive -m must be called in order to transfer the offline
redo log files.
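
The effect of the delay can be sketched in Python. This is an illustration of the idea only, under the assumption that a log becomes eligible for import once <delay> minutes have passed since it was created. The function, the dictionary layout, and the timestamps are hypothetical and do not reflect BRARCHIVE internals.

```python
from datetime import datetime, timedelta


def logs_ready_to_apply(logs, delay_minutes, now):
    """Return the log sequence numbers that may be imported at time `now`.

    `logs` maps a log sequence number to the creation time of the offline
    redo log file. Logs are applied strictly in sequence order, so the scan
    stops at the first log that is still inside the delay window.
    """
    cutoff = now - timedelta(minutes=delay_minutes)
    ready = []
    for sequence, created in sorted(logs.items()):
        if created > cutoff:
            break
        ready.append(sequence)
    return ready


# Hypothetical offline redo log files and their creation times.
logs = {
    203: datetime(2011, 3, 31, 11, 45),
    201: datetime(2011, 3, 31, 11, 0),
    202: datetime(2011, 3, 31, 11, 25),
}
now = datetime(2011, 3, 31, 12, 0)
ready = logs_ready_to_apply(logs, delay_minutes=30, now=now)
```

With a 30-minute delay, log 203 is held back. If a table had been deleted by mistake at 11:40, the DBA would still have time to stop the import before that change reached the standby instance.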

NOTE:

To import the redo log files, the DB user (usually SYSTEM ) must have the SYSDBA
authorization.

When redo entries are imported in which a structural change of the primary database is
recorded, the BRARCHIVE process is terminated with the following ORACLE errors:

ORA-01670: new datafile <file_id> needed for standby database recovery

ORA-01157: cannot identify data file <file_id> - file not found

ORA-01110: data file <file_id>: '<file_name>'

The structural change now needs to be incorporated manually in the standby database.
To do this, you can use the command
ALTER DATABASE CREATE DATAFILE '<file_name>';
BRARCHIVE can then be started again.

Standby Database: Backup with BRBACKUP:

Use

One main advantage of the standby database scenario is that backups do not have to be
performed in the primary (production) database system. Instead it allows the datasets of
the standby database to be backed up. This means that the database backup does not
add to the load on the host of the primary database instance. Since online operation
does not occur on the standby database, all the host's resources can therefore be made
available for the database backup. The SAP utility program BRBACKUP backs up the
standby database data.

Prerequisites

   1. The standby instance is in the recovery state and must not be opened. You can
      make an offline backup only. For the BRBACKUP backup of the standby data,
      you must set the init<DBSID>.sap parameter backup_type = offline_standby .

   2. Renaming of database files in the standby database is not supported by
      BRBACKUP. The original names of the primary database need to be retained.

   3. As for the OPS configuration, certain requirements must be satisfied for the
      connection to a remote host.

Features




      BRBACKUP logs on to the primary database instance (entries for the instance
       string in the init<DBSID>.sap parameter primary_db ) and retrieves the
       information required on the database structure. This information is entered in the
       backup logs.

      The standby database instance is stopped.

      BRBACKUP backs up the standby data.

      After the backup has been made, the original state of the standby database
       instance is recovered. If the database was in a recovery state, this state is
       restored (ORACLE commands STARTUP NOMOUNT, ALTER DATABASE
       MOUNT STANDBY DATABASE).
Activities

To create and configure a standby database you can make a BRBACKUP backup
(backup_type = offline_stop) of the production database. This is not opened later:
Instead it is transferred directly into the mount standby mode, where it takes over the
role of the standby database. The backup becomes the production system.

To recreate a standby configuration, you can make a copy of the production database.
You can then use this as the standby database ( backup_dev_type=disk_standby ).




primary_db

This parameter is only relevant if the disaster recovery configuration is being used. The
connect string to the primary database instance is defined with this parameter so that
BRBACKUP can log onto the primary host.

Default value: None

primary_db = <connect_string>

<connect_string>: Connect string ('SQL*Net database specification string') from the
standby host to the primary (production) database.

primary_db = C11

Structure-Retaining Database Copy:

Use

With BRBACKUP you can make a copy of the database files which has exactly the same
directory structure. You can use this type of database to

      generate a test system from a production system

      set up a standby database scenario

      have a database backup available which saves you the restoration process
       during a recovery. In this case the Oracle Home directory will be renamed as the
       new Oracle Home directory of the database copy (set link). The copied files are
       then the current files and the offline redo log files can be imported directly.

      change the location of the database files (file system or raw device). Database
       copy is the only way of moving database files from one file system to raw devices
       (or vice versa) by means of BRBACKUP.

Prerequisites

The following directories must be created on the target database:

      sapdata directories

      sapbackup directory

      origlogA , origlogB , mirrlogA , mirrlogB directories of the online redo log files

The corresponding subdirectories are created automatically during copying.
/oracle/c11/sapdata2/stabd_1/stabd.data1 is copied to
/oracle/c12/sapdata2/stabd_1/stabd.data1
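
The 1:1 mapping in this example can be expressed as a small Python sketch: only the database home prefix changes, and everything below it is kept. The function name is hypothetical and the sketch only computes the target name. It does not copy anything and is not how BRBACKUP is implemented.

```python
from pathlib import PurePosixPath


def copy_target(source_file, old_home, new_home):
    """Map a file under old_home to the same relative path under new_home.

    Raises ValueError if source_file is not located under old_home.
    """
    relative = PurePosixPath(source_file).relative_to(old_home)
    return str(PurePosixPath(new_home) / relative)


# Paths taken from the example in the text.
target = copy_target(
    "/oracle/c11/sapdata2/stabd_1/stabd.data1",
    "/oracle/c11",
    "/oracle/c12",
)
```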

Since this type of copy is a 1:1 copy, no software compression may take place.



To make the copy of the database you have to define the name of the new
Database_Home directory (of the database copy) in the init<SID>.sap profile parameter
new_db_home. In addition to this set the parameter backup_dev_type to disk_copy or
call up BRBACKUP with the relevant command option brbackup -d|-device disk_copy.



Under Windows NT, the sapdata directories can be distributed across several drives.
When you make the copy, you can retain this distribution by specifying the appropriate
target drives (see brbackup -m|-mode).

-m|-mode:

Input syntax:
-m all|<tablespace_name>|all_data|incr|full|<file_ID>|<file_ID1>-<file_ID2>|<generic_path>|<object_list>|sap_dir|ora_dir

Default: all

You can perform a full database backup or back up specific tablespaces or files (whether
part of the database or not). You can create object lists.

You can specify what you want to back up:

       all : Back up the complete database

In a Structure Retaining Database Copy ( backup_dev_type = disk_copy or
disk_standby ) you can retain the distribution of the sapdata directories to different
drives (only for Windows NT).

The files of the drive d are copied to drive k , the files of the drive e are copied to the
drive l and the files of the drive f are copied to the drive m .

brbackup -m all,d:=k:,e:=l:,f:=m:
If you do not specify a target drive, the files will be copied to the directory defined in the
parameter new_db_home.

new_db_home

This parameter must be set if you want to make a database copy using BRBACKUP
(backup_dev_type = disk_copy). The name of the home directory of the database copy
is defined in new_db_home .

Default: none

new_db_home = <dir>

NT : <dir> is the new SAP database directory: <drive>\oracle\<SID>

This directory must also contain the sapbackup directory.

Under Windows NT, the sapdata directories can be distributed across several drives.
When making a database copy, a target drive can be specified for each drive (see
-m|-mode). If you do not specify a target drive, the files will be copied to the directory
defined in this parameter.
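
The drive assignment syntax for Windows NT ( d:=k: ) can be illustrated with a short Python sketch. It is an assumption-laden illustration, not BRBACKUP's own parser: the function names and the sample file path are hypothetical, and the fallback to new_db_home is left to the caller.

```python
from pathlib import PureWindowsPath


def parse_mode(mode):
    """Split a -m value into backup objects and a source-to-target drive map."""
    objects, drive_map = [], {}
    for item in mode.split(","):
        if ":=" in item:
            source, target = item.split(":=", 1)
            drive_map[source.lower() + ":"] = target.lower()
        else:
            objects.append(item)
    return objects, drive_map


def remap_drive(path, drive_map):
    """Return the path on its target drive, or None if no drive is assigned.

    None means the caller would fall back to the new_db_home directory.
    """
    p = PureWindowsPath(path)
    target_drive = drive_map.get(p.drive.lower())
    if target_drive is None:
        return None
    return str(PureWindowsPath(target_drive + "\\", *p.parts[1:]))


objects, drive_map = parse_mode("all,d:=k:,e:=l:,f:=m:")
# Hypothetical data file on drive d:
copied = remap_drive(r"d:\oracle\C11\sapdata1\btabd_1\btabd.data1", drive_map)
```

The directory structure below the drive letter is unchanged, which is what makes the copy structure-retaining.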

      <tablespace name> : Back up the files of one tablespace.

      all_data : Back up the files of all tablespaces, except for pure index tablespaces.

      incr: Incremental backup with RMAN.

      full : Full backup of the database, which can serve as the basis for incremental backups.

      <file_ID> : Back up a data file with the specified ORACLE file ID as file ID.
       Control files can be addressed with the file ID 0.
       Online redo log files can be addressed using the file ID 0<n>, where <n> is the redo log
       group number. To address all the online redo log files, use file ID 00.

      <file_ID1>-<file_ID2> : Back up the files specified in the file ID interval. The
       specified file IDs must be known in the database.

      <generic_path> : Enter a complete path to back up the required database file,
       non-database file, or directory. Specify a generic path to back up all the database
       data files whose name starts with that path. In this case, the path must contain at
       least the ORACLE_HOME directory and an additional generic specification (for
       example, sapdata<n> ) in the path.

When you specify a directory to be backed up its contents and the names of the
subdirectories are backed up. However the directory structure and the content of the
subdirectories are not backed up.

      <object list> : You can specify a list of tablespaces or files, or combine the key
       word all with an object list. The individual objects are separated by commas
       (commas only, no blanks!).

      sap_dir : With this option, you can automatically determine and save all the files
       of the SAP environment. This means that the following directory trees are saved:
       /sapmnt/<SAPSID> , /usr/sap/<SAPSID> and /usr/sap/trans . If possible, these
       directories should be backed up separately. You can only use this option when
       saving to tape without verifying the backup.
      ora_dir : With this option, you can automatically determine and save all the
       non-database files of the ORACLE environment. This means that the directory trees
       are saved in <ORACLE_HOME> (except for the sapdata<n> and saplog- or
       origlog/mirrlog directories). If possible, save these directories in a separate
       backup run. You can only use this option when saving to tape without verifying
       the backup.
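
The file ID conventions listed above for -m ( 0 for control files, 0<n> for an online redo log group, 00 for all online redo log files) can be summarized in a small Python sketch. The function and the category labels are hypothetical and only restate the rule from the text. They are not taken from BRBACKUP.

```python
import re


def classify_file_id(token):
    """Classify a -m file ID according to the conventions described above."""
    if not re.fullmatch(r"\d+", token):
        raise ValueError(f"not a file ID: {token!r}")
    if token == "0":
        return ("control_files", None)
    if token == "00":
        return ("all_online_redo_logs", None)
    if token.startswith("0"):
        # 0<n>: <n> is the redo log group number
        return ("online_redo_log_group", int(token[1:]))
    return ("data_file", int(token))


kinds = [classify_file_id(t) for t in ["0", "00", "03", "7"]]
```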

For UNIX systems: Start BRBACKUP to save the SAP/ORACLE environment ( brbackup
-m sap_dir | ora_dir ) under user root , as otherwise you will not have the authorizations
required for the directory to be saved.

Saving and restoring under root also has the advantage that you can be sure that the
settings for the user and authorizations for the files and directories will be kept after
restoring.

Parameters in init<DBSID>.sap : backup_mode.

If you want to repeatedly back up several tablespaces and/or files, it may be more
effective to configure parameter backup_mode of the initialization profile accordingly.



backup_mode

This parameter is used by BRBACKUP and BRRESTORE to determine the scope of the
backup/restore activity.

Default: all

Possible values:

all : Back up the entire database using BRBACKUP or restore all tablespaces (without
control files or redo log files) using BRRESTORE.

Hardware Failure

The risk of hardware failure is the most commonly talked-about reason to perform
backups. Indeed, nothing will jolt someone into realizing the importance of backups more
than an unrecoverable hard disk failure. Since the hard disk stores your main programs
and data, it is the hardware whose failure hurts the most. It is also what gets the most
attention, and rightly so.

To protect against data loss, you should have configured a RAID level as described above. Also,
for safety you should have two array controllers, with the SCSI disks distributed
equally between them.

It is also important that your hard disks are hot-pluggable (i.e. in case of a disk failure, the
disk can be replaced while the system is running).

There must be at least two network cards to protect the system against network card
failure.

                                       THE END

				