Architecture and Infrastructure

Integration of RMAN and RAC

Ashok Singh, Fastenal Company

During the last few years Real Application Clusters (RAC) have seen significant growth throughout the world,
primarily because business applications need high availability and any downtime means a substantial loss in
revenue and reputation. RAC combined with Data Guard is the obvious choice for any business looking
for a Maximum Availability Architecture (MAA). The implementation of RAC has been greatly simplified by
improvements in real-time Cache Fusion and shared storage. Starting with Oracle 9i, RAC offers
OCFS (Oracle Cluster File System) for Linux and Windows users, and Oracle 10g further simplified shared
storage with the introduction of Automatic Storage Management (ASM). Any OCFS or ASM implementation
carries the restriction that its backups must be taken with RMAN, but this should not be considered a
limitation because RMAN is very robust and integrates seamlessly with the Oracle server. RMAN has seen
significant improvements since its release with Oracle 8, about seven years ago.

About RAC:

As mentioned, RAC is important for any implementation that counts high availability and scalability
among its key requirements. Oracle 10g with ASM is now the preferred choice for many implementations because
of its I/O balancing, ease of administration and low cost. Like any non-clustered
database, a RAC database has datafiles, control files and a server parameter file, but it maintains undo
segments and redo log files per instance. Although the redo log files are used by individual instances, they
should be placed on a shared location so that the other nodes can read them during
instance recovery or database backups.

In any RAC environment all files that need to be backed up, i.e. datafiles, control files, archived logs and
spfiles, are stored on shared storage, which makes it somewhat similar to a non-RAC implementation. In fact,
the backup and recovery concepts are the same as for any single-instance Oracle database. The only difference
is in the implementation of the common shared location for the archived log files. It is highly recommended
that this location be placed on common shared storage accessible by all nodes, so that backups execute
successfully and Oracle can perform recovery.

[Figure: three RAC nodes sharing the archive destination /arc_dest. Callout: a common shared OCFS or ASM
location, which will enable all the nodes to read and write at the time of backup and recovery.]

The above figure shows that the three nodes of this RAC implementation share a common location, /arc_dest.
This shared location could be OCFS or ASM. The parameters that need to be set here are:

*.log_archive_dest_1='LOCATION=/arc_dest'    # common location for archived logs for all nodes
*.log_archive_format='arch%s_%t_%r.log'      # %t identifies the redo thread, i.e. the instance
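
To make the role of %t concrete, here is a minimal shell sketch that expands the format above by hand. It
assumes the usual meaning of the placeholders (%s is the log sequence number, %t the redo thread number, %r
the resetlogs ID); the helper name arch_log_name and the resetlogs ID are made up for illustration.

```shell
#!/bin/sh
# Sketch: expand the log_archive_format 'arch%s_%t_%r.log' by hand.
# %s = log sequence number, %t = redo thread number, %r = resetlogs ID.
# arch_log_name is a hypothetical helper, not an Oracle utility.
arch_log_name() {
    thread=$1
    sequence=$2
    resetlogs_id=$3
    printf 'arch%s_%s_%s.log\n' "$sequence" "$thread" "$resetlogs_id"
}

# Sequence 13 archived by two different instances lands in the same
# shared directory under two different names (resetlogs ID is made up).
arch_log_name 1 13 576837452
arch_log_name 2 13 576837452
```

Because every thread numbers its own sequences, the thread number in the file name is what keeps two instances
from overwriting each other's archived logs in the shared destination.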

In addition to these files, two other files are critical to any RAC implementation: the
Oracle Cluster Registry (OCR) and the voting disk. At present RMAN cannot back up these files,
but Oracle 10g R2 provides the option of mirroring them. The OCR is the most critical component of RAC,
and CRS (Cluster Ready Services) backs it up automatically every four hours.

Shared Storage:

                                                                                                   Paper 212

Automatic Storage Management (ASM) is another revolutionary feature of Oracle 10g. It is intended
to help DBAs manage either a Real Application Clusters (RAC) database or a non-RAC database, but it
particularly helps RAC installations because it simplifies shared storage, which until Oracle 9i had to be
provided by OCFS or raw devices. Choosing shared storage was difficult on platforms other than Linux or
Windows, as OCFS was only available on those operating systems. Since ASM is available on all operating
systems and performs as well as raw devices, it is now the easy choice for RAC deployments.
One can still use OCFS, especially for shared binaries or trace files, but ASM is the obvious choice for
datafiles, control files and log files.

Because ASM is very easy to install and requires hardly any resources or administration, DBAs and system
administrators appreciate it. Without ASM, plenty of manual intervention and downtime
is required to stripe data evenly across files if the layout was not planned well in advance.

Starting with Oracle 10g, Oracle can run two types of instances, RDBMS and ASM. The type is controlled by the
init.ora parameter instance_type. ASM instances do not have a data dictionary, and a 64MB SGA is a good
starting point. It may be prudent to install ASM in its own ORACLE_HOME so that patches or upgrades can be
applied separately, since ASM and its client databases do not have to be on the same release.

Some of the features of ASM are listed below:

        • ASM is primarily designed to simplify the life of storage admins, system admins and DBAs.
        • It is highly optimized and can be used for any clustered or non-clustered database. This is why
          the Oracle Cluster Synchronization Services daemon is also present on a non-clustered
          database server: it supports the ASM disks.
        • Oracle Managed Files (OMF) brings more value to ASM, as deleting files in ASM is not as simple
          as on other file systems, but with OMF files are deleted automatically.
        • A database can store its files on any one of the available options, i.e. raw devices, OCFS,
          ASM or any other file system, or on a combination of these.
        • It is transparent to end users and developers, but can only be backed up using RMAN, which
          is not a hindrance as RMAN is the preferred and recommended backup tool for any
          Oracle database.
        • ASM is another step towards grid computing. It provides high performance because it supports
          direct I/O and asynchronous I/O.
        • No additional skills are needed to manage an ASM instance. Each node in a RAC
          environment needs its own ASM instance. The ASM instance has specific init.ora
          parameters that start with asm_. These parameters can only be set in an ASM instance.

RMAN Architecture:

RMAN was introduced with Oracle 8 and has seen significant improvements with every release since.
Before Recovery Manager, people used hot backups, which had their own inherent problem
of split blocks. RMAN is all the more valuable because it comes free with the Oracle software and
its integration with the Oracle server is seamless. RMAN is a client tool, with both command-line
and GUI interfaces, that does its work through PL/SQL calls to server processes running on the database
server and performs a variety of backup and recovery operations.

Typically, before the introduction of RMAN we used to take backups by placing the tablespace in backup
mode (alter tablespace ... begin backup), copying the datafiles with an OS utility, and then putting the
tablespace back in normal mode (alter tablespace ... end backup). If blocks change during this time, there is
a substantial increase in redo generation to protect the backup against 'split blocks'. This user-managed
backup is still fully supported and can be used.

To protect against split blocks, Oracle in backup mode writes the complete image of a changed block to the
redo log, instead of only the changed records as in normal operation. Split blocks arise because the operating
system and Oracle use different block sizes: in hot backup mode we typically copy with an OS utility such as
copy, dd or cpio, while Oracle performs its I/O in units of db_block_size.

Moreover, these backups were not intelligent enough to keep track of changed blocks, so each backup
copied every block regardless of whether it had changed since the previous backup. This consumed a great deal
of time and resources.

With the introduction of RMAN all these issues were addressed, and tablespaces need not be put in hot
backup mode anymore. RMAN uses an Oracle server process (not an OS utility) to read the Oracle blocks,
and it checks whether each block is fractured by comparing control information stored in the header
and footer of the block. If a fractured block is detected, the server process rereads the block, which
eliminates the need to put tablespaces in hot backup mode and the extra redo that mode generates.

RMAN writes its backups in a proprietary format called a backup set. A backup set can have many backup
pieces, and one backup task (the backup of a tablespace, a database or archived logs) can produce more
than one backup set. Only RMAN can read these backup sets when performing recovery. Backups of
datafiles and archived logs cannot be in the same backup set. By default, a backup set
contains 4 or fewer datafiles or 16 or fewer archived logs, and RMAN creates all backups as backup sets.
When backing up to tape, RMAN can only use backup sets. The size of a backup set can be limited with
MAXSETSIZE. RMAN backs up only used blocks and never copies never-used
blocks, which reduces the overall size of the backups. Starting with Oracle 10g, one can use binary
compression to further compress the backups of datafiles and archived redo log files by merely adding AS
COMPRESSED BACKUPSET to the backup command. No special commands are required to restore a
compressed backup set, as RMAN is aware of the compression. Compression is only
recommended for disk backups, and one should always keep in mind the extra CPU cycles required during backup
and restore. In a RAC environment, where the backup can be distributed across multiple hosts, the
impact is minimal.
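
As a small illustration of the AS COMPRESSED BACKUPSET clause, the sketch below emits the text of an RMAN
command file with or without compression. The helper name make_backup_cmds is hypothetical; only the line it
prints is meant for RMAN, and nothing here contacts a database.

```shell
#!/bin/sh
# Sketch: emit an RMAN command file for a disk backup, optionally compressed.
# make_backup_cmds is a hypothetical helper; it only prints RMAN text.
make_backup_cmds() {
    compressed=$1          # "yes" to request binary compression
    if [ "$compressed" = "yes" ]; then
        echo "BACKUP AS COMPRESSED BACKUPSET DATABASE PLUS ARCHIVELOG;"
    else
        echo "BACKUP AS BACKUPSET DATABASE PLUS ARCHIVELOG;"
    fi
}

# Print the compressed variant; redirect to a file to use it as a cmdfile.
make_backup_cmds yes
```

The output can be saved to a file and passed to the rman client with its cmdfile option. The restore side
needs no matching switch, since RMAN recognises a compressed backup set on its own.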

In addition to backup sets, RMAN can also create image copies (i.e. bit-for-bit copies) and use them for
recovery, but only on disk. If a copy is made outside RMAN, it has to be registered with the repository
using the catalog command. RMAN also checks for block-level corruption while making its copies, which an OS
utility is unable to do. Since an image copy is not in a proprietary format, RMAN need not be used to
restore it.

channel ORA_DISK_1: datafile copy complete     <- image copy
channel ORA_DISK_1: backup set complete        <- backup set

RMAN prefers to use image copies when recovering, as the recovery will be faster.

RMAN backups can be incremental or full. Incremental backups can only be taken of datafiles and capture
the blocks changed since a base backup (level 0). This results in a
smaller backup set unless every block has changed. The limitation is that RMAN has to read every
block to select the blocks to copy, by comparing the SCN
(System Change Number) in the block header with the SCN of the parent backup, i.e. level 0. This can be an
issue for large databases. To overcome it, Oracle 10g introduced a new background process called
Change Tracking Writer (CTWR), which records all changed blocks in a file; an incremental
backup then reads only the blocks listed in this file. This is discussed further in the configuration
section.
Incremental backups become very useful after objects have been created or loaded with NOLOGGING: those
changes are not recorded in the redo logs, so they cannot be recovered from redo, whereas an incremental
backup captures the changed blocks. Interestingly, incremental backups can also be taken of databases running
in NOARCHIVELOG mode. Incremental backups can be cumulative or differential (the default). Cumulative
backups are more efficient when performing recovery. Starting with Oracle 10g, only incremental levels 0 and 1
are recommended; the higher levels are deprecated.

Which Files?

We can back up and restore the control file, spfile, archived redo log files and datafiles using RMAN. Even
though a RAC environment has more than one instance, the spfile, control file and datafiles are common to all
instances and should be placed on shared storage. Each instance has its own thread of redo logs, which should
also be on shared storage so that the other instances can access them during backup and recovery. It is worth
mentioning that archived logs have to be on a file system; the archiver cannot archive to a raw
device.

Currently, in an Oracle RAC environment, RMAN does not support backing up the Oracle Cluster Registry (OCR),
the voting disk, or the listener.ora and tnsnames.ora files, and one has to depend on other methods to
back up these files. These files do not change very often and can be backed up with any OS utility.
Starting with Oracle 10gR2, the OCR and voting disk can be mirrored.

Enabling Block Change Tracking for Fast Incremental Backups:

As mentioned earlier, we can now track changed blocks to make incremental backups faster. The tracking
file is very small compared to the database. RMAN keeps change records for up to 8 previous backups in it.
A 60MB file is enough for a 1 TB database with two threads of redo, i.e. two nodes in a RAC. Please refer to
Metalink Note 306112.1 for more details about sizing this file. RMAN automatically uses this file for
incremental backups, but cannot back up the file itself at this time.

SQL> alter database enable block change tracking using file '/usr01/oradata/WMST/WMST.ctwr';

The following confirms that the background process has started.

racnode1:oracle> ps -ef |grep -i CTWR |grep -v grep
oracle 1512 1 0 09:06 ?          00:00:01 ora_ctwr_WMST1

Please note that /usr01 should be a shared location so that all the nodes can write to this file.
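
The ps check above can be wrapped in a small function for use in monitoring scripts. This is a sketch only:
ctwr_running is a hypothetical helper that reads ps -ef output on standard input, and it assumes the
ora_ctwr_<SID> process name seen in the listing above.

```shell
#!/bin/sh
# Sketch: decide from `ps -ef` output whether the CTWR background process
# of a given instance is running. ctwr_running is a hypothetical helper.
ctwr_running() {
    sid=$1
    # Match ora_ctwr_<SID> at end of line so WMST1 does not match WMST10.
    grep -q "ora_ctwr_${sid}\$"
}

# Typical use on a live node:  ps -ef | ctwr_running WMST1 && echo "CTWR is up"
# Here the sample line from the listing above stands in for ps output.
sample='oracle 1512 1 0 09:06 ?          00:00:01 ora_ctwr_WMST1'
if printf '%s\n' "$sample" | ctwr_running WMST1; then
    echo "CTWR is up for WMST1"
fi
```

In a RAC the check would be run on every node, since each instance starts its own CTWR process against the
same shared tracking file.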

column filename format a30
select * from v$block_change_tracking;

STATUS     FILENAME                            BYTES
---------- ------------------------------ ----------
ENABLED    /usr01/oradata/WMST/WMST.ctwr       11599

Block change tracking has brought more value to incremental backups by making them faster.

Flash Recovery Area (FRA)

This should not be confused with the Flashback Database feature of Oracle 10g. The FRA holds all the
important backup and recovery related files on disk. When configured with a good retention policy, the FRA
automatically handles space management in this area. The FRA takes advantage of the continuously
falling price of disks.

To configure the FRA, issue the following commands from a sqlplus session. Set the size first, then one
location: either a file system path or, if using ASM, a disk group.

alter system set db_recovery_file_dest_size=10G scope=both sid='*';
alter system set db_recovery_file_dest='/usr06/FRA' scope=both sid='*';
alter system set db_recovery_file_dest='+BACKUPA' scope=both sid='*'; -- if using ASM

Depending upon the retention policy, RMAN declares a backup OBSOLETE, and with the FRA Oracle handles
obsolete backups automatically. If the FRA is not used, we have to handle this manually. Once the files in the
FRA have been copied to tape, they are internally placed on a 'files to be deleted' list
(v$recovery_file_dest.space_reclaimable). Oracle automatically removes such files from the FRA, but only when
space is required by future backups. Once you copy the FRA to tape (backup recovery area), all of that space
is reclaimable and will be consumed by RMAN whenever required. The primary objective is to keep the backups on
disk so that minimal time is lost during recovery. The following query shows that the FRA has been backed up
to tape using the command backup recovery area, so all of its space can be reused if required.

COLUMN name FORMAT a20
SELECT * FROM v$recovery_file_dest;

NAME                 SPACE_LIMIT SPACE_USED SPACE_RECLAIMABLE NUMBER_OF_FILES
-------------------- ----------- ---------- ----------------- ---------------
/usr06/FRA           10737418240 4091265024        4091265024              16
1 row selected

select * from v$flash_recovery_area_usage;

FILE_TYPE    PERCENT_SPACE_USED PERCENT_SPACE_RECLAIMABLE NUMBER_OF_FILES
------------ ------------------ ------------------------- ---------------
CONTROLFILE                   0                         0               0
ONLINELOG                     0                         0               0
ARCHIVELOG                  .94                        .2               8
BACKUPPIECE               12.26                     12.26              12
IMAGECOPY                 50.39                     50.39              13
FLASHBACKLOG                  0                         0               0
The FRA cannot be stored on a raw device. In RAC, the FRA should be on a cluster file system or ASM, and its
location and quota must be the same on all instances. If no LOG_ARCHIVE_DEST_n is set,
LOG_ARCHIVE_DEST_10 is automatically set to the FRA and archived logs are sent there, as shown below:

SQL> archive log list ;
Database log mode           Archive Mode
Automatic archival         Enabled
Archive destination       USE_DB_RECOVERY_FILE_DEST
Oldest online log sequence 11
Next log sequence to archive 13
Current log sequence        13

Incrementally Updated Backups: This is another Oracle 10g feature to minimize recovery time. It rolls
forward an image copy of the level 0 backup by applying the changes from each level 1 backup to it.
At recovery time, this image copy plus the redo generated after the last incremental
backup completes the restore in a shorter time than the previous practice of level 0 + level 1 +
archived logs. Incrementally updated backups (IUB) require image copies and tags.

IUB uses the tag to apply incremental backups to the specified image copy and roll it forward. The
RECOVER COPY command only updates an image copy and has nothing to do with database recovery. It is
recommended to configure RMAN to create image copies when backing up to disk.


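RMAN> RUN {
        RECOVER COPY OF DATABASE WITH TAG 'INCR_TAG';
        BACKUP INCREMENTAL LEVEL 1 FOR RECOVER OF COPY WITH TAG 'INCR_TAG'
          DATABASE plus archivelog delete input;
      }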

The above syntax gives a warning (no parent backup or copy of datafile 1 found) on its first run if no level 0
backup exists with the tag 'INCR_TAG', but it creates a level 0 backup. On the second run it again gives a
warning (no copy of datafile 1 found to recover), because no level 1 backup is yet available to apply to the
level 0 image copy. From the third run onwards it executes without warnings and produces a ready-to-use
image copy.

If one decides to start with a level 0 again then simply change the TAG.

The FRA and incrementally updated backups both take advantage of the continuously decreasing price of
disk: keeping backups readily available on disk makes recovery faster.

It is very simple to back up the FRA, which is done with only one command:

RMAN> backup recovery area;

It would have been very difficult to write a script to handle the work done by the scheduler. The
DBMS_SCHEDULER package was introduced with Oracle 10g and replaces the earlier DBMS_JOB, which had plenty of
limitations. In fact, the limitations of DBMS_JOB became more obvious after the introduction of
DBMS_SCHEDULER. The package also supports running UNIX shell scripts. These scripts should have the
appropriate permissions and be placed in a shared directory so that any active node can execute them. A job
also has an instance stickiness setting (instance_stickiness), and depending upon this setting and the
workload on the system the scheduler starts the PL/SQL or UNIX shell script on the appropriate node.

RMAN in a Different Role:
Transportable tablespaces have played a significant role in moving data from one database to another. They
were introduced with Oracle 8i, and 10gR2 removes the earlier limitations. We can now move
tablespaces from one operating system to another, provided the platforms share the same endianness, and we
need not make the tablespaces read-only. All this online movement of data is done by RMAN.

RMAN and Enterprise Manager:

RMAN has been tightly integrated with Enterprise Manager (EM). The use of EM is encouraged as it is very
simple to use. The most promising feature is the Oracle Backup Strategy, which you can start
using for your database within minutes. It uses block change tracking, incremental
backups and incrementally updated backups (image copies) and schedules the backup as
requested.
OEM is likely to become more popular because, starting with Oracle 10gR2, it has an SGA Direct
Attach feature that allows Oracle Enterprise Manager to attach directly to the SGA and query relevant
performance information, even when you cannot log into the database yourself. In prior releases this
could be done with arcane commands and "oradebug", but typically only at the request of Oracle Support to
provide them with diagnostic information. Now you can use the same technique through Enterprise Manager to
diagnose performance or "hung database" issues yourself.

Automatic Allocation of Channels in RAC

To complete a backup or restore within a stipulated time, we allocate multiple channels, each of
which executes a different part of the work. A job can now complete even if a channel fails
during execution: the next available channel picks up the failed channel's work after completing
its own. This is very significant in a RAC environment. When an instance fails during a backup,
the backup is completed using another channel from any active instance of the database. The errors can
be viewed in v$rman_output.

Starting with Oracle 10gR2, RAC can automatically allocate channels depending upon the workload on the
instances.

Backup Examples:
Once all the configuration is complete, we are ready to script our backups. It is very important to design a
good backup plan and test it regularly. A backup plan designed for one database may well not be
appropriate for other databases, because the design has to take into account the redo generation,
availability requirements, size and importance of the database to the business. Since we are talking about
high availability, it is assumed that the database is very important to the business.

A typical backup plan could be an incremental level 0 backup on Sunday, a cumulative level 1 backup every
day and a mid-day differential level 1 backup. The advantage is that the mid-day differential backup only has
to capture the changes made since the most recent level 1 backup. The archived logs can be backed up more
frequently, and the input can be deleted if space is an issue.
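
The plan above can be expressed as a small selector that prints the RMAN command for a given day and time
slot. This is a sketch under stated assumptions: pick_backup_cmd and the slot names "morning" and "midday" are
made up for illustration, and the day is expected in the form printed by date +%a in the C locale.

```shell
#!/bin/sh
# Sketch: choose the RMAN backup command for the weekly plan described above.
# pick_backup_cmd DAY SLOT
#   DAY  : day of week as printed by `date +%a` in the C locale (Sun, Mon, ...)
#   SLOT : "morning" or "midday" (hypothetical slot names)
pick_backup_cmd() {
    day=$1
    slot=$2
    if [ "$slot" = "midday" ]; then
        echo "BACKUP INCREMENTAL LEVEL 1 DATABASE;"             # differential
    elif [ "$day" = "Sun" ]; then
        echo "BACKUP INCREMENTAL LEVEL 0 DATABASE;"             # weekly base
    else
        echo "BACKUP INCREMENTAL LEVEL 1 CUMULATIVE DATABASE;"  # daily cumulative
    fi
}

pick_backup_cmd Sun morning
pick_backup_cmd Wed morning
pick_backup_cmd Wed midday
```

A wrapper script could write the chosen line to a command file and hand it to rman, which keeps the schedule
logic in one place instead of spreading it over several scheduler jobs.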

"Backup plus archivelog" uses a different algorithm: this command backs up the database and all the
archived logs generated by all the instances of RAC during the backup.

"Regardless of the High Availability feature of RAC no backup strategy is complete unless the recovery has
been extensively practiced, tested and properly documented."

While performing recovery, RMAN chooses the backup it judges most efficient. For example,
RMAN prefers an incremental backup over archived logs, because applying all the changes to a block at once
is much more efficient than applying individual changes to the block from multiple
archived log files.

A Typical Configuration


A Backup script

Script Name
# This script calls one of two RMAN command files, depending upon the node of execution.
# It is finally called by dbms_scheduler and will automatically execute from one of the available nodes.
# Currently, it uses the control file as the recovery catalog for HA.
# One should catalog the backups to resync with the recovery catalog.
export ORACLE_BASE=/oracle/app/oracle
export ORACLE_HOME=$ORACLE_BASE/product/10g
export LD_LIBRARY_PATH=$ORACLE_HOME/lib:/lib:/usr/lib:/usr/local/lib
export ORA_NLS33=$ORACLE_HOME/ocommon/nls/admin/data
export LIBPATH=/oracle/app/oracle/product/10g/lib
export LGDIR=/tmp
export SCRIPTS=/home/oracle/final/bk
# The database name is the first argument, converted to upper case
SID=`echo $1 | tr '[a-z]' '[A-Z]'`
host=`hostname`
# Pick the local instance from the node we are running on. The log name
# needs ORACLE_SID, so it is set only after the instance is known.
if [ "${host}" = "racnode3" ]; then
  export ORACLE_SID=${SID}1
  export LGNAME=${ORACLE_SID}_logs`date +%j%H`.log
  rman target / msglog ${LGDIR}/${LGNAME} cmdfile ${SCRIPTS}/
fi
if [ "${host}" = "racnode4" ]; then
  export ORACLE_SID=${SID}2
  export LGNAME=${ORACLE_SID}_logs`date +%j%H`.log
  rman target / msglog ${LGDIR}/${LGNAME} cmdfile ${SCRIPTS}/
fi
# RMAN-00569 heads every RMAN error stack
grep 'RMAN-00569' ${LGDIR}/${LGNAME} > /dev/null 2>&1
if [ $? -eq 0 ]; then
  mail -s "Backup Failed on $ORACLE_SID" < ${LGDIR}/${LGNAME}
else
  mail -s "Backup Successful on $ORACLE_SID" < ${LGDIR}/${LGNAME}
fi
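
The success test at the end of the script can be isolated into a function, which makes it easy to try out
without running RMAN. This is a sketch: backup_status is a hypothetical helper, and the extra NO_LOG result
(for a log that was never written) is an assumption added here, not part of the script above.

```shell
#!/bin/sh
# Sketch: classify an RMAN message log after a run.
# backup_status is a hypothetical helper. RMAN-00569 is the banner line
# RMAN prints above an error stack.
backup_status() {
    logfile=$1
    if [ ! -r "$logfile" ]; then
        echo "NO_LOG"        # rman never wrote a log: treat as a failure too
    elif grep -q 'RMAN-00569' "$logfile"; then
        echo "FAILED"
    else
        echo "OK"
    fi
}

# Demonstration with a throwaway log file (the contents are made up).
demo_log=$(mktemp)
printf 'RMAN-00569: ==== ERROR MESSAGE STACK FOLLOWS ====\n' > "$demo_log"
backup_status "$demo_log"
rm -f "$demo_log"
```

Treating a missing log as its own case avoids reporting success when rman could not start at all, for
example because ORACLE_SID was never set on an unexpected host.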

Script Name
run {
allocate channel t1 type 'SBT_TAPE' parms
allocate channel t2 type 'SBT_TAPE' parms
backup incremental level 0
filesperset 4
format 'LEVEL0%d%s%p%T'
database
plus archivelog format 'logs_%d_%T_%u_%s' delete input;
release channel t1;
release channel t2;
}

Scheduling the above script using DBMS_SCHEDULER to execute every day at 6AM
  ( job_name=>'LEVEL0_BACKUP',
    repeat_interval=>'FREQ=DAILY; byhour=6;byminute=0;bysecond=0;',
    comments=>'Daily Level 0 Incremental Backup');

Cloning a database using the duplicate command: Currently it is not possible to duplicate directly to a RAC
database, so the clone is created as a single-instance database and then converted into a clustered database.
The steps required are listed below.

            • Create and copy the pfile to the desired host.
            • Make the required changes:
                      - mkdir -p …udump ,.. bdump,….,cdump
                      - Add the auxiliary instance to tnsnames.ora
                      - Depending upon the config, add it to the listener.ora file
                      - Create a password file
            • Startup nomount using this new pfile; the auxiliary instance is ready
            • From one of the RAC nodes:
                      - rman target / auxiliary sys/xxxx@yyy
                      - Duplicate target database to …..
                      - Starting 10gR2, from alert log “Re-creating tempfile +USR01 as
            • Alter database add logfile thread 2
                       group 3 ('+USR01/PICKY/ONLINELOG/redo3.log') size 25m,
                       group 4 ('+USR01/PICKY/ONLINELOG/redo4.log') size 25m;
            • Alter database enable public thread 2
            • Start the other instance from the other node
            • Register this database in the OCR
                      - racnode4:oracle> srvctl add database -d PICKY -o $ORACLE_HOME
                      - racnode4:oracle> srvctl config
                      - racnode4:oracle> srvctl add instance -d PICKY -i PICKY1 -n racnode3
                      - racnode4:oracle> srvctl add instance -d PICKY -i PICKY2 -n racnode4
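
The registration step follows a regular pattern: one srvctl add database call, then one srvctl add instance
call per node, with instances numbered in node order. The sketch below only prints those commands so they can
be reviewed before being run. register_cmds is a hypothetical helper, and the numbering convention is taken
from the PICKY example above.

```shell
#!/bin/sh
# Sketch: print the srvctl commands that register a cloned database and
# one instance per node. register_cmds is a hypothetical helper; it prints
# the commands and does not execute them.
# Usage: register_cmds DBNAME NODE...
register_cmds() {
    db=$1
    shift
    # $ORACLE_HOME is left unexpanded so it resolves on the node that runs it
    echo "srvctl add database -d $db -o \$ORACLE_HOME"
    n=1
    for node in "$@"; do
        echo "srvctl add instance -d $db -i ${db}${n} -n $node"
        n=$((n + 1))
    done
}

register_cmds PICKY racnode3 racnode4
```

Printing instead of executing keeps the sketch safe to run anywhere; the output can be piped to sh on a
cluster node once it has been checked.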


The number of Real Application Clusters installations is increasing rapidly, and the applications
deployed on them are critical to the business. RMAN is
the obvious choice for all backup needs in a RAC environment, as it integrates tightly at no extra
cost. Backup and recovery are important aspects of database administration, and
proficiency is expected when recovering any database, as the impact of a failed recovery is hard to
overstate. The person performing this task is expected to have clear conceptual knowledge
of all the pieces required for a successful backup and recovery. This session will discuss different real-life
scenarios of backup and recovery in a RAC environment, which are also applicable to any non-RAC database.
Each example will also highlight what would differ if the database were non-clustered. All the
examples covered in this session use RMAN. This session will help in implementing a backup and recovery
strategy with RMAN in any clustered environment.
