Session id: 40136
RAC Best Practices on Linux
Kirk McGowan, Technical Director – RAC Pack, Server Technologies, Oracle Corporation
Roland Knapp, Principal Member Technical Staff – RAC Pack, Server Technologies, Oracle Corporation
Agenda
Planning Best Practices
– Architecture
– Expectation setting
– Objectives and success criteria
– Project plan
Implementation Best Practices
– Infrastructure considerations
– Installation/configuration
– Database creation
– Application considerations
Operational Best Practices
– Backup & Recovery
– Performance Monitoring and Tuning
– Production Migration
Planning
Understand the Architecture
– Cluster terminology
– Functional basics
HA by eliminating node & Oracle as SPOFs
Scalability by making additional processing capacity
available incrementally
– Hardware components
Private interconnect/network switch
Shared storage/concurrent access/storage switch
– Software components
OS, Cluster Manager, DBMS/RAC, Application
Differences between cluster managers
RAC Hardware Architecture
[Diagram: network users and a centralized management console connect to the clustered database servers. The servers are linked by a low-latency interconnect through a high-speed switch or interconnect (e.g. VIA or proprietary), and reach a mirrored disk subsystem over a storage area network through a hub or switch fabric. No single point of failure.]
RAC Software Architecture
[Diagram: Shared Data Model. Each instance is shown with GES & GCS, a Shared Memory/Global Area, a shared log, and a SQL buffer; all instances sit on top of one Shared Disk Database.]
RAC on Linux
HW & SW Components
[Diagram: Node 1 and Node 2 (plus more nodes N3, N4 ... Nn) on the public network, each running Unbreakable Linux, ORACM, and an Oracle 9i RAC instance with its own DB cache. The cluster interconnect carries cache-to-cache traffic between instance 1 and instance 2. Shared storage holds the redo logs of instance 1, instance 2, ..., the control files, and the database files. Oracle 9i RAC = higher availability; concurrent access from every node = “scale out”.]
Linux Cluster Hardware
Cluster interconnects
– FastEthernet, Gigabit Ethernet
Public networks
– Ethernet, FastEthernet, Gigabit Ethernet
Memory, swap & CPU Recommendations
– Each server should have a minimum of 512 MB of memory,
at least 1 GB of swap space, and two CPUs.
Fiber Channel, SCSI, or NAS storage connectivity
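The memory, swap and CPU minimums above can be checked on each node before installation. The following is a minimal sketch, not an Oracle-supplied tool: the function names meets_minimum and meminfo_kb are hypothetical, and it assumes the usual Linux /proc/meminfo and /proc/cpuinfo layouts.

```shell
#!/bin/sh
# Minimums taken from the slide: 512 MB memory, 1 GB swap, two CPUs.
MIN_MEM_KB=$((512 * 1024))
MIN_SWAP_KB=$((1024 * 1024))
MIN_CPUS=2

# meets_minimum NAME ACTUAL REQUIRED (hypothetical helper):
# prints "NAME ok" or "NAME LOW" with the two numbers compared.
meets_minimum() {
    if [ "$2" -ge "$3" ]; then
        echo "$1 ok ($2 >= $3)"
    else
        echo "$1 LOW ($2 < $3)"
    fi
}

# meminfo_kb KEY [FILE] (hypothetical helper):
# reads a kB value such as MemTotal from a /proc/meminfo-style file.
meminfo_kb() {
    awk -v key="$1:" '$1 == key { print $2 }' "${2:-/proc/meminfo}"
}

# Report on the local node when the /proc files are available.
if [ -r /proc/meminfo ] && [ -r /proc/cpuinfo ]; then
    meets_minimum memory "$(meminfo_kb MemTotal)" "$MIN_MEM_KB"
    meets_minimum swap "$(meminfo_kb SwapTotal)" "$MIN_SWAP_KB"
    meets_minimum cpus "$(grep -c '^processor' /proc/cpuinfo)" "$MIN_CPUS"
fi
```

Run it on every node of the cluster; any line reporting LOW points at a server below the recommended minimum.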
Unbreakable Linux Distributions
Red Hat Enterprise Linux AS and ES
United Linux 1.0
– SuSE Linux Enterprise Server 8 (SuSE Linux AG)
– Conectiva Linux Enterprise Edition (Conectiva S.A.)
– SCO Linux Server 4.0 (The SCO Group)
– Turbolinux Enterprise Server 8 (Turbolinux)
Oracle will support Oracle products running with other
distributions but will not support the operating system.
RAC Certification for Unbreakable Linux
Certification
– Enterprise class OS distribution (e.g. RH AS, United Linux
1.0)
– Clusterware (Oracle Cluster Manager only)
– Network Attached Storage (e.g. Network Appliance filers)
– Most SCSI and SAN storage are compatible
– 32-bit Intel-based servers and 64-bit Itanium 2 servers are certified.
For more details on software certification:
http://technet.oracle.com/support/metalink/content.html
Discuss hardware configuration with your HW vendor
Linux IA64 requirements
Operating System Requirements
– Red Hat Linux Advanced Server 2.1 operating system with
kernel 2.4.18-e.14.ia64.rpm
– glibc 2.2.4-29
– Gnu gcc 2.96.0 release
– Linux Header Patch 2.4.18 (available from Intel)
– asynch libraries libaio-0.3.92-1
– Source: Oracle9i Release Notes Release 2 (9.2.0.2.0) for Linux Intel on Itanium (64-bit), Part No. B10567-02
Set Expectations Appropriately
If your application will scale transparently on
SMP, then it is realistic to expect it to scale well on
RAC, without having to make any changes to the
application code.
RAC eliminates the database instance, and the
node itself, as a single point of failure, and ensures
database integrity in the case of such failures
Planning: Define Objectives
Objectives need to be quantified/measurable
– HA objectives
Planned vs unplanned
Technology failures vs site failures vs human errors
– Scalability Objectives
Speedup vs scaleup
Response time, throughput, other measurements
– Server/Consolidation Objectives
Often tied to TCO
Often subjective
Build your Project Plan
Partner with your vendors
– Multiple stakeholders, shared success
Build detailed test plans
– Confirm application scalability on SMP before going to RAC;
optimize first for single instance
Address knowledge gaps and training
– Clusters, RAC, HA, Scalability, systems management
– Leverage external resources as required
Establish strict System and Application Change control
– Apply changes to one system element at a time
– Apply changes first to the test environment
– Monitor impact of application changes on underlying system
components
Define Support mechanisms and escalation procedures
Infrastructure Considerations
Architecture/Design
– Eliminate SPOFs (Single Points of Failure)
– Workload Distribution (load balancing) strategy
– Systems management framework for monitoring and managing to
SLAs
Hardware/Software
– Processing nodes – sufficient CPU to accommodate failure
– Scalable I/O Subsystem
Use S.A.M.E.
– Private Interconnect
GigE, UDP, switched
– Patch levels and certification
Implementation Flowchart
– Configure HW
– Configure private interconnect
– Install Unbreakable Linux
– Configure storage and install OCFS
– Install cluster manager 9.2.0.1
– Install Oracle 9.2.0.1
– Install 9.2.0.3 cluster manager
– Install Oracle 9.2.0.3
– Create database
Installation Flowchart for Red Hat Linux AS 2.1
– Boot
– Choose Language
– Select Keyboard & Mouse
– Choose Advanced Server Option
– Use DRUID for Partition Setup
– Select Boot Loader
– Configure Network
– Configure Timezone
– Account Configuration
– Select Graphic Mode
– Boot Floppy Creation
– Installation Complete / Reboot
Install tips for Red Hat Linux AS 2.1
As documented in:
– “Tips and Techniques: Install and Configure Oracle9i on
Red Hat Linux Advanced Server” by Deepak Patel, Oracle
http://otn.oracle.com/tech/linux/pdf/installtips_final.pdf
Boot options
– Always use the Advanced Server install and add required
packages as needed. CDs 1 to 3 hold all the rpm packages, CDs 3
and 4 hold the source packages, and CD 5 holds the docs.
Memory
– The smp or enterprise kernel is installed based on the
physical memory in the machine (4 GB: enterprise kernel).
Post Installation
– Add users, configure network and other administrative
tasks after installation.
Install tips for United Linux 1.0
You must install the latest UnitedLinux kernel update!
Oracle was certified against an update kernel, the
original UL-1.0 kernel is NOT certified!
After installing United Linux 1.0, install Service Pack
2a from:
ftp://suse.us.oracle.com/pub/suse/i386/unitedlinux-1.0-iso/
You will also need to have the basic development
tools installed, like make, gcc_old (2.95.3), and the
binutils package.
Full installation instructions:
ftp://ftp.suse.com/pub/suse/i386/supplementary/commercial/Oracle/docs/920_sles8_install.pdf
Install tips for United Linux 1.0
Install the orarun.rpm package from either the SP2 CD
– /UnitedLinux/i586/orarun-1.8-18.i586.rpm
or from
– ftp://ftp.suse.com/pub/suse/i386/supplementary/commercial/Oracle/sles-8/orarun.rpm
orarun.rpm:
– Updates kernel parameters (e.g. shmmax, shmmin)
– Applies UDP settings (256K)
– Installs and configures hangcheck-timer
Prepare Linux Environment
Follow these steps on EACH node of the
cluster
– Set Kernel parameters in /etc/sysctl.conf
– Add hostnames to /etc/hosts file
– Establish file system or location for ORACLE_HOME
(writable for oracle userid)
– Setup host equivalence for oracle userid (.rhosts)
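Because these steps are repeated on every node, a missing /etc/hosts entry is easy to overlook. The sketch below lists node names that have no entry in a hosts file. The function name missing_hosts is hypothetical, and the node names are borrowed from the cmcfg.ora example later in this deck purely for illustration.

```shell
#!/bin/sh
# missing_hosts FILE NAME... (hypothetical helper):
# prints each NAME that has no entry in the hosts-format FILE.
missing_hosts() {
    hosts_file=$1
    shift
    for name in "$@"; do
        # Strip comments, then look for the name as a whole field
        # after the IP address column.
        if ! awk -v n="$name" '
            { sub(/#.*/, "") }
            { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
            END { exit found ? 0 : 1 }' "$hosts_file"; then
            echo "$name"
        fi
    done
}

# Example: public and private names of a two-node cluster (illustrative names).
if [ -r /etc/hosts ]; then
    missing_hosts /etc/hosts node1 node2 int-node1 int-node2
fi
```

No output means every listed name resolves through the hosts file on that node.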
Installation Flowchart for OCFS
– Download the latest OCFS rpm’s from www.ocfs.org
– Install the rpm’s on all nodes
– Run ocfstool as root (configures /etc/ocfs.conf) on all nodes
– Run load_ocfs (insmod will load ocfs.o) on all nodes
– Create partition on the primary node
– Run ocfstool to format and mount your new filesystem
– Mount the new filesystem on all nodes
– Edit rc.local or equivalent: add load_ocfs and ‘mount –t ocfs
select * from gv$instance
RAC communicating over the private Interconnect
SQL> oradebug setmypid
SQL> oradebug ipc
SQL> oradebug tracefile_name
/home/oracle/admin/RAC92_1/udump/rac92_1_ora_1343841.trc
– Check trace file in the user_dump_dest:
SSKGXPT 0x2ab25bc flags info for network 0
socket no 10 IP 204.152.65.33 UDP 49197
sflags SSKGXPT_UP
info for network 1
socket no 0 IP 0.0.0.0 UDP 0
sflags SSKGXPT_DOWN
RAC is using desired IPC protocol: Check Alert.log
...
cluster interconnect IPC version:Oracle UDP/IP
IPC Vendor 1 proto 2 Version 1.0
PMON started with pid=2
...
Use cluster_interconnects only if necessary
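The trace file check above can be scripted. This is a minimal sketch that assumes the trace section has exactly the layout shown on the slide ("socket no ... IP ... UDP ..." followed by an "sflags" line); the function name interconnect_ips is hypothetical.

```shell
#!/bin/sh
# interconnect_ips FILE (hypothetical helper):
# prints "IP protocol port" for each network whose flags are SSKGXPT_UP.
interconnect_ips() {
    awk '
        /socket no/              { ip = $5; proto = $6; port = $7 }
        /sflags/ && /SSKGXPT_UP/ { print ip, proto, port }
    ' "$1"
}

# Sample input copied from the trace excerpt on the slide.
trace=$(mktemp)
cat > "$trace" <<'EOF'
SSKGXPT 0x2ab25bc flags info for network 0
socket no 10 IP 204.152.65.33 UDP 49197
sflags SSKGXPT_UP
info for network 1
socket no 0 IP 0.0.0.0 UDP 0
sflags SSKGXPT_DOWN
EOF
interconnect_ips "$trace"
rm -f "$trace"
```

Compare the printed address with the private interconnect address of the node; if it shows the public address instead, the instance is not using the intended network.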
Configure srvconfig / srvctl
SRVCTL uses information from srvconfig
– Reads $ORACLE_HOME/srvm/config/srvConfig.loc
information
File can be a RAW Device or OCFS file
srvconfig -init
gsd must be running
Add ORACLE_HOME
– $ srvctl add database -d db_name -o oracle_home [-m domain_name] [-s spfile]
Add instances (for each instance enter the command)
– $ srvctl add instance -d db_name -i sid -n node
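Since the add instance command is entered once per instance, it can help to generate the full command list and review it before running anything. The sketch below only prints the commands, using the syntax shown above. The database name, Oracle home path, SIDs and node names are hypothetical placeholders, and srvctl_plan is an illustrative helper, not part of srvctl.

```shell
#!/bin/sh
# Hypothetical values for illustration only.
DB_NAME=RAC92
ORACLE_HOME_DIR=/u01/app/oracle/product/9.2.0

# srvctl_plan DB HOME "SID:NODE"... (hypothetical helper):
# prints the registration commands without executing them.
srvctl_plan() {
    db=$1
    home=$2
    shift 2
    echo "srvctl add database -d $db -o $home"
    for pair in "$@"; do
        sid=${pair%%:*}    # text before the colon
        node=${pair#*:}    # text after the colon
        echo "srvctl add instance -d $db -i $sid -n $node"
    done
}

srvctl_plan "$DB_NAME" "$ORACLE_HOME_DIR" RAC92_1:node1 RAC92_2:node2
```

Once the printed list looks right, the same lines can be run as the oracle user with gsd running.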
Application Deployment
Same guidelines as single instance
– SQL Tuning
– Sequence Caching
– Partition large objects
– Use different block sizes
– Tune instance recovery
– Avoid DDL
– Use LMTs and ASSM as noted earlier
Operations
Same DBA procedures as single instance, with some
minor, mostly mechanical differences.
Managing the Oracle environment
– Starting/stopping cluster services (ocmstart.sh)
– Starting/stopping gsd
– Managing multiple redo log threads
Startup and shutdown of the database
– Use srvctl
Backup and recovery
Performance Monitoring and Tuning
Production migration
Operations: srvconfig / srvctl
Use SRVCTL to administer your RAC database
environment.
– OEM and the Oracle Intelligent Agent use the configuration
information that SRVCTL generates to discover and
monitor nodes in your cluster.
Global Services Daemon (GSD) receives requests
from SRVCTL to execute administrative job tasks,
such as startup or shutdown.
– GSD must be started on all the nodes in your RAC
environment so that the manageability features and tools
operate properly. (GSDCTL)
Operations: Backup & Recovery
RMAN is the most efficient option for Backup &
Recovery
– Managing the snapshot control file location.
– Managing the control file autobackup feature.
– Managing archived logs in RAC – choose proper archiving
scheme.
– Node Affinity Awareness
Oracle Net restrictions apply to RMAN in RAC
– You cannot specify a net service name that uses Oracle
Net features to distribute RMAN connections to more than
one instance.
Oracle Enterprise Manager
– GUI interface to Recovery Manager
Performance Monitoring and Tuning
Tune first for single instance 9i
Use Statspack:
– Separate 1 GB tablespace for Statspack
– snapshots at 10-20 min intervals during stress testing, hourly during
normal operations
– Run on all instances, staggered
Supplement with scripts/tracing
– Monitor V$SESSION_WAIT to see which blocks are involved in
wait events
– Trace events such as 10046 level 8 can provide additional wait event details
– Monitor Alert logs and trace files, as on single instance
Oracle Performance Manager
RAC-specific views
Supplement with System-level monitoring
– CPU utilization never 100%
– I/O service times never > acceptable thresholds
– CPU run queues at optimal levels
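One way to read "run on all instances, staggered" is to spread the snapshot start times evenly across the snapshot interval, so that no two instances take a snapshot at the same moment. That even spacing is an assumption of this sketch, not something the slide specifies, and stagger_minutes is a hypothetical helper.

```shell
#!/bin/sh
# stagger_minutes INSTANCES INTERVAL_MIN (hypothetical helper):
# prints one start offset, in minutes, per instance, spaced evenly
# across the snapshot interval.
stagger_minutes() {
    n=$1
    interval=$2
    i=0
    while [ "$i" -lt "$n" ]; do
        echo $(( i * interval / n ))
        i=$(( i + 1 ))
    done
}

# Four instances with hourly snapshots during normal operations.
stagger_minutes 4 60
```

The offsets can then be used as the minute field of whatever scheduler triggers the Statspack snapshot on each instance.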
Performance Monitoring and Tuning
An obvious application deficiency on a single node can't be solved
by adding nodes.
– Single points of contention.
– Not scalable on SMP
– I/O bound on single instance DB
Tune on a single-instance DB first to ensure the application is
scalable
– Identify/tune contention using v$segment_statistics to identify
objects involved
– Concentrate on the top 5 Statspack timed events if majority of
time is spent waiting
– Concentrate on bad SQL if CPU bound
Maintain a balanced load on underlying systems (DB, OS,
storage subsystem, etc. )
– Excessive load on individual components can invoke aberrant
behaviour.
Performance Monitoring and Tuning
Deciding if RAC is the performance bottleneck
– Amount of Cross Instance Traffic
Type of requests
Type of blocks
– Latency
Block receive time
buffer size factor
bandwidth factor
Production Migration
Adhere to strong Systems Life Cycle Disciplines
– Comprehensive test plans (functional and stress)
– Rehearsed production migration plan
– Change Control
Separate environments for Dev, Test, QA/UAT,
Production
System AND application change control
Log changes to spfile
– Backup and recovery procedures
– Security controls
– Support Procedures
Reminder –
please complete the OracleWorld
online session survey
Thank you.
Resources
RedHat Linux
– http://www.redhat.com/oracle/
Linux Center - Technical White Papers & Documentation
– http://otn.oracle.com/tech/linux/tech_wp.html
“Tips and Techniques: Install and Configure Oracle9i on Red
Hat Linux Advanced Server” by Deepak Patel, Oracle
Corporation
• http://otn.oracle.com/tech/linux/pdf/installtips_final.pdf
“Tips and Techniques: Install and Configure Oracle9i on
SLES8 / United Linux 1.0”
• http://www.suse.com/en/business/certifications/certified_software/oracle/db/9iR2_sles8.html
United Linux 1.0 Resources
United Linux
– http://www.unitedlinux.com
SuSE
– http://www.suse.com/us/business/products/server/sles/index.html
Conectiva
– http://www.connectiva.com
SCO Group (formerly Caldera Systems)
– http://www.ebizenterprises.com/page1.asp?p=463
TurboLinux
– http://www.turbolinux.com/
Recommended one-off patches
Bug 2820871 - ORA-29740 NODE EVICTION DESIGN ALGORITHM AND ABRUPT TIME CHANGE
– ARU: 9.2.0.3 ARU 4161735 completed for LINUX Intel
Bug 2420930 - GET ORA-600 [KSXPMPRP1] DURING STARTUP IN RAC MODE WITH LARGER BUFFERS
– This fix was included in 9.2.0.2 but is missing from 9.2.0.3. Bug 2875050 was opened for this issue.
– ARU: 9.2.0.3 ARU 4202164 completed for LINUX Intel
Recommended one-off patches
Bug 2844009 - MISSING LIBCXA.SO.3 LIBRARY ISSUE IN PSR 9203
– ARU: 9.2.0.3 ARU 4046387 completed for LINUX Intel
Bug 2779294 - node_list is not populated in oraInventory/ContentsXML/inventory.xml, so an opatch install applies only to the local node.
– Workaround: edit inventory.xml as documented in bug 2742686.
Bugs 2646914, 2675090, 2706220 and 2695783 - ORA-600 [KCCSBCK_FIRST], [2] on Linux and W2K platforms after installing 9.2.0.2
– Very important patch, missing from 9.2.0.3
Hangcheck-timer and Oracle Cluster Manager
Download Patch 2594820 from Metalink
– #rpm -ivh
watchdogd is detached from the Cluster Manager (Bug 2495915)
With watchdogd removed, these parameters are removed from ORACLE_HOME/oracm/admin/cmcfg.ora:
– WatchdogTimerMargin
– WatchdogSafetyMargin
KernelModuleName=hangcheck-timer is added
CMDiskFile changes from optional to mandatory
– CM quorum partition for cluster participation.
Hangcheck-timer and Oracle Cluster Manager
Remove or comment out this line in the /etc/rc.local file:
/sbin/insmod softdog nowayout=0 soft_noboot=1 soft_margin=60
Add this line to rc.local, and execute it as root to load the module:
/sbin/insmod hangcheck-timer.o hangcheck_tick=30 hangcheck_margin=180
Hangcheck-timer and Oracle Cluster Manager
Inclusion of the hangcheck-timer kernel module
Parameter          Service           Value
-----------------  ----------------  -----------------------------------
hangcheck_tick     hangcheck-timer   30 seconds
hangcheck_margin   hangcheck-timer   180 seconds
KernelModuleName   oracm             hangcheck-timer
MissCount          oracm             > hangcheck_tick + hangcheck_margin
Hangcheck-timer and Oracle
Cluster Manager
cmcfg.ora example
- HeartBeat=15000
- ClusterName=Oracle Cluster Manager, version 9i
- KernelModuleName=hangcheck-timer
- PollInterval=1000
- MissCount=215
- PrivateNodeNames=int-node1 int-node2
- PublicNodeNames=node1 node2
- ServicePort=9998
- CmDiskFile=/ocfsdisk1/quorum/quorumfile
- HostName=int-node1
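The MissCount value in this example (215) sits just above the sum of the hangcheck-timer settings used in this deck (hangcheck_tick=30 plus hangcheck_margin=180, so 210). A small check like the one below can catch a cmcfg.ora where that relationship no longer holds. It is a sketch under stated assumptions: the helper names cm_param and misscount_ok are hypothetical, and the file is assumed to contain plain NAME=value lines.

```shell
#!/bin/sh
# cm_param FILE NAME (hypothetical helper):
# prints the value of a NAME=value line from a cmcfg.ora-style file.
cm_param() {
    awk -F= -v name="$2" '$1 == name { print $2 }' "$1"
}

# misscount_ok MISSCOUNT TICK MARGIN (hypothetical helper):
# succeeds when MissCount > hangcheck_tick + hangcheck_margin.
misscount_ok() {
    [ "$1" -gt $(( $2 + $3 )) ]
}

# Sample file using values from the cmcfg.ora example on the slide.
cfg=$(mktemp)
cat > "$cfg" <<'EOF'
HeartBeat=15000
KernelModuleName=hangcheck-timer
PollInterval=1000
MissCount=215
EOF

miss=$(cm_param "$cfg" MissCount)
if misscount_ok "$miss" 30 180; then
    echo "MissCount=$miss exceeds hangcheck_tick + hangcheck_margin (210)"
else
    echo "MissCount=$miss is too low; it must exceed 210"
fi
rm -f "$cfg"
```

If the hangcheck-timer module is loaded with different tick or margin values, pass those values instead of 30 and 180.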
Hangcheck-timer and Oracle
Cluster Manager
Parameters for ocmargs.ora
- oracm
- norestart 1800
Linux Monitoring and Configuration Tools
Area                       Tools
-------------------------  ----------------------------------------
Overall tools              sar, vmstat
CPU                        /proc/cpuinfo, mpstat, top
Memory                     /proc/meminfo, /proc/slabinfo, free
Disk I/O                   iostat
Network                    /proc/net/dev, netstat, mii-tool
Kernel Version and Rel.    cat /proc/version
Types of I/O Cards         lspci -vv
Kernel Modules Loaded      lsmod, cat /proc/modules
List all PCI devices (HW)  lspci -v
Startup changes            /etc/sysctl.conf, /etc/rc.local
Kernel messages            /var/log/messages, /var/log/dmesg
OS error codes             /usr/src/linux/include/asm/errno.h
OS calls                   /usr/sbin/strace -p
Post Installation
Increasing Address Space
By default, Oracle has 1.7 GB of address space for its SGA.
Shutdown all instances of Oracle
cd $ORACLE_HOME/lib
cp -a libserver9.a libserver9.a.org
– to make a backup copy
cd $ORACLE_HOME/rdbms/lib
genksms -s 0x15000000 >ksms.s
– lower SGA base to 0x15000000
make -f ins_rdbms.mk ksms.o
– compile in new SGA base address
make -f ins_rdbms.mk ioracle (relink)
Post Installation
Increasing Address Space Cont.
sysctl -w kernel.shmmax=3000000000
Lower process base
– Find the pid of the shell from which Oracle will be
started, using ps (or echo $$ in that shell as the oracle user)
– Change /proc/$pid/mapped_base to 0x10000000
and restart Oracle
Metalink Note: 200266.1
Post Installation
Lowering of mapped base: address space layout
Region                                 Default                  After relink
-------------------------------------  -----------------------  -----------------------
Reserved for kernel                    0xC0000000 - 0xFFFFFFFF  0xC0000000 - 0xFFFFFFFF
Variable SGA, DB Buffers (SGA)         below 0xC0000000         below 0xC0000000
sga_base (relink Oracle)               0x50000000               0x15000000
mapped_base (/proc/<pid>/mapped_base)  0x40000000               0x10000000
Code, etc.                             from 0x00000000          from 0x00000000
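The effect of lowering sga_base can be estimated with simple address arithmetic: the SGA can occupy at most the range between sga_base and the start of the kernel region at 0xC0000000. The sketch below computes that upper bound for the two sga_base values in this deck. The helper sga_space_mb is hypothetical, and the real usable size is somewhat smaller because other mappings also live in that range.

```shell
#!/bin/sh
# Start of the kernel-reserved region, as shown in the layout above.
KERNEL_BASE=0xC0000000

# sga_space_mb SGA_BASE (hypothetical helper):
# MB of address space between SGA_BASE and the kernel region.
sga_space_mb() {
    echo $(( (KERNEL_BASE - $1) / 1024 / 1024 ))
}

echo "default  sga_base 0x50000000: $(sga_space_mb 0x50000000) MB"
echo "relinked sga_base 0x15000000: $(sga_space_mb 0x15000000) MB"

# The same mapped_base address expressed in decimal.
echo "mapped_base 0x10000000 = $(( 0x10000000 )) bytes"
```

The default case comes out at 1792 MB, which lines up with the roughly 1.7 GB quoted for the default SGA address space; the relinked case comes out at 2736 MB.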
Post Installation
Larger Buffer Cache: increases the buffer cache beyond what a
larger SGA provides
Create an in-memory file system on the /dev/shm
mount -t shm shmfs -o size=8g /dev/shm
To enable the extended buffer cache feature, set the init.ora
parameter
USE_INDIRECT_DATA_BUFFERS = true
Don't use the dynamic cache parameters
DB_CACHE_SIZE
DB_#K_CACHE_SIZE
Limitations apply to the extended buffer cache feature on Linux:
You cannot change the size of the buffer cache while the instance is running.
Post Installation
Adjust send / receive buffer size to 256K
Tuning the default and maximum window sizes:
- /proc/sys/net/core/rmem_default - default receive window
- /proc/sys/net/core/rmem_max - maximum receive window
- /proc/sys/net/core/wmem_default - default send window
- /proc/sys/net/core/wmem_max - maximum send window
- sysctl -w net.core.rmem_max=262144
- sysctl -w net.core.wmem_max=262144
- sysctl -w net.core.rmem_default=262144
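The sysctl -w commands above take effect immediately but do not survive a reboot. To make the 256K setting persistent, the same values can be placed in /etc/sysctl.conf. The sketch below prints the lines to add. It covers all four window parameters; including net.core.wmem_default is an assumption based on the /proc path listed above, since the slide shows no sysctl command for it. The helper buffer_settings is hypothetical.

```shell
#!/bin/sh
# 256K expressed in bytes.
BUF_BYTES=$(( 256 * 1024 ))

# buffer_settings BYTES (hypothetical helper):
# prints /etc/sysctl.conf lines for the four window parameters.
buffer_settings() {
    for key in rmem_default rmem_max wmem_default wmem_max; do
        echo "net.core.$key = $1"
    done
}

buffer_settings "$BUF_BYTES"
```

Review the output, append it to /etc/sysctl.conf as root on each node, and load it with sysctl -p.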
Post Installation
To enable asynchronous I/O, Oracle must be relinked
to use skgaioi.o
– cd to $ORACLE_HOME/rdbms/lib
– make -f ins_rdbms.mk async_on
– make -f ins_rdbms.mk ioracle
– set 'disk_asynch_io=true' (default value is true)
– set 'filesystemio_options=asynch' (RAW Only)