
40136_McGowan

Session id: 40136









RAC Best Practices on Linux



Kirk McGowan
Technical Director, RAC Pack
Server Technologies
Oracle Corporation

Roland Knapp
Principal Member Technical Staff, RAC Pack
Server Technologies
Oracle Corporation

Agenda

 Planning Best Practices

– Architecture

– Expectation setting

– Objectives and success criteria

– Project plan

 Implementation Best Practices

– Infrastructure considerations

– Installation/configuration

– Database creation

– Application considerations

 Operational Best Practices

– Backup & Recovery

– Performance Monitoring and Tuning

– Production Migration

Planning



 Understand the Architecture

– Cluster terminology

– Functional basics

 HA by eliminating node & Oracle as SPOFs

 Scalability by making additional processing capacity

available incrementally

– Hardware components

 Private interconnect/network switch

 Shared storage/concurrent access/storage switch

– Software components

 OS, Cluster Manager, DBMS/RAC, Application

 Differences between cluster managers

RAC Hardware Architecture



[Figure: RAC hardware architecture. Network users and a centralized management console connect to the clustered database servers. The servers are linked by a high-speed switch or low-latency interconnect (i.e. VIA or proprietary) and attach through a hub or switch fabric (storage area network) to a mirrored disk subsystem. No single point of failure.]

RAC Software Architecture

Shared Data Model

[Figure: four instances, each with its own GES & GCS and its own shared memory/global area holding shared SQL and a log buffer, all accessing one shared disk database.]

RAC on Linux

HW & SW Components

[Figure: two nodes (Node1a, Node2a) on the public network, each running an Oracle 9i RAC instance with its own DB cache on top of ORACM and Unbreakable Linux. The instances are joined by the cluster interconnect, used for cache-to-cache transfers. More nodes (N3, N4 ... Nn) = higher availability. Shared storage holds the redo logs of instance 1, instance 2, ..., the control files and the database files; concurrent access from every node = “scale out”.]

Linux Cluster Hardware



 Cluster interconnects

– Fast Ethernet, Gigabit Ethernet

 Public networks

– Ethernet, Fast Ethernet, Gigabit Ethernet

 Memory, swap & CPU recommendations

– Each server should have a minimum of 512 MB of memory, at least 1 GB of swap space, and two CPUs.

 Fibre Channel, SCSI, or NAS storage connectivity

Unbreakable Linux Distributions



 Red Hat Enterprise Linux AS and ES

 United Linux 1.0

– SuSE Linux Enterprise Server 8 (SuSE Linux AG)

– Conectiva Linux Enterprise Edition (Conectiva S.A.)

– SCO Linux Server 4.0 (The SCO Group)

– Turbolinux Enterprise Server 8 (Turbolinux)

 Oracle will support Oracle products running with other

distributions but will not support the operating system.

RAC Certification for Unbreakable Linux

 Certification

– Enterprise class OS distribution (e.g. RH AS, United Linux

1.0)

– Clusterware (Oracle Cluster Manager only)

– Network Attached Storage (e.g. Network Appliance filers)

– Most SCSI and SAN storage are compatible

– 32-bit Intel-based servers and 64-bit Itanium 2 servers are certified.



 For more details on software certification:

http://technet.oracle.com/support/metalink/content.html

 Discuss hardware configuration with your HW vendor

Linux IA64 requirements



 Operating System Requirements

– Red Hat Linux Advanced Server 2.1 operating system with

kernel 2.4.18-e.14.ia64.rpm

– glibc 2.2.4-29

– Gnu gcc 2.96.0 release

– Linux Header Patch 2.4.18 (available from Intel)

– asynch libraries libaio-0.3.92-1

– Source: Oracle9i Release Notes, Release 2 (9.2.0.2.0) for Linux Intel on Itanium (64-bit), Part No. B10567-02

Set Expectations Appropriately



 If your application will scale transparently on

SMP, then it is realistic to expect it to scale well on

RAC, without having to make any changes to the

application code.



 RAC eliminates the database instance, and the

node itself, as a single point of failure, and ensures

database integrity in the case of such failures

Planning: Define Objectives



 Objectives need to be quantified/measurable

– HA objectives

 Planned vs unplanned

 Technology failures vs site failures vs human errors

– Scalability Objectives

 Speedup vs scaleup

 Response time, throughput, other measurements

– Server/Consolidation Objectives

 Often tied to TCO

 Often subjective

Build your Project Plan

 Partner with your vendors

– Multiple stakeholders, shared success

 Build detailed test plans

– Confirm application scalability on SMP before going to RAC

 optimize first for single instance

 Address knowledge gaps and training

– Clusters, RAC, HA, Scalability, systems management

– Leverage external resources as required

 Establish strict System and Application Change control

– Apply changes to one system element at a time

– Apply changes first to the test environment

– Monitor impact of application changes on underlying system

components

 Define Support mechanisms and escalation procedures


Infrastructure Considerations



 Architecture/Design

– Eliminate SPOFs (Single Points of Failure)

– Workload Distribution (load balancing) strategy

– Systems management framework for monitoring and managing to

SLAs

 Hardware/Software

– Processing nodes – sufficient CPU to accommodate failure

– Scalable I/O Subsystem

 Use S.A.M.E.

– Private Interconnect

 GigE, UDP, switched

– Patch levels and certification

Implementation Flowchart

1. Configure HW
2. Configure private interconnect
3. Install Unbreakable Linux
4. Configure storage and install OCFS
5. Install cluster manager 9.2.0.1
6. Install Oracle 9.2.0.1
7. Install 9.2.0.3 cluster manager
8. Install Oracle 9.2.0.3
9. Create database

Installation Flowchart for Red Hat Linux AS 2.1

1. Boot
2. Choose Language
3. Select Keyboard & Mouse
4. Choose Advanced Server Option
5. Use DRUID for Partition Setup
6. Select Boot Loader
7. Configure Network
8. Configure Timezone
9. Account Configuration
10. Select Graphic Mode
11. Boot Floppy Creation
12. Installation Complete / Reboot

Install tips for Red Hat Linux AS 2.1

 As documented in:

– “Tips and Techniques: Install and Configure Oracle9i on

Red Hat Linux Advanced Server” by Deepak Patel, Oracle

http://otn.oracle.com/tech/linux/pdf/installtips_final.pdf

 Boot options

– Always use the Advanced Server install. Install required packages as needed. CDs 1 to 3 contain all rpm packages, CDs 3 and 4 contain source packages, and CD 5 includes docs.

 Memory

– Depending on the physical memory in the machine, either the smp or the enterprise kernel is installed. ( 4 GB enterprise kernel )

 Post Installation

– Add users, configure network and other administrative

tasks after installation.

Install tips for United Linux 1.0



 You must install the latest UnitedLinux kernel update!

Oracle was certified against an update kernel, the

original UL-1.0 kernel is NOT certified!

 After installing United Linux 1.0, install Service Pack

2a from:

ftp://suse.us.oracle.com/pub/suse/i386/unitedlinux-1.0-iso/

 You will also need to have the basic development tools installed, such as make, gcc_old (2.95.3), and the binutils package.

 Full installation instructions:

ftp://ftp.suse.com/pub/suse/i386/supplementary/commercial/Oracle/docs/920_sles8_install.pdf

Install tips for United Linux 1.0





 Install the orarun.rpm package from either the SP2 CD

– /UnitedLinux/i586/orarun-1.8-18.i586.rpm

or from

– ftp://ftp.suse.com/pub/suse/i386/supplementary/commercial/Oracle/sles-8/orarun.rpm

 orarun.rpm

 Updates kernel parameters (e.g. shmmax, shmmin)

 Sets UDP settings (256K)

 Installs and configures hangcheck-timer

Prepare Linux Environment



 Follow these steps on EACH node of the

cluster



– Set Kernel parameters in /etc/sysctl.conf



– Add hostnames to /etc/hosts file



– Establish file system or location for ORACLE_HOME

(writable for oracle userid)



– Setup host equivalence for oracle userid (.rhosts)

Installation Flowchart for OCFS

1. Download the latest OCFS rpm’s from www.ocfs.org
2. Install the rpm’s on all nodes
3. Run ocfstool as root on all nodes (configures /etc/ocfs.conf)
4. Run load_ocfs on all nodes (insmod will load ocfs.o)
5. Create partition on the primary node
6. Run ocfstool to format and mount your new filesystem
7. Mount the new filesystem on all nodes
8. Edit rc.local or equivalent: add load_ocfs and ‘mount -t ocfs

select * from gv$instance

 RAC communicating over the private Interconnect

SQL> oradebug setmypid

SQL> oradebug ipc

SQL> oradebug tracefile_name

/home/oracle/admin/RAC92_1/udump/rac92_1_ora_1343841.trc

– Check trace file in the user_dump_dest:

SSKGXPT 0x2ab25bc flags info for network 0

socket no 10 IP 204.152.65.33 UDP 49197

sflags SSKGXPT_UP

info for network 1

socket no 0 IP 0.0.0.0 UDP 0

sflags SSKGXPT_DOWN

 RAC is using desired IPC protocol: Check Alert.log

...

cluster interconnect IPC version:Oracle UDP/IP

IPC Vendor 1 proto 2 Version 1.0

PMON started with pid=2

...

 Use cluster_interconnects only if necessary
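To check which address the instance reports for the interconnect without reading the trace by eye, the socket lines can be pulled out with a short script. The following is a minimal Python sketch. The function name parse_ipc_trace and the regular expressions are illustrative assumptions derived only from the trace excerpt on this slide, not from a documented trace format.

```python
import re

# Sample text copied from the trace excerpt on this slide.
SAMPLE_TRACE = """\
SSKGXPT 0x2ab25bc flags info for network 0
socket no 10 IP 204.152.65.33 UDP 49197
sflags SSKGXPT_UP
info for network 1
socket no 0 IP 0.0.0.0 UDP 0
sflags SSKGXPT_DOWN
"""

# Assumed line shapes, based on the excerpt above only.
SOCKET_RE = re.compile(r"socket no (\d+)\s+IP (\d+\.\d+\.\d+\.\d+)\s+UDP (\d+)")
FLAGS_RE = re.compile(r"sflags (SSKGXPT_\w+)")


def parse_ipc_trace(text):
    """Return one dict per 'socket no' line, with the sflags value that follows it."""
    networks = []
    for line in text.splitlines():
        sock = SOCKET_RE.search(line)
        if sock:
            networks.append(
                {"ip": sock.group(2), "udp_port": int(sock.group(3)), "state": None}
            )
            continue
        flags = FLAGS_RE.search(line)
        if flags and networks:
            # Attach the state to the most recent socket line.
            networks[-1]["state"] = flags.group(1)
    return networks


if __name__ == "__main__":
    for net in parse_ipc_trace(SAMPLE_TRACE):
        print(net["ip"], net["udp_port"], net["state"])
```

If the address reported with SSKGXPT_UP belongs to the public network rather than the private one, that suggests the interconnect is not being used as intended.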

Configure srvconfig / srvctl

 SRVCTL uses information from srvconfig

– Reads $ORACLE_HOME/srvm/config/srvConfig.loc information

 File can be a RAW device or OCFS file

 srvconfig -init

 gsd must be running

 Add ORACLE_HOME

– $ srvctl add database -d db_name -o oracle_home [-m domain_name] [-s spfile]

 Add instances (enter the command once for each instance)

– $ srvctl add instance -d db_name -i sid -n node
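Because the add instance command has to be repeated for every instance, it can help to generate the command lines from a list. The following is a minimal Python sketch that only builds the strings and executes nothing. The function name srvctl_add_commands and the example values (database RAC92, the Oracle home path, the node names) are hypothetical.

```python
def srvctl_add_commands(db_name, oracle_home, instances, domain_name=None, spfile=None):
    """Build the srvctl command lines for one database and its instances.

    instances is a list of (sid, node) pairs. Nothing is executed here;
    the caller decides how to run or review the commands.
    """
    cmd = ["srvctl", "add", "database", "-d", db_name, "-o", oracle_home]
    if domain_name:
        cmd += ["-m", domain_name]  # optional, as in the syntax above
    if spfile:
        cmd += ["-s", spfile]       # optional, as in the syntax above
    commands = [" ".join(cmd)]
    for sid, node in instances:
        commands.append(f"srvctl add instance -d {db_name} -i {sid} -n {node}")
    return commands


if __name__ == "__main__":
    # Hypothetical two-node example.
    for line in srvctl_add_commands(
        "RAC92",
        "/u01/app/oracle/product/9.2.0",
        [("RAC92_1", "node1"), ("RAC92_2", "node2")],
    ):
        print(line)
```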

Application Deployment





 Same guidelines as single instance

– SQL Tuning

– Sequence Caching

– Partition large objects

– Use different block sizes

– Tune instance recovery

– Avoid DDL

– Use LMTs and ASSM as noted earlier


Operations



 Same DBA procedures as single instance, with some

minor, mostly mechanical differences.

 Managing the Oracle environment

– Starting/stopping cluster services (ocmstart.sh)

– Starting/stopping gsd

– Managing multiple redo log threads

 Startup and shutdown of the database

– Use srvctl

 Backup and recovery

 Performance Monitoring and Tuning

 Production migration

Operations: srvconfig / srvctl

 Use SRVCTL to administer your RAC database

environment.

– OEM and the Oracle Intelligent Agent use the configuration

information that SRVCTL generates to discover and

monitor nodes in your cluster.

 Global Services Daemon (GSD) receives requests

from SRVCTL to execute administrative job tasks,

such as startup or shutdown.

– GSD must be started on all the nodes in your RAC

environment so that the manageability features and tools

operate properly. (GSDCTL)

Operations: Backup & Recovery

 RMAN is the most efficient option for Backup &

Recovery

– Managing the snapshot control file location.

– Managing the control file autobackup feature.

– Managing archived logs in RAC – choose proper archiving

scheme.

– Node Affinity Awareness



 Oracle Net restrictions for RMAN in RAC apply

– You cannot specify a net service name that uses Oracle Net features to distribute RMAN connections to more than one instance.



 Oracle Enterprise Manager

– GUI interface to Recovery Manager

Performance Monitoring and Tuning

 Tune first for single instance 9i

 Use Statspack:

– Separate 1 GB tablespace for Statspack

– snapshots at 10-20 min intervals during stress testing, hourly during

normal operations

– Run on all instances, staggered

 Supplement with scripts/tracing

– Monitor V$SESSION_WAIT to see which blocks are involved in

wait events

– Trace events such as 10046 at level 8 can provide additional wait event details

– Monitor Alert logs and trace files, as on single instance

 Oracle Performance Manager

 RAC-specific views

 Supplement with System-level monitoring

– CPU utilization never 100%

– I/O service times never > acceptable thresholds

– CPU run queues at optimal levels

Performance Monitoring and Tuning

 An obvious application deficiency on a single node can't be solved by multiple nodes.

– Single points of contention.

– Not scalable on SMP

– I/O bound on single instance DB

 Tune on a single-instance DB first to ensure the application is scalable

– Identify/tune contention using v$segment_statistics to identify

objects involved

– Concentrate on the top 5 Statspack timed events if majority of

time is spent waiting

– Concentrate on bad SQL if CPU bound

 Maintain a balanced load on underlying systems (DB, OS,

storage subsystem, etc. )

– Excessive load on individual components can invoke aberrant

behaviour.

Performance Monitoring and Tuning



 Deciding if RAC is the performance bottleneck

– Amount of Cross Instance Traffic

 Type of requests

 Type of blocks

– Latency

 Block receive time

 buffer size factor

 bandwidth factor

Production Migration

 Adhere to strong Systems Life Cycle Disciplines

– Comprehensive test plans (functional and stress)

– Rehearsed production migration plan

– Change Control

 Separate environments for Dev, Test, QA/UAT,

Production

 System AND application change control

 Log changes to spfile

– Backup and recovery procedures

– Security controls

– Support Procedures


Reminder –

please complete the OracleWorld

online session survey



Thank you.

Resources

 RedHat Linux

– http://www.redhat.com/oracle/





 Linux Center - Technical White Papers & Documentation

– http://otn.oracle.com/tech/linux/tech_wp.html





 “Tips and Techniques: Install and Configure Oracle9i on Red

Hat Linux Advanced Server” by Deepak Patel, Oracle

Corporation

• http://otn.oracle.com/tech/linux/pdf/installtips_final.pdf





 “Tips and Techniques: Install and Configure Oracle9i on SLES8 / United Linux 1.0”

• http://www.suse.com/en/business/certifications/certified_software/oracle/db/9iR2_sles8.html

United Linux 1.0 Resources

 United Linux

– http://www.unitedlinux.com

 SuSE

– http://www.suse.com/us/business/products/server/sles/index.html



 Conectiva

– http://www.connectiva.com



 SCO Group (formerly Caldera Systems)

– http://www.ebizenterprises.com/page1.asp?p=463

 TurboLinux

– http://www.turbolinux.com/

Recommended one-off patches

 Bug 2820871 - ORA-29740 NODE EVICTION

DESIGN ALGORITHM AND ABRUPT TIME

CHANGE

ARU: 9.2.0.3 ARU 4161735 completed for

LINUX Intel



 Bug 2420930 - GET ORA-600 [KSXPMPRP1]

DURING STARTUP IN RAC MODE WITH

LARGER BUFFERS. This was mysteriously

included in 9.2.0.2, but not in 9.2.0.3. Bug

2875050 was opened for this issue.

ARU: 9.2.0.3 ARU 4202164 completed for LINUX Intel

Recommended one-off patches



 Bug:2844009 - MISSING LIBCXA.SO.3

LIBRARY ISSUE IN PSR 9203.

ARU: 9.2.0.3 ARU 4046387 completed for

LINUX Intel



 Bug 2779294 – node_list is not populated in oraInventory/ContentsXML/inventory.xml, so an opatch install will only apply to the local node. The workaround, editing inventory.xml, is documented in bug 2742686.



 Bug 2646914, 2675090, 2706220 and 2695783 -

ORA-600 [KCCSBCK_FIRST], [2] on linux and

W2K platform after installing 9.2.0.2. Very

important patch, missing from 9.2.0.3

Hangcheck-timer and Oracle

Cluster Manager

 Download Patch 2594820 from Metalink

– #rpm -ivh

 Detaching watchdogd from the Cluster Manager (Bug

2495915)

The removal of the watchdogd

 ORACLE_HOME/oracm/admin/cmcfg.ora

– WatchdogTimerMargin

– WatchdogSafetyMargin

 KernelModuleName=hangcheck-timer

 CMDiskFile from optional to mandatory

– CM quorum partition of cluster participation.

Hangcheck-timer and Oracle

Cluster Manager



 Remove or comment out from the /etc/rc.local file:

/sbin/insmod softdog nowayout=0 soft_noboot=1 soft_margin=60

 Add to rc.local, or execute as root to load:

/sbin/insmod hangcheck-timer.o hangcheck_tick=30 hangcheck_margin=180

Hangcheck-timer and Oracle

Cluster Manager

 Inclusion of the hangcheck-timer kernel module

Parameter          Service           Value
-----------------  ----------------  ------------------------------------
hangcheck_tick     hangcheck-timer   30 seconds
hangcheck_margin   hangcheck-timer   180 seconds
KernelModuleName   oracm             hangcheck-timer
MissCount          oracm             > hangcheck_tick + hangcheck_margin

Hangcheck-timer and Oracle

Cluster Manager



 cmcfg.ora example

- HeartBeat=15000

- ClusterName=Oracle Cluster Manager, version 9i

- KernelModuleName=hangcheck-timer

- PollInterval=1000

- MissCount=215

- PrivateNodeNames=int-node1 int-node2

- PublicNodeNames=node1 node2

- ServicePort=9998

- CmDiskFile=/ocfsdisk1/quorum/quorumfile

- HostName=int-node1
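The MissCount value in this example can be checked against the hangcheck-timer settings. The following is a minimal Python sketch, assuming the rule is MissCount > hangcheck_tick + hangcheck_margin (30 + 180 = 210 seconds with the values given earlier in this section). The helper names parse_cmcfg and misscount_ok are hypothetical, and the parser handles only plain key=value lines.

```python
# The cmcfg.ora example from this slide, without the bullet prefixes.
CMCFG_EXAMPLE = """\
HeartBeat=15000
ClusterName=Oracle Cluster Manager, version 9i
KernelModuleName=hangcheck-timer
PollInterval=1000
MissCount=215
PrivateNodeNames=int-node1 int-node2
PublicNodeNames=node1 node2
ServicePort=9998
CmDiskFile=/ocfsdisk1/quorum/quorumfile
HostName=int-node1
"""


def parse_cmcfg(text):
    """Parse key=value lines into a dict; lines without '=' are skipped."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


def misscount_ok(settings, hangcheck_tick=30, hangcheck_margin=180):
    """True if MissCount exceeds hangcheck_tick + hangcheck_margin (seconds)."""
    return int(settings["MissCount"]) > hangcheck_tick + hangcheck_margin


if __name__ == "__main__":
    cfg = parse_cmcfg(CMCFG_EXAMPLE)
    print("MissCount:", cfg["MissCount"], "ok:", misscount_ok(cfg))
```

With the values shown, 215 is greater than 210, so the example configuration satisfies the assumed rule.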

Hangcheck-timer and Oracle

Cluster Manager



 Parameters for ocmargs.ora

- oracm

- norestart 1800

Linux Monitoring and Configuration Tools

Area                        Tools / files
--------------------------  -----------------------------------------
Overall tools               sar, vmstat
CPU                         /proc/cpuinfo, mpstat, top
Memory                      /proc/meminfo, /proc/slabinfo, free
Disk I/O                    iostat
Network                     /proc/net/dev, netstat, mii-tool
Kernel version and release  cat /proc/version
Types of I/O cards          lspci -vv
Kernel modules loaded       lsmod, cat /proc/modules
List all PCI devices (HW)   lspci -v
Startup changes             /etc/sysctl.conf, /etc/rc.local
Kernel messages             /var/log/messages, /var/log/dmesg
OS error codes              /usr/src/linux/include/asm/errno.h
OS calls                    /usr/sbin/strace -p

Post Installation

Increasing Address Space

By default, Oracle has 1.7 GB of address space for its SGA.

 Shutdown all instances of Oracle

 cd $ORACLE_HOME/lib

 cp -a libserver9.a libserver9.a.org

– to make a backup copy

 cd $ORACLE_HOME/rdbms/lib

 genksms -s 0x15000000 >ksms.s

– lower SGA base to 0x15000000

 make -f ins_rdbms.mk ksms.o

– compile in new SGA base address

 make -f ins_rdbms.mk ioracle (relink)

Post Installation



Increasing Address Space Cont.



 sysctl -w kernel.shmmax=3000000000

 Lower process base

– Find the pid of the process (shell) from which Oracle will be started, using ps (or, as the oracle user, echo $$)

– Change /proc/$pid/mapped_base to 0x10000000 and restart Oracle





 Metalink Note: 200266.1

Post Installation

Lowering of mapped base

Region                                  Default               After relink
--------------------------------------  --------------------  -----------------------------------
Top of address space                    0xFFFFFFFF            0xFFFFFFFF
Reserved for kernel (starts at)         0xC0000000            0xC0000000
Variable SGA and DB buffers (SGA)       sga_base 0x50000000   sga_base 0x15000000 (relink Oracle)
mapped_base (/proc/<pid>/mapped_base)   0x40000000            0x10000000
Code, etc. (starts at)                  0x00000000            0x00000000
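The effect of lowering sga_base can be checked with simple arithmetic on the addresses in the diagram. The following is a minimal Python sketch. It assumes the SGA can extend from sga_base up to the start of the kernel region at 0xC0000000, which is a simplification since not all of that range is necessarily usable. The function name sga_address_space_gib is hypothetical.

```python
KERNEL_BASE = 0xC0000000        # start of the region reserved for the kernel
DEFAULT_SGA_BASE = 0x50000000   # sga_base before the relink
RELINKED_SGA_BASE = 0x15000000  # sga_base after genksms -s 0x15000000


def sga_address_space_gib(sga_base, top=KERNEL_BASE):
    """Address space between sga_base and the kernel region, in GiB."""
    return (top - sga_base) / 2**30


if __name__ == "__main__":
    print(f"default:      {sga_address_space_gib(DEFAULT_SGA_BASE):.2f} GiB")
    print(f"after relink: {sga_address_space_gib(RELINKED_SGA_BASE):.2f} GiB")
```

The default works out to 1.75 GiB, in line with the roughly 1.7 GB quoted for the default SGA address space, and the relinked layout to about 2.67 GiB.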

Post Installation

Larger Buffer Cache: does the buffer cache increase with a larger SGA?



 Create an in-memory file system on the /dev/shm

 mount -t shm shmfs -o size=8g /dev/shm





 To enable the extended buffer cache feature, set the init.ora parameter

 USE_INDIRECT_DATA_BUFFERS = true

 Don't use dynamic cache parameters

 DB_CACHE_SIZE

 DB_#K_CACHE_SIZE





Limitations apply to the extended buffer cache feature on Linux:

You cannot change the size of the buffer cache while the instance is running.

Post Installation

Adjust send / receive buffer size to 256K

Tuning the default and maximum window sizes:

- /proc/sys/net/core/rmem_default - default receive

window

- /proc/sys/net/core/rmem_max - maximum receive

window

- /proc/sys/net/core/wmem_default - default send window

- /proc/sys/net/core/wmem_max - maximum send

window



- sysctl -w net.core.rmem_max=262144

- sysctl -w net.core.wmem_max=262144

- sysctl -w net.core.rmem_default=262144
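To keep these settings across reboots, the same values can be written to /etc/sysctl.conf, the file this deck uses for kernel parameters. The following is a minimal Python sketch that generates the lines. Note that the slide shows sysctl -w commands for only three of the four parameters; including net.core.wmem_default here is an assumption, and the function name sysctl_conf_lines is hypothetical.

```python
BUFFER_SIZE = 256 * 1024  # 256K expressed in bytes (262144)

# The four window-size parameters listed on this slide.
PARAMETERS = [
    "net.core.rmem_default",
    "net.core.rmem_max",
    "net.core.wmem_default",  # assumption: not among the slide's sysctl -w commands
    "net.core.wmem_max",
]


def sysctl_conf_lines(size=BUFFER_SIZE):
    """Return 'name = value' lines suitable for appending to /etc/sysctl.conf."""
    return [f"{name} = {size}" for name in PARAMETERS]


if __name__ == "__main__":
    print("\n".join(sysctl_conf_lines()))
```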

Post Installation



 To enable asynchronous I/O, Oracle must be relinked to use skgaioi.o

– cd to $ORACLE_HOME/rdbms/lib

– make -f ins_rdbms.mk async_on

– make -f ins_rdbms.mk ioracle

– set 'disk_asynch_io=true' (default value is true)

– set 'filesystemio_options=asynch' (RAW only)


