High Availability with OpenSolaris










Nicholas Solter

Technical lead, Open HA Cluster 2009.06, and co-author of OpenSolaris Bible

Learn how to increase the availability of your applications on OpenSolaris with Service Management, Fault Management, and Open High Availability Cluster









2009 CommunityOne Conference: WEST | developers.sun.com/events/communityone 2

Agenda





Understanding Availability

Service and Fault Management

Solaris Cluster / Open High Availability Cluster

Making Applications HA

Open HA Cluster 2009.06










What is High Availability?



Computer systems provide services

• Web Services, Databases, Business Logic, File Systems, etc.

Failures are inevitable

• Software bugs

• Hardware components

• People and processes

• Natural disaster

• Terrorism

Downtime is costly

• Services should be available as close as possible to 100% of the time
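
To make "as close as possible to 100%" concrete, the following minimal sketch converts an availability percentage into the downtime it allows per year. The function name and the sample targets are illustrative assumptions, not figures from the talk.

```python
# Convert an availability target into the downtime it allows per year.
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes; ignores leap years


def allowed_downtime_minutes(availability_percent: float) -> float:
    """Return the yearly downtime budget, in minutes, for a given availability."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability must be between 0 and 100 percent")
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


# Sample targets chosen for illustration only.
for target in (99.0, 99.9, 99.99, 99.999):
    minutes = allowed_downtime_minutes(target)
    print(f"{target}% available -> {minutes:.1f} minutes of downtime per year")
```

Each additional "nine" cuts the yearly downtime budget by a factor of ten, which is why automated recovery matters more as the target rises.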










Types of Failures

Both Hardware and Software

Single points of failure can be catastrophic







[Figure: single points of failure at each layer of the stack: Network, Application, Server/OS, Storage Path/HBA, Storage]








You don't want your users to see this...










So how do you keep your services available?










OpenSolaris High Availability

OpenSolaris provides several options for automating recovery from inevitable failures, minimizing downtime and cost. In order of increasing availability:

• Predictive Self-Healing with Fault Management Architecture (FMA) and Service Management Facility (SMF)

• Open High Availability Cluster uses hardware redundancy to protect against hardware faults

• Open HA Cluster Geographic Edition protects against site-wide failures




Predictive Self-Healing

Combination of Fault Management Architecture (FMA) and

Service Management Facility (SMF)

Without Predictive Self-Healing

• Hardware and software faults handled in an ad hoc way

• Minimal automated detection or repair

• At best, the system logs a message for the administrator

With Predictive Self-Healing

• Unified error handling channels

• Unified fault management

• Unified service management

• Automated recovery when possible

• Unified knowledge base articles










Predictive Self-Healing: Memory Diagnosis and Retire

192 GB system (6 CPU, 12 core)

• 46% reduction in annual downtime

• 42% reduction in annual interruption rate

16 GB system (4 CPU)

• 32% reduction in annual downtime

• 44% reduction in annual interruption rate


Fault Management



Heuristic approach to detecting faults

Primarily for hardware

• But some software instrumented as well

Error reports generated by components

• Mostly hardware drivers

Fault Management Daemon processes error reports and

generates fault diagnoses

• Logged to system log

• Action taken when possible (e.g., offlining a failed CPU)

Each diagnosis has a unique Sun Message Identifier with a corresponding knowledge article

As an administrator, you don't usually need to worry about FMA


Service Management



The Traditional Approach

• Processes started by init scripts

• No concept of services

• No monitoring or restarts of applications

• No grouping of related processes

Service Management Facility (SMF)

• Introduced in Solaris 10

• The concept of a service groups related processes

• Unified mechanism to start/stop services, specify dependencies, and specify configuration properties

• Orderly startup/shutdown of system
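
The idea behind declared dependencies and orderly startup can be sketched as a topological sort. This is a conceptual illustration only, not how SMF is implemented; the service names and the `startup_order` helper are hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical services mapped to the services they depend on.
dependencies = {
    "network": set(),
    "filesystem": set(),
    "name-service": {"network"},
    "web-server": {"network", "filesystem", "name-service"},
}


def startup_order(deps):
    """Return a start order in which every service follows its dependencies."""
    return list(TopologicalSorter(deps).static_order())


order = startup_order(dependencies)
print("start:", order)
# Shutting down in reverse stops each service before the ones it depends on.
print("stop: ", list(reversed(order)))
```

With init scripts, this ordering has to be encoded by hand in script names; declaring dependencies lets the system derive it.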










SMF Administrative Commands



svcs

• See state of services

• “svcs -x” for information about faulted services

svcadm

• enable/disable services

svccfg and svcprop

• Configure and retrieve service properties










SMF in Action

using the Apache Webserver service










Platform for High Availability



Tolerates Single Points of Failure (and some

double failures)

Hardware redundancy with off-the-shelf hardware

Monitors cluster and orchestrates recovery of

applications and cluster infrastructure

Solaris Cluster runs on Solaris 10

Open HA Cluster runs on OpenSolaris










Open HA Cluster Stack









[Figure: Open HA Cluster stack, top to bottom: Applications; Agents; Cluster Infrastructure (Heartbeats, Membership); OS]


Cluster Monitoring



Monitors all levels of hardware/software stack

• Physical node health

• Robust membership with quorum to ensure that

there can be only one operational cluster in the

case of network partitions

• Disk Fencing ensures Data Integrity in spite of

failures

• Network

• Disk paths to shared storage

• Quorum Devices

• Applications
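
The "only one operational cluster" guarantee can be illustrated with a majority-vote check. This is a simplified sketch, assuming each node and the quorum device contribute one vote each and that a partition needs a strict majority of configured votes; the function and variable names are hypothetical.

```python
def has_quorum(votes_present: int, votes_configured: int) -> bool:
    """Return True only for a strict majority of the configured votes."""
    return 2 * votes_present > votes_configured


# Assumed layout: two nodes with one vote each, plus a quorum device with one vote.
TOTAL_VOTES = 3

# A network partition separates the nodes; only one can claim the quorum device.
winner_votes = 1 + 1  # node vote + quorum device vote
loser_votes = 1       # node vote only

print("winner keeps running:", has_quorum(winner_votes, TOTAL_VOTES))
print("loser keeps running: ", has_quorum(loser_votes, TOTAL_VOTES))
```

Because two disjoint partitions can never both hold a strict majority, at most one side continues to run after a split.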




Failover Service









HA Failover Service

• Application failover

• Failover IP address

• Failover storage










Global Network Service

Provides global IP address with failure protection

Scalable Service

• Software load balancing










Example: Apache and MySQL

Global Network Service provides global IP address with failure protection

• MySQL (Failover)

• Apache (Scalable)








Open HA Cluster Architecture

[Figure: architecture components]

• Global Network Service: provides global IP address with failure protection

• Scalable Service: software load balancing

• HA Failover Service

• Monitoring

• Heartbeats

• Membership

• Quorum

• Disk Fencing

• Resource Group Manager: resource (application) dependencies, inter-RG dependencies, RG affinities

• Global File Service

• Failover File Service


High Availability

Even in the Presence of Rogue Data Center Admins










Cluster Agents (Data Services)



Applications run on cluster unmodified (off-the-shelf)

Cluster Agents are the “glue” layer between applications and cluster

infrastructure

• Custom agent for each application

• Interacts with cluster core through APIs

• Provides start, stop, and other commands specific to the application to be

called by the cluster framework

• Provides monitor daemon specific to the application
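
The division of labor described above can be sketched as a small interface: the agent supplies start, stop, and probe operations, and the framework decides when to call them. This is a conceptual model only; the class and function names are hypothetical and do not correspond to the actual Open HA Cluster APIs.

```python
from abc import ABC, abstractmethod


class Agent(ABC):
    """Hypothetical agent interface: the callbacks a framework would invoke."""

    @abstractmethod
    def start(self) -> None: ...

    @abstractmethod
    def stop(self) -> None: ...

    @abstractmethod
    def probe(self) -> bool: ...


class FakeAppAgent(Agent):
    """Stand-in for a real application so the sketch runs anywhere."""

    def __init__(self) -> None:
        self.running = False
        self.starts = 0

    def start(self) -> None:
        self.running = True
        self.starts += 1

    def stop(self) -> None:
        self.running = False

    def probe(self) -> bool:
        return self.running


def monitor_once(agent: Agent) -> str:
    """One monitoring pass: restart the application if the probe fails."""
    if agent.probe():
        return "healthy"
    agent.stop()
    agent.start()
    return "restarted"


agent = FakeAppAgent()
agent.start()
print(monitor_once(agent))  # healthy
agent.running = False       # simulate a crash
print(monitor_once(agent))  # restarted
```

A real framework would also decide when repeated restarts should escalate to a failover onto another node; that policy is omitted here.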










Writing your own Agents

You don't need to wait for us to make your favorite application HA!

Several development choices are available, including:

• Agent Builder

  - Completely automated two-step process with GUI

• Generic Data Service (GDS) coding template

  - http://opensolaris.org/os/community/ha-clusters/ohac/GDS-template/

  - Code for complex application requirements with simplified interface to cluster framework

• SMF Proxy

  - Make SMF services HA without additional programming or scripting










Agent Builder










Which Applications Can be HA?

Most off-the-shelf applications can be made HA

Applications must be...

• Crash tolerant -- able to restart correctly after an unclean

shutdown

• Independent from server hostname -- changes with a failover!

• Tolerant of multihosted data

• Application should not hardcode data paths

• Sometimes symbolic links can be used as workarounds










Open HA Cluster 2009.06 Features



Runs on OpenSolaris 2009.06

Based on Solaris Cluster 3.2

• Well-tested production-level code base

• Most features from 3.2 available

Free to use (without support)

• Support subscriptions available

Software Modularization

Hardware Minimization

Available as IPS packages from

https://pkg.sun.com/opensolaris/ha-cluster repository

Source is open and freely available at

http://www.opensolaris.org/os/community/ha-clusters/




OHAC 2009.06 Supported Hardware





x64

- Sun Fire x4170

- Sun Fire x4140

SPARC

- Sun SPARC Enterprise M3000

- Sun SPARC Enterprise T5120

Storage

- StorageTek 2540 Array

- Storage J4400 Array (SAS)







Also runs on other platforms, including non-Sun hardware






Agents on Open HA Cluster 2009.06

Apache Webserver

Apache Tomcat

MySQL

GlassFish

NFS

DHCP

DNS

Kerberos

Samba

HA Containers (ipkg Zones)






OHAC 2009.06 Software Modularization



Install ha-cluster-full package

• Get core, wizards, agents, man pages, l10n, … (everything)

Install ha-cluster-minimal package

• Get only core framework

• Add agents, wizards, l10n, etc. individually

Install quorum server and agent builder without core

framework

Minimized installation useful for

• Minimizing resource use

• Security minimization

• Reducing administrative overhead








OHAC 2009.06 Hardware Minimization



“Poor man's shared storage” with COMSTAR iSCSI and

ZFS

Crossbow VNICs for private cluster traffic over public

network

“Weak membership” (preview-only feature)










COMSTAR iSCSI storage with OHAC 2009.06

[Figure: Node 1 and Node 2, each with a stack of Local Disk, iSCSI Target, and iSCSI Initiator, joined by a Mirrored Zpool spanning both nodes]






Crossbow VNICs with OHAC 2009.06



Cluster private interconnect can use VNICs as

endpoints instead of physical adapters

Works over dedicated physical adapter or public

adapter

Use IPsec to protect cluster-private traffic

Benefits

• Resource consolidation: Share physical adapters

• Easier setup: No dedicated private physical adapters and

cabling required










Weak Membership (preview feature only)



Run a two-node cluster without a quorum device

External “ping target” used to arbitrate in case of split-brain

Worst case in split-brain: both nodes stay up and provide service

• Though Open HA Cluster is integrated with OpenSolaris DAD (Duplicate Address Detection) to prevent the logical hostname from coming online if duplication is detected

Places importance of availability above data integrity

• Can lead to data loss

Use cases

• Read-only, or read-mostly, applications (e.g. database for

internet news site where reads dwarf updates)

• Test cluster with limited resources

• Demos
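
The arbitration rule can be sketched as a per-node decision. This is a simplified model under a stated assumption: a node that loses contact with its peer stays up only if it can still reach the ping target. The function name is hypothetical, and the real decision logic may differ.

```python
def node_stays_up(peer_reachable: bool, ping_target_reachable: bool) -> bool:
    """Decide whether a node keeps providing service (assumed rule, not the real code)."""
    if peer_reachable:
        return True  # no split-brain, nothing to arbitrate
    # Split-brain: use the external ping target as the tie-breaker.
    return ping_target_reachable


# Interconnect fails but both nodes still reach the ping target:
# the worst case from the slide, where both nodes stay up.
node1 = node_stays_up(peer_reachable=False, ping_target_reachable=True)
node2 = node_stays_up(peer_reachable=False, ping_target_reachable=True)
print("both nodes up:", node1 and node2)

# Node 2 is cut off from the network entirely: only node 1 continues.
node2_isolated = node_stays_up(peer_reachable=False, ping_target_reachable=False)
print("isolated node stays up:", node2_isolated)
```

Unlike a quorum device, the ping target cannot tell the nodes apart, so nothing prevents both from winning; that is the availability-over-integrity trade-off described above.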


Open HA Cluster 2009.06 Minimized Hardware









• Heartbeats over public network

• Quorum server

• No shared storage










Installing Open HA Cluster 2009.06



Accept terms of use at pkg.sun.com and download key

and certificate to /var/pkg/ssl/

Set ha-cluster publisher (on all nodes)

• pkg set-publisher -k /var/pkg/ssl/Open_HA_Cluster_2009.06.key.pem -c /var/pkg/ssl/Open_HA_Cluster_2009.06.certificate.pem -O https://pkg.sun.com/opensolaris/ha-cluster/ ha-cluster

Install the cluster software (on all nodes)

• pkg install ha-cluster-full

Configure the cluster (on one node)

• /usr/cluster/bin/scinstall










Open HA Cluster 2009.06 Support Repository



For critical bug fixes

Requires support subscription to access










Open HA Cluster 2009.06 in Action

Make Apache Webserver HA










Summary



Application availability is important

FMA and SMF enhance availability on a single-node OpenSolaris system

Open HA Cluster increases the availability of the system as a whole through hardware redundancy and software monitoring

Open HA Cluster 2009.06 runs on OpenSolaris 2009.06










For More Information

OpenSolaris Bible (Solter, Jelinek, and Miner; Wiley,

2009)

Open HA Cluster and OpenSolaris booths at

CommunityOne today!

SMF and FMA

• Knowledge Article Web (for FMA): http://www.sun.com/msg/

• Fault Management Community Group:

http://opensolaris.org/os/community/fm/

• SMF Community Group:

http://opensolaris.org/os/community/smf/










For More Information

Open HA Cluster

• Cluster Summit (last Sunday):

http://wikis.sun.com/display/OpenSolaris/Open+HA+Cluster+Summit+May+2009

• Video link: http://blogs.sun.com/video/tags/cluster

• HA Clusters Community Group on OpenSolaris.org:

http://www.opensolaris.org/os/community/ha-clusters/

• Solaris Cluster blog: http://blogs.sun.com/SC/

• My blog: http://blogs.sun.com/nsolter

• Open HA Cluster 2009.06 Documentation:

http://opensolaris.org/os/community/ha-clusters/ohac/Documentation/OHACdocs










High Availability with OpenSolaris









Nicholas Solter

nicholas.solter@sun.com


