
White Paper • Issue January 2003



          Questions / Answers
          concerning High Availability




                          Abstract
                          The purpose of this document is to provide answers to various questions relating to high availability
                          (HA) in general, to high availability and cluster technology, to Fujitsu Siemens Computers'
                          offerings for enhancing availability, and to the product families HIPLEX AF (Highly Integrated
                          system comPLEX Availability Facility) for BS2000/OSD, PRIMECLUSTER for the Solaris and
                          Linux platforms, Reliant Cluster for the Reliant Unix platform, and MS Cluster Service (MS CS)
                          for the MS Windows 2000 platform. This information should help sales, pre-sales, marketing, and
                          support staff to answer queries regarding this subject area. It can also help sales staff to develop
                          new business.




www.fujitsu-siemens.com



Contents

1   Questions concerning high availability
  1.1    Technical Questions concerning high availability in general
    1.1.1     What is availability?
    1.1.2     What are the definitions of the terms MTBF and MTTR?
    1.1.3     Which downtimes correspond to different levels of availability?
    1.1.4     How does availability differ from reliability?
    1.1.5     How can the availability of a system be improved?
    1.1.6     Can high availability be introduced in a stepwise manner?
    1.1.7     For which kinds of planned downtimes does FSC offer a solution?
    1.1.8     For which kinds of unplanned downtimes does FSC offer a solution?
    1.1.9     How can availability and up-to-dateness of data be improved?
    1.1.10       How can loss of data be avoided when an unplanned downtime occurs?
    1.1.11       Can high availability be combined with disaster recovery?
  1.2    Technical Questions concerning high availability and cluster technology
    1.2.1     What is a high availability cluster?
    1.2.2     Is cluster technology necessary for achieving high availability?
    1.2.3     What degree of application availability can be achieved in a high availability cluster?
    1.2.4     Which kinds of high availability cluster does FSC support?
    1.2.5     What is the difference between a fail-over cluster and a high availability cluster?
    1.2.6     What are the advantages of a high availability cluster compared to a single system with regard to planned
    downtimes?
    1.2.7     What are the advantages of a high availability cluster compared to a single system with regard to unplanned
    downtimes?
    1.2.8     Can a high availability cluster also be established on one hardware system?
    1.2.9     What is the maximum distance allowed between two nodes in a high availability cluster?
    1.2.10       What is the difference between a hot and a cold standby system?
    1.2.11       Do the standby systems have to be dimensioned exactly like the production system?
    1.2.12       Can single points of failure be avoided in a high availability cluster?
    1.2.13       Can applications also be monitored in a high availability cluster?
    1.2.14       Which requirements must applications satisfy so that they can be monitored in a high availability cluster?
    1.2.15       Does FSC offer high availability solutions for standard applications?
    1.2.16       Can distributed applications also be supported in a high availability cluster?
    1.2.17       Can a dynamic workload balancing be established between several cluster nodes?
    1.2.18       Can a high availability cluster react adequately to changes in the workload?
    1.2.19       Must the user know on which node the application is currently running?
    1.2.20       How much time is needed to perform a fail-over in some characteristic configurations?
  1.3    Sales-related Questions / Central Support by Fujitsu Siemens Computers
    1.3.1     What is the added value of HIPLEX AF, PRIMECLUSTER, Reliant Cluster, and MS CS for customers?
    1.3.2     What is the added value of HIPLEX AF, PRIMECLUSTER, Reliant Cluster, and MS CS for FSC?
    1.3.3     How does a customer get a well-tailored concept for improving the availability of her/his configuration?
    1.3.4     How are the critical business processes of a customer identified that must be preserved in an emergency
    situation?
    1.3.5     How can one detect components that impair the availability of an existing configuration (component
    failure impact analysis)?
    1.3.6     What expenses have to be taken into account when introducing a high availability cluster?
    1.3.7     Can the monitoring and fail-over of applications be demonstrated in a representative high availability
    cluster?
    1.3.8     Are there success stories associated with the introduction of high availability clustering?
    1.3.9     What kinds of service contracts comply with high availability requirements?
2 Contacts
3 Literature
Appendix: Platform-specific questions
A1     Questions/Answers concerning HIPLEX AF
  A1.1      Technical Questions
    A1.1.1       What is the functionality of HIPLEX AF?
    A1.1.2       What are the basic components in a HIPLEX AF cluster?
    A1.1.3       Which further components are useful for enriching a HIPLEX AF cluster?
    A1.1.4       Which general requirements must be fulfilled in order to be able to introduce HIPLEX AF?
    A1.1.5       Can different versions of BS2000/OSD and/or different versions of HIPLEX AF be installed on the nodes
    of a cluster?
    A1.1.6       Which BS2000/OSD versions and hardware models are supported?

    A1.1.7      What is the maximum number of nodes in a HIPLEX AF cluster?
    A1.1.8      Can HIPLEX AF distinguish between an actual system failure and a bogus failure?
    A1.1.9      What is the amount of performance deterioration when a system is part of a HIPLEX AF cluster
    compared to a single system?
    A1.1.10     What are the advantages of the HDR-package (High availability and Disaster Recovery) and how can
    one order it?
    A1.1.11     What are the advantages of the HIPLEX-SDI solution (Software Distribution and Installation) and how
    can one order it?
    A1.1.12     What is the purpose of SDF procedures (wizards)?
  A1.2      Sales-related Questions / Central Support by Fujitsu Siemens Computers
    A1.2.1      Where can one find a list of the HIPLEX AF order units?
    A1.2.2      How many installations of HIPLEX AF roughly exist at present?
    A1.2.3      Are there success stories associated with the introduction of HIPLEX AF?
    A1.2.4      What sales training is available?
    A1.2.5      What technical training is available?
    A1.2.6      How does HIPLEX AF compare with the Automatic Restart Manager (ARM) of IBM’s Parallel Sysplex?
A2    Questions/Answers concerning PRIMECLUSTER
  A2.1      Technical Questions
    A2.1.1      Where can one find a description of the PRIMECLUSTER functionality?
    A2.1.2      Which assets does the PRIMECLUSTER product family comprise?
    A2.1.3      What are the basic components in a PRIMECLUSTER cluster?
    A2.1.4      Which further components are useful for enriching a PRIMECLUSTER cluster?
    A2.1.5      Which general requirements must be fulfilled in order to be able to introduce PRIMECLUSTER?
    A2.1.6      Does FSC offer high availability solutions for standard applications in a PRIMECLUSTER cluster?
    A2.1.7      Can different versions of the operating system and/or different versions of PRIMECLUSTER be installed
    on the nodes of a cluster?
    A2.1.8      What is the maximum number of nodes in a PRIMECLUSTER cluster?
    A2.1.9      Can PRIMECLUSTER distinguish between an actual system failure and a bogus failure?
    A2.1.10        What is the amount of performance deterioration when a system is part of a PRIMECLUSTER cluster
    compared to a single system?
    A2.1.11        What is the purpose of RMS Wizard Tools and Application Wizards?
  A2.2      Sales-related Questions / Central Support by Fujitsu Siemens Computers
    A2.2.1      Where can one find a list of the PRIMECLUSTER order units?
    A2.2.2      How many installations of PRIMECLUSTER/Reliant Cluster roughly exist at present?
    A2.2.3      Are there success stories associated with the introduction of PRIMECLUSTER?
    A2.2.4      What sales training is available?
    A2.2.5      What technical training is available?
    A2.2.6      How does PRIMECLUSTER compare with other high availability clusters (e.g. SUN cluster, VERITAS
    cluster)?
A3    Questions/Answers concerning MS Cluster Service (MS CS)
  A3.1      Technical Questions
    A3.1.1      Where can one find descriptions of PRIMERGY High Availability for MS Windows?
    A3.1.2      Which basic components form an MS CS cluster on PRIMERGY?
    A3.1.3      Which further components are useful for enriching a PRIMERGY MS CS-based cluster?
    A3.1.4      Which general requirements have to be fulfilled in order to be able to introduce, market and deploy MS
    CS?
    A3.1.5      Does FSC offer high availability solutions for standard applications in a PRIMERGY MS CS-based
    cluster?
    A3.1.6      Can different versions of the operating system and/or different versions of the Cluster Manager be
    installed on the nodes of a cluster?
    A3.1.7      What is the maximum number of nodes in a PRIMERGY MS CS cluster?
    A3.1.8      What is the amount of performance deterioration when a PRIMERGY server is part of an MS CS cluster?
    A3.1.9      What is the purpose of the PRIMERGY BCC configurations (Business Critical Computing) and which
    advantages do they have?
  A3.2      Sales-related Questions / Central Support by Fujitsu Siemens Computers
    A3.2.1      Where can one find order information for the two MS operating systems that support MS CS?
    A3.2.2      How many cluster installations of PRIMERGY with MS CS roughly exist at present?
    A3.2.3      Are there success stories associated with the introduction of PRIMERGY MS CS clusters?
    A3.2.4      What sales training is available?
    A3.2.5      What technical training is available?



1 Questions concerning high availability
1.1     Technical Questions concerning high availability in general
1.1.1      What is availability?
IEEE defines availability as the degree to which a system or a component is operational and accessible when required
for use by an authorized user. Availability is expressed as a percentage by the formula
                                    Availability = uptime / (uptime + downtime) x 100 (%).
High availability is not a specific technology; it is rather an objective that needs to be tailored for the specific situation of a
company.
A combination of strategies, technologies, employee training, and different levels of service is necessary in order to bring
the operating system, middleware, network and application(s) to a level of availability that is acceptable to the customer.
In practice, one distinguishes between planned and unplanned downtimes.
                                                                                                            (Back to Table of Contents)

1.1.2      What are the definitions of the terms MTBF and MTTR?
MTBF and MTTR are statistical expressions: for a system or a component, MTBF is defined as the Mean Time Between
successive Failures (or other types of outage), whereas MTTR is defined as the Mean Time To Repair such a failure
and/or to recover from it.
Availability of a system or component can also be expressed in terms of MTBF and MTTR by
                                    Availability = MTBF / (MTBF + MTTR) x 100 (%).
Note that when embedded in a large and complex system, the values of MTBF and MTTR for a selected partial
component may differ considerably from the values for the very same component in an isolated environment.
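As a minimal illustration (with invented figures, not measured values for any product), the formula can be evaluated
directly, e.g. in Python:

    # Availability from MTBF and MTTR; the figures below are invented
    # placeholders, not measured values for any product.
    def availability_percent(mtbf_hours: float, mttr_hours: float) -> float:
        return mtbf_hours / (mtbf_hours + mttr_hours) * 100.0

    # Example: MTBF = 2,000 h, MTTR = 1 h  ->  about 99.950 %
    print(f"{availability_percent(2000.0, 1.0):.3f} %")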
                                                                                                            (Back to Table of Contents)

1.1.3      Which downtimes correspond to different levels of availability?
By using the formula mentioned in section 1.1.1, the following figures can be given (adapted from Gigagroup).
The amount of downtime that is acceptable to the customer strongly depends on his/her business needs. Determining it
requires an in-depth analysis of the complexity of the operation and will influence the size and cost of a high-availability
solution.
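Recomputed from that formula for continuous operation (8,760 hours per year), the availability levels map
approximately to the following annual downtimes:

                                 Availability                     Downtime per year
                                 99 %                             approx. 88 hours
                                 99.9 %                           approx. 8.8 hours
                                 99.99 %                          approx. 53 minutes
                                 99.999 %                         approx. 5.3 minutes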



1.1.4      How does availability differ from reliability?
Reliability of software is one of the quality factors in software engineering and is defined as the extent to which a program
can be expected to perform its intended function with required precision within a specified period of time. Its rating
formula is given by
                                    1 - (number of errors / total number of lines of code).
Reliability of hardware components is defined in an analogous manner.
It should be pointed out that availability percentages – in contrast to reliability figures – give no indication of the
frequency of failures and of the duration of any single downtime.
For example, a system with availability of 99% over a whole year could fail four times a day for 3 minutes each time,
which might be just within what can be tolerated. On the other hand, a single failure involving a continuous downtime of
84 hours – leading to the same availability percentage – would be unacceptable for most companies.
                                                                                                          (Back to Table of Contents)

1.1.5      How can the availability of a system be improved?
FSC offers solutions for enhancing a system's availability – besides other measures – by avoiding or reducing the
downtimes that arise when one of the events listed in sections 1.1.7 and 1.1.8 occurs.
For every platform involved, the solution consists of introducing a high availability cluster (see section 1.2.1) under a
corresponding high availability monitor. Fujitsu Siemens Computers offers the following HA-monitors in its portfolio:

                                 Platform                         HA-Monitor
                                 BS2000/OSD                       HIPLEX AF
                                 Solaris                          PRIMECLUSTER
                                 Linux                            PRIMECLUSTER
                                 Reliant Unix                     Reliant Cluster (*)
                                 MS Windows 2000 (**)             MS Cluster Service

         (*) Contains compatible partial functionality of PRIMECLUSTER.
         (**) Needed OS version is MS Windows 2000 Advanced Server (2-node cluster support) or Windows 2000
         Datacenter Server (4-node cluster support). MS Windows .NET 2003 Enterprise Server and Windows .NET 2003
         Datacenter Server will be the corresponding successor products in 2003.

In addition to that, further organizational measures have to be taken (see ref. 10). These include setting up an HA
control center, running regular contingency drills, safeguarding the employees' competence, and enhancing the quality
of service (see section 1.3.9).
The availability of a system can further be improved by establishing a workload balancing cluster (see section 1.2.17).
                                                                                                          (Back to Table of Contents)

1.1.6      Can high availability be introduced in a stepwise manner?
Yes, it can. In a first step, the redundant infrastructure must be established for every platform to be considered.
In general, high availability monitors allow resource monitoring to be specified in a flexible and granular way. The
number and kind of monitored resources may be changed, and the granularity of monitoring may also be decreased or
increased.
These features allow a stepwise introduction of high availability, starting with a small number of applications and
resources under the control of a high availability monitor, which may be increased in further steps.
Additionally, further systems may be added to the high availability cluster.
Availability and up-to-dateness of data can also be increased step by step using particular functions of the operating
systems, dedicated products, or special features of disk and tape storage subsystems (see section 1.1.9).
Comprehensive configurations comprising HA clusters on several platforms can be integrated into a supervising
management platform using e.g. the SNMP interface (see ref. 18).
The organizational measures can also be taken in a stepwise manner.

                                                                                                      (Back to Table of Contents)

1.1.7       For which kinds of planned downtimes does FSC offer a solution?
FSC offers a solution for the following kinds of planned downtimes:
     Archiving, backing up or reorganizing data,
     Introducing, exchanging or upgrading hardware components (incl. maintenance),
     Introducing, exchanging or upgrading software components (in the operating system and application(s)), and/or
     Introducing software corrections (in the operating system and application(s)).
                                                                                                      (Back to Table of Contents)

1.1.8       For which kinds of unplanned downtimes does FSC offer a solution?
FSC offers a solution for the following kinds of unplanned downtimes:
     A failure in a hardware component like a CPU, a peripheral controller or device, or a data connection,
     A failure in the operating system (incl. middleware),
     A failure in the application,
     A failure in the data communication network,
     An operating error by the operator or system administrator, and/or
     The destruction of the entire IT-center ("disaster recovery").
                                                                                                      (Back to Table of Contents)

1.1.9       How can availability and up-to-dateness of data be improved?
Availability and up-to-dateness of online data can be improved by mirroring data either
     at the software level, by using particular features of the operating system like Dual Recording by Volume (DRV), or
      dedicated products like the Veritas Volume Manager, PRIMECLUSTER Global Disk Services, or PRIMERGY
      DuplexDataManager (DDM) for Windows and Linux, or
     at the hardware level, by using RAID devices and/or, for example, the BCV, SRDF or GeoSpan (for MS Windows)
      facility of Symmetrix, or the SnapView or MirrorView facility of FibreCAT disk storage subsystems.
There must be a mechanism that makes all data available on all systems of the high availability cluster.
After having been split off, a mirrored device can also be used for data backup on another system, thereby avoiding a
planned downtime in the production system.
In a BS2000/OSD environment, the concurrent copy feature can also be used for backing up and archiving data in
parallel.
To enable an application to bridge over an unplanned downtime consistently, it is recommended to use a transaction
mechanism, so that the application can restart from a well-defined consistency point.
                                                                                                      (Back to Table of Contents)

1.1.10      How can loss of data be avoided when an unplanned downtime occurs?
See section 1.1.9.

1.1.11      Can high availability be combined with disaster recovery?
Yes, it can, if the nodes in the cluster are located at sufficiently large distances and remote data mirroring is used.
Suitable means for implementing remote data mirroring are, at the hardware level, for instance the MirrorView facility of
FibreCAT, or the SRDF or GeoSpan (for MS Windows) facility of EMC² Symmetrix disk storage subsystems, or, at the
server software level, PRIMERGY DuplexDataManager.
In addition, some further organizational measures have to be taken (see ref. 10).
                                                                                                      (Back to Table of Contents)

1.2     Technical Questions concerning high availability and cluster technology
1.2.1     What is a high availability cluster?
A high availability cluster consists of a number of systems with a common monitoring and fail-over infrastructure,
which does not have any single point of failure. The resources of the systems – databases, files, applications, and
devices, for example – may be made available on a cluster-wide basis. This enables the redundancies that exist in the
cluster to be used throughout the cluster to avoid interruptions.

Applications may run on all systems and are switched to another system in case of a failure. Applications already
running on the target system may then continue to run there (possibly with restricted performance) or may be
terminated (if their availability requirements are lower).
Several features smoothly cooperate in a high availability cluster:
   Rapid, unambiguous detection of failing systems,
   Controlled and safe access to the cluster-wide resources from every system of the HA-cluster,
   Automatic switching to another system in the cluster by an HA monitor in case of a failure, and
   Easy, central administration.
                                                                                                           (Back to Table of Contents)

1.2.2    Is cluster technology necessary for achieving high availability?
With today's state-of-the-art technology, availability greater than 99.99 % (corresponding to about 1 h downtime per year)
can only be achieved with a cluster of systems (see section 1.2.1).
                                                                                                           (Back to Table of Contents)

1.2.3    What degree of application availability can be achieved in a high availability
         cluster?
In special configurations up to 99.999 % can be achieved, corresponding to a downtime of 5 minutes per year ("five
nines, five minutes"). The attainable value mainly depends on the time required for fail-over and thus on the actual
application and configuration.
                                                                                                           (Back to Table of Contents)

1.2.4    Which kinds of high availability cluster does FSC support?
Currently, FSC supports homogeneous high availability clusters. Here, homogeneous means that a cluster is monitored
by one and the same high availability monitor.
For the HA-monitors offered by FSC see section 1.1.5.
To establish an inhomogeneous high availability cluster comprising, for instance, nodes running under BS2000/OSD and
Solaris, a communication or cooperation between HIPLEX AF and RMS can be envisaged in the framework of a project
(in case of interest, please contact Mr. Fischer, see section 2).
                                                                                                           (Back to Table of Contents)

1.2.5    What is the difference between a fail-over cluster and a high availability cluster?
Fail-over cluster
A fail-over cluster consists of two or more active systems that normally run different applications. If the production system
fails, the main application (or applications) and its/their resources are transferred to the standby system where recovery
takes place. Since all systems in a fail-over cluster can normally be used productively, independently of each other, the
fail-over concept represents a cost-efficient solution to the problem of failure management.
However, in a fail-over cluster it is entirely up to the operator to detect the failure and to carry out the switching of the
applications.

High availability cluster
A high availability cluster is a far more elaborate type of fail-over cluster. In addition to the features described above, a
high availability cluster has a number of mutually independent and automatic monitoring functions for the connected
systems, enabling unambiguous and immediate error detection. The automatic and rapid switching that takes place in
case of a failure further improves the level of availability and eliminates the possibility of errors committed by the
operator.
HIPLEX AF, PRIMECLUSTER, Reliant Cluster, and MS CS clusters are examples of high availability clusters.
                                                                                                           (Back to Table of Contents)

1.2.6    What are the advantages of a high availability cluster compared to a single system
         with regard to planned downtimes?
The introduction, exchange or upgrade of hardware components (incl. maintenance) may be performed asynchronously
on a standby system while the application is still running on the production system.
Equally, a new software configuration for the operating system and middleware (including version upgrades and
correction updates) may be asynchronously loaded and started on a standby system while the application is still running
on the production system.
If a problem occurs in the operating system or the application after the switch-over, the application can also be
automatically switched back to the previous work system.

                                                                                                         (Back to Table of Contents)

1.2.7    What are the advantages of a high availability cluster compared to a single system
         with regard to unplanned downtimes?
A failure of a system will be detected automatically and immediately.
In case of a failure, the application(s) will automatically be switched to a standby system, and the failed system can be
repaired while production is running on the standby system.
HIPLEX AF, PRIMECLUSTER, Reliant Cluster, and MS CS optionally try to restart a failed application on the current
node before carrying out a fail-over.
                                                                                                         (Back to Table of Contents)

1.2.8    Can a high availability cluster also be established on one hardware system?
Yes, it can, if the customer considers the availability of his/her hardware system to be sufficient, but intends to improve
the availability of the software components. Several cluster nodes can be introduced by using different partitions.
However, if the customer wants to increase the availability of both his/her software and hardware systems, this solution
is not applicable.
In a PRIMEPOWER environment, a redundant Cluster Interconnect for each clustered partition is required.
In a BS2000/OSD environment, the virtual machine system VM2000 can also be used as basis for the entire cluster (see
section A1.1.2). Here, the HIPLEX-SDI solution helps to minimize downtimes (see section A1.1.11).
In an MS Cluster Service environment, the HA cluster can be established on one HW system by using VMware ESX and
running virtual machines on top of ESX.
                                                                                                         (Back to Table of Contents)

1.2.9    What is the maximum distance allowed between two nodes in a high availability
         cluster?
The limitations do not result from restrictions imposed by the high availability monitor, but from the configuration and the
applications running on the cluster. In most cases, remote mirroring and shared data access are applied. The reachable
distances depend on the technology used (e.g. fibre channel) and the minimally acceptable I/O throughput. The current
cost of interconnection technology, in conjunction with the speed of light, usually limits the extension of clusters to
approximately 10 km for shared data access and approximately 80 km for synchronous remote mirroring.
In a PRIMECLUSTER environment, latencies in the Cluster Interconnect limit the obtainable distance. PRIMECLUSTER
is released for 10 km fibre optic cable length.
Cluster nodes in an MS CS environment can be spread over 2 sites. The private and public network connections
between those cluster nodes must appear as a single, non-routed LAN, and the network connections must be able to
provide a guaranteed maximum round-trip latency between nodes of not more than 500 milliseconds. This allows
distances between the cluster nodes of 50 to 70 km, depending on the carrier and the line latency it is able to guarantee.
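A rough feel for the physical lower bound (an illustrative calculation only; real I/O latency adds controller and protocol
overhead on top): light travels in optical fibre at roughly 200,000 km/s, so each kilometre of cable adds about
5 microseconds one way.

    # Lower bound on round-trip time over a fibre link, from the signal
    # propagation speed alone (approx. 200,000 km/s in optical fibre).
    C_FIBRE_KM_PER_S = 200_000.0

    def round_trip_ms(distance_km: float) -> float:
        return 2.0 * distance_km / C_FIBRE_KM_PER_S * 1000.0

    for d_km in (10, 80):
        print(f"{d_km} km: >= {round_trip_ms(d_km):.2f} ms per synchronous write")
    # -> 10 km: >= 0.10 ms;  80 km: >= 0.80 ms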
                                                                                                         (Back to Table of Contents)

1.2.10 What is the difference between a hot and a cold standby system?
A hot standby system runs while it is in standby mode (with or without some applications being executed on it),
whereas a cold standby system is not running while in standby mode and has to be started before an application can be
switched to it.
With a cold standby solution, the customer thus experiences a delay in service in comparison with a hot standby solution.
                                                                                                         (Back to Table of Contents)

1.2.11 Do the standby systems have to be dimensioned exactly like the production system?
No, this is not necessary. Initially, the standby system(s) may be equipped with few resources. After a switch-over has
taken place, these resources may be re-configured appropriately so that the standby system can process the
business-critical applications with the required performance.
Under MS CS, the PRIMERGY models have to be of the same type, but can be equipped with different amounts of
resources, such as number of CPUs, RAM, network cards, …
In the BS2000/OSD and PRIMEPOWER environments, capacity on demand (CoD) can also be provided as an option,
which is useful for enhancing the performance of a standby system after a fail-over has taken place. On Windows 2000,
CoD for CPUs is technically complete on the development side; a sales strategy and announcement are currently being
prepared.
                                                                                                         (Back to Table of Contents)

1.2.12 Can single points of failure be avoided in a high availability cluster?
Yes, all components of a high availability cluster are installed redundantly to avoid any single point of failure.
                                                                                                         (Back to Table of Contents)

1.2.13 Can applications also be monitored in a high availability cluster?
Yes; monitoring of applications is the main objective in a high availability cluster.
                                                                                                       (Back to Table of Contents)

1.2.14 Which requirements must applications satisfy so that they can be monitored in a
       high availability cluster?
Most general-purpose applications that run on a single system can be made to run in a cluster in an unmodified manner.
The main requirements for this are that they can be gracefully started and stopped under program control, that they can
recover from failure situations, and that they provide a means to monitor their state of health.
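As an illustration of this contract, the following minimal sketch (in Python, with hypothetical command paths; it is not the
interface of any particular FSC HA monitor) shows the start, stop, and health-check entry points that an HA monitor
typically invokes under program control:

    # Sketch of an application wrapper offering start/stop/health-check
    # entry points to an HA monitor. All paths and flags are hypothetical.
    import subprocess
    import sys

    APP = "/opt/app/bin/appserver"   # placeholder for the real application

    def start() -> int:
        return subprocess.call([APP, "--daemon"])     # launch under program control

    def stop() -> int:
        return subprocess.call([APP, "--shutdown"])   # stop gracefully

    def status() -> int:
        return subprocess.call([APP, "--ping"])       # exit code 0 = healthy

    if __name__ == "__main__":
        actions = {"start": start, "stop": stop, "status": status}
        sys.exit(actions[sys.argv[1]]())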
In MS CS environments, applications need to have resource DLLs (Dynamic Link Libraries) that comply with the MS
cluster development kit and its related APIs in order to communicate properly with the Cluster Manager. The default
resource DLLs shipped with MS CS fit simpler software products; for complex applications (e.g. SAP, Oracle, Siebel),
the ISV has to provide these DLLs.
For further questions in this context please contact the respective person mentioned in section 2.
                                                                                                       (Back to Table of Contents)

1.2.15 Does FSC offer high availability solutions for standard applications?
Yes, see sections A2.1.6 and A3.1.5, respectively.
                                                                                                       (Back to Table of Contents)

1.2.16 Can distributed applications also be supported in a high availability cluster?
Yes, distributed applications are also supported in a high availability cluster. Examples of such applications are the
Oracle Parallel Server (OPS) and Oracle RAC (Real Application Cluster).
In MS CS environments, only one cluster manager service can run at a given time: either OPS, Oracle RAC, or MS CS.
Otherwise, two cluster manager products would concurrently try to manage the entire server cluster and might therefore
cause trouble.
                                                                                                       (Back to Table of Contents)

1.2.17 Can a dynamic workload balancing be established between several cluster nodes?
The PRIMECLUSTER Scalability Server provides a dynamic workload balancing for Internet applications. It is the ideal
option for ensuring scalability and constant availability for application scenarios in which the application performance can
be increased via the so-called "scale out" approach (several instances of the application that are independent of one
another run on several – mostly small – systems). Clients in the LAN see a single IP address, which is used by several
nodes in the cluster. The PRIMECLUSTER Scalability Server is especially suitable for the cross-node scaling of the
computing performance for CPU-intensive applications such as secure web (https), LDAP, etc.
In addition to MS CS, the MS Network Load Balancing Service (NLB) is available: it distributes IP traffic evenly between
PRIMERGY server nodes running instances of the same application and NLB. Typical NLB applications are server farms
running either Web services, terminal services, or e-business applications.
Besides that, HIPLEX AF as well as PRIMECLUSTER provides a command interface enabling the intentional switching
of applications from one node to another (including their software environment). To be able to do this, so-called "switch
units" (in HIPLEX AF) and so-called “userApplications” (in PRIMECLUSTER) must have been defined beforehand.
MS CS allows the definition of Cluster Resource Groups that can be transferred by the administrator from one node to
another at any given time. The same can be done by software using specific commands.
For distributed applications, see section 1.2.16.
                                                                                                       (Back to Table of Contents)

1.2.18 Can a high availability cluster react adequately to changes in the workload?
Yes, see section 1.2.17.
                                                                                                       (Back to Table of Contents)

1.2.19 Must the user know on which node the application is currently running?
No. In HIPLEX AF as well as in PRIMECLUSTER, the user can address an application by using a unique node address,
and in MS CS by using the virtual cluster IP address; these are all independent of the location where the application is
currently running.
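For illustration (a sketch only; the virtual address and port below are placeholders, not addresses of any real cluster): a
client simply connects to the cluster-wide address and is served by whichever node currently hosts the application:

    # The client addresses the service by its cluster-wide (virtual)
    # address; it does not know which physical node serves the request.
    import socket

    # "app-cluster.example.com" is a placeholder for the virtual address.
    with socket.create_connection(("app-cluster.example.com", 8080), timeout=5) as conn:
        conn.sendall(b"ping\n")
        print(conn.recv(64))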

                                                                                                     (Back to Table of Contents)

1.2.20 How much time is needed to perform a fail-over in some characteristic
       configurations?
Generally, the following parameters significantly influence the time necessary for performing a switch-over:
     The time necessary to detect the failure in case of an unplanned downtime,
     The performance of the CPUs involved,
     The time necessary to switch the peripheral devices,
     The time necessary to unlock the peripheral devices in case of an unplanned downtime,
     The time necessary to establish the execution environment on the target system; examples of this are print and file
      transfer jobs still waiting in system queues of the former production system,
     The time necessary to stop the application in case of a planned downtime,
     The time necessary to restart the application, and
     The time necessary to perform the recovery of the application.
The time necessary for performing a fail-over depends on the concrete configuration and application. For more
information, please contact the respective person mentioned in section 2.
For the BS2000/OSD platform, Fujitsu Siemens Computers offers a fail-over calculation sheet. It enables you to estimate
fail-over times as a function of the essential parameters of the configuration, and can be found in the extranet under
http://bs2mark/Emkahtm/aktuel/april2002/HV-Tabelle-Version-3.03.xls.
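In the spirit of that calculation sheet, the overall fail-over time is essentially the sum of the phases listed above; a toy
sketch with invented placeholder figures (not values from the sheet):

    # Toy fail-over time estimate: sum of the phases listed above.
    # All figures are invented placeholders, not measured values.
    phases_seconds = {
        "failure detection": 30,
        "unlocking peripheral devices": 20,
        "switching peripheral devices": 60,
        "establishing execution environment": 45,
        "application restart": 90,
        "application recovery": 120,
    }
    total = sum(phases_seconds.values())
    print(f"estimated fail-over time: {total} s ({total / 60:.1f} min)")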
                                                                                                     (Back to Table of Contents)



1.3     Sales-related Questions / Central Support by Fujitsu Siemens Computers
1.3.1     What is the added value of HIPLEX AF, PRIMECLUSTER, Reliant Cluster, and MS CS
          for customers?
HIPLEX AF, PRIMECLUSTER, Reliant Cluster, and MS CS offer high availability of the platform for the customers'
applications, together with automatic and rapid failure detection on all hardware and software levels, and automatic and
fast fail-over in case of a detected failure.
                                                                                                     (Back to Table of Contents)

1.3.2     What is the added value of HIPLEX AF, PRIMECLUSTER, Reliant Cluster, and MS CS
          for FSC?
HIPLEX AF, PRIMECLUSTER, Reliant Cluster, and MS CS enable the sales regions of Fujitsu Siemens Computers to
sell a leading-edge product for achieving high availability for the customers' applications on the respective platform.
                                                                                                     (Back to Table of Contents)

1.3.3     How does a customer get a well-tailored concept for improving the availability of
          her/his configuration?
For this, the customer can use the field-proven professional service of FSC by contacting her/his local sales
representative.
                                                                                                     (Back to Table of Contents)

1.3.4     How are the critical business processes of a customer identified that must be
          preserved in an emergency situation?
See section 1.3.3.
                                                                                                     (Back to Table of Contents)

1.3.5    How can one detect components that impair the availability of an existing
         configuration (component failure impact analysis)?
See section 1.3.3.

                                                                                                          (Back to Table of Contents)

1.3.6    What expenses have to be taken into account when introducing a high availability
         cluster?
When a customer is about to establish a high availability cluster, the following investment aspects have to be considered:
   Infrastructure costs
     Infrastructure measures in the IT-center such as uninterruptible power supply, configuration updates, …
   Product costs
    Hardware (redundancy of the servers, devices, interconnections, …)
    Software (update of the release of the system, necessary and recommended products like
        HIPLEX AF, HIPLEX MSCF, SDF-P, JV, wizards, where appropriate virtual machine system, …
        PRIMECLUSTER product family, wizards, …
         MS Windows 2000 Advanced Server (2 nodes) or Windows 2000 Datacenter Server (4 nodes), or the
         corresponding successor products, optionally PRIMERGY DuplexDataManager SW, ...)
   Service costs
    Consultancy costs, project costs for customer-specific implementation of the HA-solution, …
   Running costs
    Service agreements, lease costs, maintenance of the cluster, ...
Note: HA-projects are usually performed with consolidation in view.
This helps to reduce running costs!
                                                                                                          (Back to Table of Contents)

1.3.7    Can the monitoring and fail-over of applications be demonstrated in a
         representative high availability cluster?
Yes; in case of interest, please contact the respective person mentioned in section 2.
                                                                                                          (Back to Table of Contents)

1.3.8    Are there success stories associated with the introduction of high availability
         clustering?
Yes, see sections A1.2.3, A2.2.3 and A3.2.3, respectively.
                                                                                                          (Back to Table of Contents)

1.3.9    What kinds of service contracts comply with high availability requirements?
In order to comply with high availability requirements, Fujitsu Siemens Computers strongly recommends a high-quality
service contract, for instance the BCC service contract. In case of interest, please contact the local sales
representative.
                                                                                                          (Back to Table of Contents)

2 Contacts
For further questions and specific information, please use the following contacts:


        Function                     Contact                  Phone                   Email
Product Management           Georg Fischer            +49 89 636 52207        georg.fischer@fujitsu-siemens.com
HIPLEX AF
Product Management           Jens-Peter Jensen        +49 5251 8 15380        jens-peter.jensen@fujitsu-siemens.com
PRIMECLUSTER
Product Management           Peter Kroll              +49 821 804 2678        peter.kroll@fujitsu-siemens.com
PRIMERGY                     Guenther Aust            +49 821 804 4040        guenther.aust@fujitsu-siemens.com


                                                                                                                  (Back to Table of Contents)

3 Literature
1) Cluster technology from Fujitsu-Siemens Computers: http://www.fujitsu-siemens.com/rl/products/software/clustertechnology.html
2) Home page of FSC on BS2000/OSD High Availability: http://www.fujitsu-siemens.com/rl/products/bs2000/availability.html
3) Home page of FSC on the PRIMECLUSTER product family: http://www.fujitsu-siemens.com/rl/products/software/primecluster.html
4) Unix cluster technology: http://extranet.fujitsu-siemens.com/com/products_supply/unix/cluster/_cluster
5) PRIMECLUSTER White Paper: http://extranet.fujitsu-siemens.com/com/products_supply/unix/cluster/pcl/_pcl
6) PRIMECLUSTER 4 (Solaris) – Overview: http://extranet2.fujitsu-siemens.com/vil/oec/vil/us000000/i11089/i18752/i18801/i18802.htm
7) CISnet for UNIX infrastructure: http://extranet.fujitsu-siemens.com/com/consulting/UNIX/products/PC.htm
8) PRIMECLUSTER Enterprise Server for mySAP Solutions:
    http://www.fujitsu-siemens.com/rl/products/software/primeclusterenterpriseserver.html
9) PRIMECLUSTER Scalability Server: http://www.fujitsu-siemens.com/rl/products/software/primeclusterscalabilityserver.html
10) Business Continuity – High Availability and Disaster Recovery, General Introduction:
    http://extranet.fujitsu-siemens.com/com/poso/bs2000/consol/hv/ein-hvks2120e.ppt
11) Intranet home page of the development department for high availability, disaster recovery and serviceability –
    concepts and products: http://rainbow.mch.fsc.net
12) PRIMECLUSTER sales information: use VIL; search for PRIMECLUSTER
13) HIPLEX AF (BS2000/OSD) – Order Units: http://vil.mch.fsc.net/vil/oec/vil/ms000000/ms732/ms742/ms749.htm
14) HIPLEX AF (BS2000/OSD) – Product Information: http://vil.mch.fsc.net/vil/oec/vil/ms000000/ms732/ms742/ms748.doc
15) HIPLEX AF in the FSC extranet: http://extranet.fujitsu-siemens.com/com/poso/bs2000/consol/hip
16) Tool for estimating fail-over times in HIPLEX AF: http://bs2mark/Emkahtm/aktuel/april2002/HV-Tabelle-Version-3.03.xls
17) VM2000 Virtual Machine System – Manual: http://manuals.fujitsu-siemens.com/servers/bs2_man/man_us.htm
18) BS2000/OSD Management with SNMP: http://extranet.fujitsu-siemens.com/com/poso/bs2000/markarch/brochur/en/snmp-2030e.pdf
19) PRIMERGY HA on the Extranet – see Prince (PRIMERGY Information Center) at
    http://vil.mch.fsc.net/vil/pc/vil/primergy/overview/prince/index_en.htm
    select “High Availability Products” “Related Documents”: different White Papers, slide sets, and data sheets are
    available
20) Certified PRIMERGY MS CS configurations: see Prince at same URL as before, select “High Availability Products”
    “Related Documents” and the “Status of cluster certifications”, or
    in the MS Web at http://www.microsoft.com/hcl/default.asp
21) PRIMERGY Solution Library containing cluster sample configurations ready for FSC System Architect (.sar)
22) Business Critical Computing with PRIMERGY, information in the FSC Internet at
    http://www.fujitsu-siemens.com/rl/products/primergy/businesscritical.html
    and in the FSC Extranet – see Prince at http://vil.mch.fsc.net/vil/pc/vil/primergy/overview/prince/index_en.htm
    select “Business Critical Computing”
23) MS Network Load Balancing Service (NLB): http://www.microsoft.com/serviceproviders/whitepapers/network_load_balancing_win2k.asp

                                                                                                    (Back to Table of Contents)



Appendix: Platform-specific questions
A1       Questions/Answers concerning HIPLEX AF
A1.1 Technical Questions
A1.1.1 What is the functionality of HIPLEX AF?
A good description of the functionality of HIPLEX AF can be found in the brochure "High Availability with BS2000/OSD"
which can be downloaded from http://extranet.fujitsu-siemens.com/com/poso/bs2000/markarch/brochur/en/hverfueg0501_e.pdf.
                                                                                                    (Back to Table of Contents)

A1.1.2 What are the basic components in a HIPLEX AF cluster?
The basic components comprise hardware and system software components, as well as user procedures.
Hardware:
    Two or more servers (not necessarily symmetrical) in a configuration consisting of native and/or VM2000 systems,
    One server when using several VM2000 guest systems on the server, or
    Different partitions on a server of type SX.
Coupling of these systems takes place by means of MSCF/BCAM communication links between the nodes and,
possibly in addition, by means of a shared public volume set.
System Software:
    HIPLEX AF and HIPLEX MSCF on each node of the cluster
    BS2000/OSD operating system on each node of the cluster
    VM2000 on every node of the cluster that is realized as a VM2000 guest system, especially in case of a virtual
     HIPLEX.
     Virtual HIPLEX:
     In a virtual HIPLEX, the virtual machine system VM2000 is used as basis for the entire cluster. The systems on the
     different nodes are implemented as guest machines of VM2000.
Further information on the implementation of HIPLEX AF can be found in the HIPLEX AF (BS2000/OSD) product
information under http://vil.mch.fsc.net/vil/oec/vil/ms000000/ms732/ms742/ms748.doc.
Standard User Procedures:
Standard user procedures handle the fail-over of system components like AVAS, FT, SPOOL, and SRDF.
If desired, a customization of these procedures, or the implementation of procedures handling other requests, can be
done in the framework of a project.
                                                                                                    (Back to Table of Contents)

A1.1.3 Which further components are useful for enriching a HIPLEX AF cluster?
Hardware components:
Symmetrix Storage Subsystem of EMC² for automatic local or remote data mirroring (BCV and SRDF facility,
respectively).
The global store (GS) can also be used in this environment to speed up access to shared data. Fujitsu Siemens
Computers uses the term "parallel HIPLEX" when the nodes in the cluster share a global store under the control of XCS.
Software components:
SHC-OSD (support of BCV and SRDF), DRV, Concurrent Copy, XCS, XCS-TIME, and HIPLEX-OP.
                                                                                                    (Back to Table of Contents)

A1.1.4 Which general requirements must be fulfilled in order to be able to introduce HIPLEX
       AF?
A high availability cluster based on HIPLEX AF consists of at least two different systems which are connected by HIPLEX
MSCF, and which (in the current implementation) are connected to a common Shared Public Volume Set (SPVS). In
addition, the subsystem JV must be available.
When high availability is to be introduced on one hardware system, partitions on SX models or VM2000 must be used.
                                                                                                    (Back to Table of Contents)

A1.1.5 Can different versions of BS2000/OSD and/or different versions of HIPLEX AF be
installed on the nodes of a cluster?
The version of HIPLEX AF must be the same on each node of the cluster. For the BS2000/OSD versions, there are no
restrictions besides the ones mentioned in section A1.1.6.
                                                                                                          (Back to Table of Contents)

A1.1.6 Which BS2000/OSD versions and hardware models are supported?
For HIPLEX AF V2 and BS2000/OSD as of V3.0: no restrictions
For HIPLEX AF V3 and BS2000/OSD as of V4.0: not for SR2000.
                                                                                                          (Back to Table of Contents)

A1.1.7 What is the maximum number of nodes in a HIPLEX AF cluster?
At present, the maximum number of nodes is 16 if VM2000 is not used.
In a virtual HIPLEX (see section A1.1.2), however, for example for /390, the maximum number of nodes is restricted to
15 due to firmware limitations. Depending on the model of the machine, it may be further restricted to 7 (see chapter 2.4
of the VM2000 V7.0A manual under http://manuals.fujitsu-siemens.com/servers/bs2_man/man_us.htm).
                                                                                                          (Back to Table of Contents)

A1.1.8 Can HIPLEX AF distinguish between an actual system failure and a bogus failure?
In case of a bogus failure, the production system is still alive. Such a failure should therefore be ignored; an example for
this may be a failure of the connections to the monitored system.
HIPLEX AF uses HIPLEX MSCF to detect a failure in a monitored system. To make the information more reliable,
HIPLEX MSCF uses different connection paths and different connection variants, which have to be simultaneously dis-
turbed to assert that a failure occurred in the monitored system.
                                                                                                          (Back to Table of Contents)

A1.1.9 What is the amount of performance deterioration when a system is part of a HIPLEX
       AF cluster compared to a single system?
In regular operation, monitoring of the nodes in a HIPLEX AF cluster does not hamper the system's overall performance.
A perceptible degradation of the system's performance occurs only during the phase in which HIPLEX AF monitoring is
started.
In a virtual HIPLEX (see section A1.1.2), every virtual CPU of the production system that is declared in a guest system
leads to a performance deterioration of at most 1 %.
                                                                                                          (Back to Table of Contents)

A1.1.10     What are the advantages of the HDR-package (High availability and Disaster
       Recovery) and how can one order it?
The High availability and Disaster Recovery (HDR) package comprises (among other topics relating to disaster recovery)
   A considerable acceleration when switching the peripheral devices from the previous work to a standby system,
   Performance improvements in SDF/SDF-P,
   An optimization of the monitoring of the job variables, and
   Internal optimizations in HIPLEX AF.
The HDR package is spread over the following product versions:
   BS2000/OSD V5.0,
   HIPLEX AF V3.0,
   SDF V4.5A, and
   SDF-P V2.2A.
The HDR package leads to significant performance improvements in case of a fail-over. The corresponding figures can
be obtained using the demonstration tool for estimating fail-over times under
http://bs2mark/Emkahtm/aktuel/april2002/HV-Tabelle-Version-3.03.xls.

                                                                                                                        (Back to Table of Contents)

A1.1.11      What are the advantages of the HIPLEX-SDI solution (Software Distribution
       and Installation) and how can one order it?
The HIPLEX-SDI solution helps to minimize the downtimes for planned and unplanned interruptions (see ref. 11).
   By far the main reason for planned interruptions is the installation of software corrections and new product versions
    in a system. By using a functionality of the installation monitor IMON as of its version V2.6, the HIPLEX-SDI solution
    provides a rapid and safe means for distributing these software assets between several systems.
     The distribution of software can be performed automatically.
    It can be used in native systems as well as in systems running under VM2000.
    In systems running under VM2000, the real memory assigned to a guest system and the exclusive access to VM
     devices can be automatically reconfigured between the guest systems of a VM2000 server when a fail-over has
     taken place.
The HIPLEX-SDI solution is spread over the following product versions:
   HIPLEX AF as of V3.0B,
   HIPLEX MSCF as of V5.0,
   IMON as of V2.6, and
   Where applicable VM2000 as of V7.0.
A prerequisite for the HIPLEX-SDI solution is BS2000/OSD as of V5.0. The customer release date of HIPLEX-SDI is
coupled to the correction package I/2003; its order number is U12383-C999.
                                                                                                                        (Back to Table of Contents)

A1.1.12            What is the purpose of SDF procedures (wizards)?
In analogy to wizards, SDF procedures provide a menu-driven interface for adjusting a configuration to special customer
needs. They offer sound functionality by providing interfaces to all system functions.
                                                                                                                        (Back to Table of Contents)



A1.2 Sales-related Questions / Central Support by Fujitsu Siemens Computers
A1.2.1 Where can one find a list of the HIPLEX AF order units?
See HIPLEX AF (BS2000/OSD) - Order Units, http://vil.mch.fsc.net/vil/oec/vil/ms000000/ms732/ms742/ms749.htm.
                                                                                                                        (Back to Table of Contents)

A1.2.2 How many installations of HIPLEX AF roughly exist at present?
At present, there exist about 15 installations worldwide.
                                                                                                                        (Back to Table of Contents)

A1.2.3 Are there success stories associated with the introduction of HIPLEX AF?
Yes; see the stories listed under http://extranet.fujitsu-siemens.com/vil/oec/vil/ms000000/i03439/i03440/i03442.htm.
                                                                                                                        (Back to Table of Contents)

A1.2.4 What sales training is available?
For an individual offer please contact Mr. Fischer (see section 2).
                                                                                                                        (Back to Table of Contents)

A1.2.5 What technical training is available?
See section A1.2.4.
See also the technical information under http://rainbow.mch.fsc.net/.
                                                                                                                        (Back to Table of Contents)

A1.2.6 How does HIPLEX AF compare with the Automatic Restart Manager (ARM) of IBM’s
Parallel Sysplex?
HIPLEX AF is a high availability monitor that provides a number of mutually independent and automatic monitoring
functions for the connected systems, enabling unambiguous and immediate failure detection in order to reduce downtimes
in case of planned and unplanned interruptions, and supporting disaster recovery.
A new software configuration for the operating system and middleware, including version upgrades and correction
updates, may be asynchronously loaded and started on a standby system while the application is still running on the
production system.
A further highlight is the automatic and rapid fail-over of applications that takes place in case of a failure, thus eliminating
the possibility of operator errors.
With this functionality, HIPLEX AF is equivalent to the Automatic Restart Manager (ARM), the fail-over manager in IBM's
Parallel Sysplex.
                                                                                                                         (Back to Table of Contents)

A2       Questions/Answers considering PRIMECLUSTER
A2.1 Technical Questions
A2.1.1 Where can one find a description of the PRIMECLUSTER functionality?
A description of PRIMECLUSTER’s functionality and structure can be found under
http://extranet.fujitsu-siemens.com/com/products_supply/unix/cluster/pcl/_pcl.

Sales information is available in VIL; search for PRIMECLUSTER.
For a PRIMECLUSTER 4 (Solaris) overview see        http://extranet2.fujitsu-siemens.com/vil/oec/vil/us000000/i11089/i18752/i18801/i18802.htm
                                                                                                                     (Back to Table of Contents)

A2.1.2 Which assets does the PRIMECLUSTER product family comprise?
The PRIMECLUSTER family of software products currently comprises the following components:
   • Reliant Monitor Services RMS 4.0A
     Fail-over manager and split-brain arbitration module (SCON); it monitors failure situations in applications and the
     system, and switches to a standby system
   • Cluster Foundation CF 4.0A
     Common cluster functions like intra-cluster communication (CIP), cluster membership, cluster configuration
     database (Cfreg), distributed lock manager (ELM), quorum, and additional components including the Cluster
     Administrative Java GUI (CAdmin), Shutdown Facility (SF), SNMP subagent and MIBs (SNMP), Reliant Cluster
     Volume Manager (RCVM), and Reliant Cluster File Share (RCFS)
   • Wizard Tools WT 4.0A and Application Wizards AW
     Tools to easily configure and set up fail-over clusters
   • Scalable Internet Services SIS 4.0A
     TCP/IP network workload balancing with several selectable distribution algorithms
   • Parallel Application Services PAS 4.0A
     Support component for parallel databases like OPS/RAC
   • Global File Services GFS 4.0
     Cluster file system sharing component with cluster-wide name space and excellent scalability
   • Global Disk Services GDS 4.0
     Cluster volume manager and mirroring with cluster-wide manageability
   • Global Link Services GLS 4.0
     Network multipath driver providing continuous transmission in case of network path or card failures.
Please refer to the product-specific Data Sheets and Product Facts of the respective components for a description of
their features and deployment hints.
The following packaging bundles of the components above are available to simplify ordering:

     Packaging Bundle                       CF   RMS   PAS   SIS   WT   AW   GFS   GDS   GLS
     PRIMECLUSTER Enterprise Edition        ✓    ✓     ✓     ✓     ✓    ✓    ✓     ✓     ✓
     PRIMECLUSTER HA Server                 ✓    ✓                 ✓    ✓    ✓     ✓     ✓
     PRIMECLUSTER Scalability Server        ✓                ✓
     PRIMECLUSTER Parallel Server           ✓          ✓                           ✓     ✓
     PRIMECLUSTER Wizard Kit                                            ✓

The following packages are available for upgrading from previous versions of PRIMECLUSTER and Reliant Cluster:
     Package                                CF   RMS   PAS   SIS   WT   AW   GFS   GDS   GLS
     PRIMECLUSTER RMS Package               ✓    ✓                 ✓    ✓
     PRIMECLUSTER PAS Package               ✓          ✓
     PRIMECLUSTER SIS Package               ✓                ✓
     PRIMECLUSTER GFDLS
     (GFS, GDS, GLS) Package                                                 ✓     ✓     ✓


These products are described on the PRIMECLUSTER home page                http://www.fsc-usa.com/indexfiles/pcl_index.shtml.

For an overview of PRIMECLUSTER 4 (Solaris) see
http://extranet2.fujitsu-siemens.com/vil/oec/vil/us000000/i11089/i18752/i18801/i18802.htm
                                                                                                                  (Back to Table of Contents)

A2.1.3 Which are the basic components in a PRIMECLUSTER cluster?
Every fail-over cluster with PRIMECLUSTER needs at least the components CF, RMS, RMS Wizard Tools and CAdmin.
                                                                                                                  (Back to Table of Contents)

A2.1.4 Which are further useful components for enriching a PRIMECLUSTER cluster?
PAS for parallel application services clusters supports parallel database systems like Oracle RAC.
PRIMECLUSTER also supports TCP/IP based workload balancing by its component SIS. In addition to workload
balancing, SIS supports a certain degree of availability clustering by distributing the workload of a failed node to the
remaining intact nodes. SIS can be combined with the fail-over cluster monitor to provide the capability to monitor the
SIS nodes and applications and, in case of a failure, to provide local recovery or switching to a redundant standby system.
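To illustrate the principle (a conceptual sketch only; it is not SIS code, and the simple round-robin policy stands in for SIS's selectable distribution algorithms):

    # Conceptual sketch of TCP/IP workload distribution with node-failure
    # takeover: requests are spread over the intact nodes, and when a node
    # fails its share is redistributed among the survivors.
    import itertools

    class Balancer:
        def __init__(self, nodes):
            self.nodes = set(nodes)                        # nodes currently intact
            self._cycle = itertools.cycle(sorted(self.nodes))

        def node_failed(self, node):
            """Remove a failed node; its future requests go to the survivors."""
            self.nodes.discard(node)
            self._cycle = itertools.cycle(sorted(self.nodes))

        def dispatch(self):
            """Pick the next intact node (round-robin distribution)."""
            if not self.nodes:
                raise RuntimeError("no intact nodes left in the cluster")
            return next(self._cycle)

    balancer = Balancer(["node1", "node2", "node3"])
    print(balancer.dispatch())      # e.g. node1
    balancer.node_failed("node1")   # node1's workload is redistributed
    print(balancer.dispatch())      # from now on only node2/node3 are used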
The Global Disk Services and Global File Services provide cluster-wide data management capabilities. The Global Link
Services allow workload-balancing and high availability of local network connections.
                                                                                                                  (Back to Table of Contents)

A2.1.5 Which general requirements must be fulfilled in order to be able to introduce
PRIMECLUSTER?
A high availability cluster based on PRIMECLUSTER consists of at least two different systems that are connected by a
redundant private Cluster Interconnect (in most cases based on Ethernet). A means for one cluster node to eliminate
another cluster node must be available to ensure the integrity of the common data storage of a cluster in case of node
or interconnect failures. PRIMECLUSTER supports many different devices for this purpose (e.g. cluster console, RCI
network (both not for PRIMERGY), Network Power Switch and Remote Power Switch).
Multi-hosted disks or other storage devices connected to the cluster nodes are not required by PRIMECLUSTER, but
they are frequently part of a PRIMECLUSTER cluster because, in most cases, applications share data
across the cluster.
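The role of the redundant Cluster Interconnect can be shown with a small sketch (hypothetical code, not PRIMECLUSTER's implementation; the timeout value is an arbitrary assumption): a peer is only suspected to have failed when heartbeats are missing on all redundant paths, so the failure of a single path does not trigger a fail-over.

    import time

    HEARTBEAT_TIMEOUT = 10.0   # seconds; an arbitrary value for illustration

    class PeerMonitor:
        """Tracks heartbeats from one peer node over redundant interconnects."""

        def __init__(self, interconnects):
            # last heartbeat timestamp per redundant interconnect path
            self.last_seen = {ic: time.monotonic() for ic in interconnects}

        def heartbeat(self, interconnect):
            """Record an incoming heartbeat on one of the redundant paths."""
            self.last_seen[interconnect] = time.monotonic()

        def peer_suspected(self):
            """True only if every redundant path has timed out."""
            now = time.monotonic()
            return all(now - t > HEARTBEAT_TIMEOUT for t in self.last_seen.values())

    monitor = PeerMonitor(["eth1", "eth2"])   # two redundant interconnects
    monitor.heartbeat("eth2")                 # one intact path is sufficient
    print(monitor.peer_suspected())           # False

Note that such a suspicion alone must not trigger the fail-over; the node elimination described above ensures data integrity first (see also section A2.1.9).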
                                                                                                                  (Back to Table of Contents)

A2.1.6 Does FSC offer high availability solutions for standard applications in a
PRIMECLUSTER cluster?
Yes. To speed up the implementation of high availability projects and to eliminate the risk of errors, Fujitsu Siemens
Computers has defined standard configurations for mySAP solutions based on the PRIMECLUSTER HA Server, different
PRIMEPOWER models as well as Symmetrix and FibreCAT systems (PRIMECLUSTER Enterprise Server for mySAP
Solutions). The definition comprises well-honed concepts for power supply, data backup, disaster recovery, network
infrastructure and a SAN-based I/O subsystem.
The resulting configurations have been subjected to extensive global tests with regard to installation and configuration
as well as failure and disaster test scenarios. This integration activity is complemented by a tailor-made service offer,
pre-installation of the entire configuration at the factory as well as specific installation and operation instructions.
                                                                                                                  (Back to Table of Contents)

A2.1.7 Can different versions of the operating system and/or different versions of
PRIMECLUSTER be installed on the nodes of a cluster?
All elements of the PRIMECLUSTER product family must have an identical version on each node of the high availability
cluster. The version of the operating system must also be identical on each node of the cluster.
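A minimal sketch of such a consistency check (the inventory data is a hypothetical input; how the versions are collected from the nodes is outside the scope of the sketch):

    def find_version_mismatches(inventory):
        """inventory maps node name -> {component: version}.
        Returns the components whose versions differ between nodes."""
        nodes = sorted(inventory)
        reference = inventory[nodes[0]]
        mismatches = {}
        for node in nodes[1:]:
            for component, version in inventory[node].items():
                if reference.get(component) != version:
                    mismatches.setdefault(component, {})[node] = version
        return mismatches

    inventory = {
        "node1": {"Solaris": "8", "CF": "4.0A", "RMS": "4.0A"},
        "node2": {"Solaris": "8", "CF": "4.0A", "RMS": "4.0A"},
    }
    assert find_version_mismatches(inventory) == {}   # cluster is consistent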
                                                                                                                  (Back to Table of Contents)

A2.1.8 What is the maximum number of nodes in a PRIMECLUSTER cluster?
Technically, the maximum number of nodes is limited to 64.
Due to testing restrictions, clusters of up to 4 nodes are generally released; anything beyond a 4-node cluster requires a
special release.
                                                                                                                (Back to Table of Contents)

A2.1.9 Can PRIMECLUSTER distinguish between an actual system failure and a bogus failure?
In case of a bogus failure, the production system is still alive. Such a failure should therefore be ignored; examples
include a failure of the connections to the monitored system or a failure in the Cluster Interconnect.
Monitoring of systems within a PRIMECLUSTER cluster is realized by a node detection algorithm, which is based on the
Event-Notification-Service (ENS) of CF (see section A2.1.2). A failure in the monitored system is only asserted after a
configurable shutdown agent has explicitly eliminated this system.
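The rule can be summarized in a small sketch (an illustration of the behavior described above, not the CF/ENS implementation; the shutdown-agent objects and their eliminate() method are hypothetical):

    def handle_lost_contact(node, shutdown_agents):
        """Assert a node failure only after a shutdown agent has eliminated the node."""
        for agent in shutdown_agents:
            if agent.eliminate(node):      # e.g. power-off via a remote power switch
                return "node failed"       # now it is safe to fail applications over
        # No agent could eliminate the node: it may still be running (bogus
        # failure, e.g. a broken interconnect), so shared data must not be
        # taken over on its behalf.
        return "bogus failure suspected"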
                                                                                                                (Back to Table of Contents)

A2.1.10 What is the amount of performance deterioration when a system is part of a
PRIMECLUSTER cluster compared to a single system?
The amount of performance deterioration depends on the actual configuration, especially on the number of objects for
which monitoring has been established and on the length of the polling intervals.
In standard configurations, the performance degradation is negligible.
                                                                                                                (Back to Table of Contents)

A2.1.11 What is the purpose of RMS Wizard Tools and Application Wizards?
The RMS Wizard Tools and Application Wizards provide a menu-driven interface for adjusting predefined
PRIMECLUSTER configurations to special customer needs; these configurations are approved by FSC EP SQ.
1.   They provide ease of use, because customers no longer have to deal with RMS or cluster internals. The Application
     Wizards incorporate much application know-how, which otherwise would have to be reinvented over and over
     again.
2.   The fact that these configurations are approved by FSC ensures a high degree of stability.
3.   They improve maintainability, because these configurations are identical on several customer sites.
                                                                                                                (Back to Table of Contents)



A2.2 Sales-related Questions / Central Support by Fujitsu Siemens Computers
A2.2.1 Where can one find a list of the PRIMECLUSTER order units?
See PRIMECLUSTER (Solaris) - Order Units, http://extranet2.fujitsu-siemens.com/vil/oec/vil/us000000/i11089/i18752/i18801/i18803.htm.
                                                                                                                (Back to Table of Contents)

A2.2.2 How many installations of PRIMECLUSTER/Reliant Cluster roughly exist at present?
About 1000 installations.
                                                                                                                (Back to Table of Contents)

A2.2.3 Are there success stories associated with the introduction of PRIMECLUSTER?
Yes; see, for example,
   • the real-time linking of the IT centers of the L-Bank (click http://extranet.fujitsu-siemens.com/ and search for “success and
     PRIMECLUSTER and L-Bank”),
   • the clustering of 2 PRIMEPOWER servers using PRIMECLUSTER at Southwest Airlines, U.S. (click
     http://www.fujitsu.com/ and search for “success and PRIMEPOWER and Southwest”), or
   • the installation of a highly available PRIMECLUSTER configuration with PRIMERGY servers and Linux under
     http://www.fujitsu-siemens.com/sap/pdf/cs_sbs-hr.pdf.
                                                                                                                (Back to Table of Contents)

A2.2.4 What sales training is available?
Sales and technical classes for PRIMECLUSTER are provided by the FSC and SBS training centers in Fuerth and
Paderborn; see http://www.sbs.de, clicking the buttons Services & Solutions, Consulting & Training, Training & Services,
and Online Seminar Program.
Training and TOIs for CC consultants, system engineers and sales staff are organized by Unix CIS, see
http://extranet.fujitsu-siemens.com/com/consulting/UNIX/products/PC.htm.
                                                                                                                (Back to Table of Contents)
A2.2.5 What technical training is available?
See section A2.2.4.
                                                                                                       (Back to Table of Contents)

A2.2.6 How does PRIMECLUSTER compare with other high availability clusters (e.g. SUN
cluster, VERITAS cluster)?
For information in this context please contact Mr. Jensen (see section 2).
                                                                                                       (Back to Table of Contents)


A3       Questions/Answers considering MS Cluster Service MS CS
A3.1 Technical Questions
A3.1.1 Where can one find descriptions of the PRIMERGY High Availability for MS Windows?
Different documents are available in the FSC Extranet and Internet. For URLs please refer to section 3 of this document,
from ref. 19) onwards.
                                                                                                       (Back to Table of Contents)

A3.1.2 Which basic components form an MS CS cluster on PRIMERGY?
Hardware:
   • Two or more PRIMERGY servers
   • As cluster interconnect: at least one 100 Mbit or Gbit Ethernet controller in each cluster node
   • Internal system disk for the OS, optionally mirrored by using RAID controllers
   • External disk subsystem, attached via SCSI or FC, providing support for multi-initiator SCSI (multi-hosted SCSI).

System Software:
MS Windows 2000 Advanced Server or Windows 2000 Datacenter Server.
MS Windows .NET 2003 Enterprise Server and Windows .NET 2003 Datacenter Server will be the corresponding
successor products in 2003.

Application Software:
Needs to be cluster-released by the ISV (provision of the relevant cluster DLL).
                                                                                                       (Back to Table of Contents)

A3.1.3 Which are further useful components for enriching a PRIMERGY MS CS-based cluster?
   • ServerView SW suite for server/cluster management, including RemoteView software and hardware,
   • PRIMERGY MultiPath for support of redundant FC paths to the SAN/storage subsystem,
   • PRIMERGY DuplexDataManager for data mirroring between sites for disaster-tolerant data storage (includes
     MultiPath),
   • MS Network Load Balancing (NLB) Service for support of TCP/IP based workload balancing between servers
     running instances of the same application, like web servers (NLB is an integral part of Windows 2000 Advanced
     Server and Datacenter Server),
   • FibreCAT or EMC Symmetrix storage subsystems for subsystem-based, automatic local or remote data mirroring,
     and
   • use of LAN adapter teaming for enhanced availability of the interconnect/heartbeat, e.g. Intel PROset II.

                                                                                                     (Back to Table of Contents)

A3.1.4 Which general requirements have to be fulfilled in order to be able to introduce, market and deploy MS CS?
Regarding HW and system software requirements, see above.
Certification: different levels of MS HW compatibility tests (HCTs) have to be successfully completed so that a product is
entered into the respective MS HW compatibility list (HCL) and obtains support by MS in cases where needed:
a) all server HW components, including the I/O controllers and their drivers and the storage subsystem, have to be
    component-certified and listed on their respective component HCLs,
b) the server model itself needs the Windows certification, and
c) finally, the entire cluster configuration, including the servers, SAN HW and SW and the storage subsystem, has to be
    cluster-certified and after that obtains the MS Cluster Logo.
                                                                                                     (Back to Table of Contents)

A3.1.5 Does FSC offer high availability solutions for standard applications in a PRIMERGY
MS CS-based cluster?
The following is a short list of market-relevant applications that are supported on MS CS-based PRIMERGY clusters:
   • SAP R/3 on MS Windows NT4 or Windows 2000
   • MS Exchange 2000
   • MS SQL Server 2000
   • MS Internet Information Server
   • MS BizTalk Server
   • Oracle 8i DB, Oracle 9i DB
   • Siebel SW
   • Baan SW.
Worldwide, several thousand ISV application SW products have been released and are available for use with MS CS.
                                                                                                     (Back to Table of Contents)

A3.1.6 Can different versions of the operating system and/or different versions of the Cluster Manager be installed on
the nodes of a cluster?
The version numbers of the Windows OS, MS CS and the applied Windows Service Pack need to be identical on all
cluster nodes.
                                                                                                     (Back to Table of Contents)

A3.1.7 What is the maximum number of nodes in a PRIMERGY MS CS cluster?
a) 2 nodes in MS Windows 2000 Advanced Server (8 nodes under Windows .NET 2003 Enterprise Server as successor
in 2003)
b) Up to 4 nodes in MS Windows 2000 Datacenter Server (8 nodes under Windows .NET 2003 Datacenter Server as
successor in 2003)
c) Up to 32 nodes within a cluster based on MS Network Load Balancing Service (NLB).
                                                                                                     (Back to Table of Contents)

A3.1.8 What is the amount of performance deterioration when a PRIMERGY server is part of
an MS CS cluster?
Normally, monitoring of the MS CS nodes does not significantly degrade the server performance. Depending on the
number of nodes and on the number of resource groups to be monitored by the Cluster Manager, the performance
deterioration is typically around 3 %.
                                                                                                     (Back to Table of Contents)

A3.1.9 What is the purpose of the PRIMERGY BCC configurations (Business Critical Computing) and which
advantages do they have?
With its PRIMERGY BCC configurations, FSC offers pre-tested highly available configurations that aim at specific
application areas:
   • disaster-tolerant MS Exchange servers,
   • web server farms for MS Windows and Linux, and
   • Terminal Server farms based on MS Windows and Citrix MetaFrame.
The advantage for sales and end users is that all these configurations are pre-tested and pre-configured, and that sizing
guides and comprehensive FSC installation guides are available, ensuring that the customer gets an easy-to-install,
reliable cluster configuration for the above application environments. For information about PRIMERGY BCC please see
the URLs in section 3.
                                                                                                                       (Back to Table of Contents)



A3.2 Sales-related Questions / Central Support by Fujitsu Siemens Computers
A3.2.1 Where can one find order information for the two MS operating systems that support
MS CS?
The order information can be found either in the FSC price list, or in the Solution Library, for the URL please refer to
section 3.
                                                                                                                       (Back to Table of Contents)

A3.2.2 How many cluster installations of PRIMERGY with MS CS roughly exist at present?
Several thousand. The exact figure is difficult to determine, as customers also purchase the OS including MS CS via
alternate sales channels and not only via FSC.
                                                                                                                       (Back to Table of Contents)

A3.2.3 Are there success stories associated with the introduction of PRIMERGY MS CS
clusters?
Yes; see an example where the user runs SQL Server 2000 and SAP in a highly available Windows 2000 cluster
environment at http://212.52.235.22/popup.cfm?id=102.
                                                                                                                       (Back to Table of Contents)

A3.2.4 What sales training is available?
A comprehensive range of sales-oriented training is available in the FSC Extranet at
http://my.fsc.net/PortalC_DE/DesktopDefault.aspx?tabid=2434.
                                                                                                                       (Back to Table of Contents)

A3.2.5 What technical training is available?
See section A3.2.4.




Published by department:
Dr. Hanns-Helmuth Deubler
Phone: ++49 89 636 47644
Fax:   ++49 89 636 49974
hanns-helmuth.deubler@fujitsu-siemens.com
http://www.fujitsu-siemens.com/bs2000

Extranet: http://extranet.fujitsu-siemens.com/bs2000

Delivery subject to availability, specifications subject to change without notice, correction of errors and omissions
excepted. All conditions quoted (TCs) are recommended cost prices in EURO excl. VAT (unless stated otherwise in the
text). All hardware and software names used are brand names and/or trademarks of their respective holders.
Copyright © Fujitsu Siemens Computers, 01/2003

								