In the construction of data processing systems, the database is one of the most
expensive layers. Two major factors account for the high cost.
1. Conventional database systems use a scale-up configuration for improving
2. High availability is required because failure of the database often results in
suspension of an entire system service.
To address this cost problem, a growing number of systems adopt database
management system (DBMS) cluster technology.
For the first factor, a DBMS cluster provides scaling with a number of inexpensive
server machines. In the scale-up configuration, high-specification hardware is
required even if you start with a small system, and components need to be replaced
as the service requirement expands. In a scale-out configuration, you can extend the
system by adding cheaper server machines, which is expected to improve the
Return on Investment (ROI).
The second factor can be addressed by multiplexed servers.
In addition to factors 1 and 2 above, storage devices and DBMS software licenses
also add to the cost of a clustered server.
One way to reduce the cost of DBMS software licenses is to use open source software
(OSS).
Several OSS-based DBMS cluster products have recently become available. If these
products are applicable, they can cut the cost of developing and operating the data
Criteria for determining which system requirements a specific OSS-based DBMS
cluster product is suited to will therefore serve as a significant tool for
developing systems and promoting the introduction of OSS into corporate systems.
Against this background, this project intends to create a guideline that
will serve as the criteria for evaluating DBMS clusters. It also aims to provide a
procedure for quantitative assessment and to apply that procedure to real
OSS-based DBMS cluster products.
The evaluation criteria include items that are significant and useful for actual design
and operation of a DBMS cluster. However, not every assessment item was
applied in every case, because some items do not fit the
construction or limitations of certain DBMS clusters.
We have also prepared some functions that conventional evaluation tools lack, by
modifying existing features or creating new ones from scratch, to reproduce a real-life
environment for evaluating OSS-based DBMS cluster products.
We plan to publish the results of the evaluation to promote OSS use.
The evaluation procedure will be refined to serve as a guideline
for similar efforts in the future.
1.2 Range of the Guideline
There are many kinds of DBMS cluster systems. Not all of them address
both of the high-cost factors listed above.
The types of DBMS clusters are discussed in 2.1 “Types of DBMS Clusters.” This
project narrowed the target to clusters that address both of the above
factors and are available for general use. Narrowing the target in this way enabled us
to set an appropriate range for the evaluation criteria.
The evaluation therefore targeted OSS-based DBMS cluster products that satisfy the
above conditions. The specific implementation of each product is covered in the
description of each criterion.
The evaluation covered major DBMS cluster products that were available as open
source products. We selected products that supported an availability function for
high reliability, offered practical functions for actual operations, and could be
expected to have steady community activity.
On this basis, we selected PGCluster and MySQL Cluster.
Table 1.2-1: Document Configuration
Chapter  Contents
1        Overall description of the document (General
2        DBMS Cluster Evaluation Criteria
3        Criteria for PGCluster
4        pgbench (Description of evaluation tool)
5        Results of Evaluating PGCluster
6        Criteria for MySQL Cluster
7        Results of Evaluating MySQL Cluster
1.3 Overview of Evaluation
1.3.1 Performance and Reliability Criteria in a Cluster Environment
We defined a guideline that can be used as the criteria for evaluating DBMS cluster
products and applied that guideline for qualitative evaluation. The evaluation criteria
include items that are significant and useful for actual design and operation of a
DBMS cluster.
The criteria were defined to cover the items that could form the system
specification. We consider the system specification to consist of six groups of
1. Basic Operation
The criteria and viewpoints for each requirement are described in Chapter 2 “DBMS
Cluster Evaluation Criteria.” Chapter 3 covers the evaluation items and policy for
PGCluster that are based on the general criteria, and Chapter 6 covers those for
MySQL Cluster.
1.3.2 Creation of Evaluation Tools
We created tools for evaluating DBMS cluster products, both from scratch and by modifying
existing tools.
1.3.2.1 PGCluster: Modifying pgbench
We developed an evaluation tool that measures the performance characteristics by
changing the contents of transactions (e.g., ratio of reference to update transactions).
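The idea of varying the ratio of reference to update transactions can be illustrated with a short Python sketch. It is not the modified pgbench itself, whose implementation is not shown in this document; the names `build_workload`, `count_mix`, and `read_ratio` are hypothetical and chosen only for illustration.

```python
import random


def build_workload(total, read_ratio, seed=0):
    """Return a list of 'read'/'update' labels for `total` transactions.

    `read_ratio` is the probability that a transaction is a reference
    (read-only) transaction; the rest are updates. This is an assumed
    model of a transaction mix, not the behavior of pgbench.
    """
    if not 0.0 <= read_ratio <= 1.0:
        raise ValueError("read_ratio must be between 0 and 1")
    rng = random.Random(seed)  # fixed seed so a run can be repeated
    return ["read" if rng.random() < read_ratio else "update"
            for _ in range(total)]


def count_mix(workload):
    """Count how many transactions of each kind the workload contains."""
    reads = sum(1 for kind in workload if kind == "read")
    return {"read": reads, "update": len(workload) - reads}
```

Running the same measurement with several values of `read_ratio` (for example 1.0, 0.5, and 0.0) would show how throughput changes as the share of update transactions grows, which is the kind of performance characteristic the tool is meant to expose.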
1.3.2.2 MySQL Cluster: Creating mBench
We created a benchmark tool that issues SQL queries to several SQL nodes
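As a rough illustration of issuing queries to several SQL nodes, the following Python sketch assigns queries to nodes in round-robin order. How mBench actually distributes queries is not stated here, so round-robin is an assumption, and `assign_round_robin` and the node names are hypothetical.

```python
from itertools import cycle


def assign_round_robin(queries, nodes):
    """Map each node name to the list of queries it would receive.

    Queries are handed out cyclically: first query to the first node,
    second query to the second node, and so on. This is an assumed
    distribution policy, not the documented behavior of mBench.
    """
    if not nodes:
        raise ValueError("at least one SQL node is required")
    plan = {node: [] for node in nodes}
    for query, node in zip(queries, cycle(nodes)):
        plan[node].append(query)
    return plan
```

For example, three queries spread over two hypothetical nodes `sql1` and `sql2` would give `sql1` the first and third query and `sql2` the second.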
1.3.3 Evaluating Open Source DBMS
1.3.3.1 Evaluating PGCluster
We evaluated the functions and performance of PGCluster, and defined the
evaluation procedure. We used the modified “pgbench” benchmark tool described in 1.3.2.1.
Database and Related Products: PGCluster 1.0, 1.1, and 1.3
Operating System: Red Hat Enterprise Linux AS release 4
Performance Evaluation Tool: pgbench (modified version)
1.3.3.2 Evaluating MySQL Cluster
We evaluated different MySQL Cluster configurations (e.g., node and network) and
defined the evaluation procedure. We used the mBench benchmark tool we created in
1.3.2.2. The results of the evaluation will be used for planning a preliminary verification
and building the pointing system for the full-scale cluster evaluation.
Database and Related Products: MySQL 4.1 MySQL Cluster
Operating System: MIRACLE LINUX V3.0 and V3.0 for x86-64
Performance Evaluation Tool: mBench