Hadoop Hbase-0.20.2 Performance Evaluation



D. Carstoiu, A. Cernian, A. Olteanu

“Politehnica” University of Bucharest

Spl. Independentei 313, 060042

Bucharest, Romania



Abstract: Hbase is the open source version of BigTable, the distributed storage system developed by Google for managing large volumes of structured data. Hbase emulates most of the functionality provided by BigTable. Like most non SQL database systems, Hbase is written in Java. The purpose of this work is to evaluate the performance of the Hbase-0.20.2 implementation in comparison with that of the Hbase-0.20.0 implementation and, of course, with the performance offered by BigTable. The tests evaluate the random writing and random reading of rows, and how they are affected by the number of servers, the number of column families and the system configuration parameters.

Keywords: Hadoop, Hbase, non SQL database, key, value, Distributed Hash Table (DHT)

I. INTRODUCTION

Many modern applications include a database server serving multiple web servers accessed by many clients. In this case, one may often find that performance is below expectations. In this situation, many consider upgrading the hardware, without taking the database server into consideration. If we take the example of Amazon.com, which runs on an optimized and extended Oracle database, we can consider that SQL technology has reached its maximum point of scalability.

It is normal to take other approaches into account. One good example is Google's approach, which uses BigTable as a semi-structured database that keeps most information from the Internet in cache. Comparing the two approaches leads to the conclusion that traditional SQL database systems, such as Oracle, DB2 and other implementations, are not suitable for a certain class of applications. An approach similar to Google's BigTable was introduced in operating systems around the 1980s through the so-called "hierarchical file system".

Currently, there is a multitude of database systems and many applications using them. Many bottlenecks of these applications are due to the SQL component, which performs very simple tasks in a very complex manner. This fitted the computers of the 1980s, but no longer fits current architectures. Large companies developing SQL-based database management systems rely heavily on hardware to ensure the desired performance. One solution may be to distribute the software over multiple machines, in which case the licensing costs become prohibitive. There is a need for a new approach in which a large increase in performance requires insignificant costs and provides good scalability.

II. RELATED WORK

Several reasons justify the choice of storage solutions based on key-value pairs [1]. Some of them are:

• Many RDBMSs do not ensure decent replication, and the acquisition of a powerful RDBMS leads to excessive licensing costs;
• It is necessary to store large volumes of semi-structured data;
• It is a pretext to deal with new languages such as Erlang;
• Data is stored and accessed most often based on a primary key;
• Complex join operations are not necessary in processing data;
• The volume of data is very large, and managing the error scenarios caused by replication becomes very difficult.

For example, Facebook uses Haystack, which stores a lot of data in one file with an independent index, requiring 1 MB of metadata for 1 GB of data [2]. A number of projects have been developed as alternatives to RDBMSs, some of them more than a key-value store. Each of them has a number of main defining characteristics:

• Implementation language: Java for Voldemort, Cassandra, Hbase; Erlang for Ringo, Kai, Scalaris, Dynomite; C (C++) for Hypertable, ThruDB, MemcacheDB;
• Data model: mostly blob, document oriented or BigTable;
• Fault tolerance based mostly on replication and partitioning.

Some of them offer distributed storage facilities based on key-value pairs with replication facilities. An important issue is the latency with which data is served to populate dynamic pages, especially for web applications. Latency depends on the environment and on whether the required data is present in cache. Generally, we expect data to be available in no more than 10 ms; otherwise a cost analysis is needed to improve performance.

III. NON SQL DATABASE - HBASE

A secure distributed storage system for large amounts of data must meet some important requirements:

• Data placement algorithms;
• Cache management policies to ensure rapid access to data;
• A high degree of reliability for data distributed over hundreds or thousands of nodes;
• Scalability and adequate security measures.

It is known that classical database design involves first defining the schema, and if the application requires any modification during its evolution, the entire database schema has to be redesigned. Data in a database is said to be stored in a structured manner, while a distributed storage system similar to the one proposed by Google through BigTable [3] can store large amounts of semi-structured data without having to redesign the entire schema. In this paper, we try to assess the performance of an open source implementation of BigTable, named Hbase, developed in the Java programming language.

Hbase is an Apache open source project that aims to provide a storage system similar to Bigtable in the Hadoop distributed computing environment. The Hadoop Distributed File System (HDFS) is a distributed file system designed to run on common hardware (commodity computers), characterized by a low implementation cost. Through HDFS, applications that handle large volumes of data can access that data rapidly. An HDFS instance may consist of hundreds or thousands of machines, each keeping parts of data files. In case of failure, the data can be restored automatically. HDFS supports even millions of files in one instance, aggregating a scalable multitude of nodes in the same cluster. The consistency model implemented is the simple write-once-read-many. Processing in an application with large amounts of data is more efficient if executed near where the data is stored. This minimizes network congestion and increases system performance.

HDFS provides interfaces for moving applications closer to where the data is stored. It can be easily ported from one platform to another. From a logical point of view, data in Hbase is organized in tables, rows and columns. Each column can have several versions for the same row key. The data model used by Hbase is similar to the one used by BigTable. Applications keep rows of data in labeled tables, each row having a sorting key and an arbitrary number of columns. Tables are sparse, so the rows of one table can have a variable number of columns. A column name is of the form "family:label", where family and label are arbitrary strings of bytes [4,5]. Upon creation, a table is specified by its set of families, also called "column families". Updates to this set are performed through administrative operations. However, a new label can be used in any write operation, without any previous specification.

Hbase stores "column families" physically grouped on disk, so the items in a given column family have the same read/write characteristics and contain similar data. By default, only one row may be locked at a given moment. Row writes are always atomic, but a single row may also be locked so that both read and write operations can be performed on it atomically. Recent versions allow locking several rows, if the option has been explicitly activated.

Conceptually, a table in Hbase can be thought of as a collection of rows identified by the row key and optionally by timestamp. A column may not have a value for a particular row key.

HDFS has a master/slave architecture. A cluster is composed of a single NameNode, a server that manages the namespace, the files and client access to files [5]. A cluster contains multiple DataNodes, usually one per physical node in the cluster, each managing the storage space attached to that node. Through HDFS, user data is stored in files, which are divided into one or more blocks stored on a set of DataNodes. The NameNode performs operations on the file system, such as opening, closing and renaming files and folders, and mapping blocks to DataNodes.

A DataNode is responsible for serving read and write requests from the file system's clients. A DataNode is also in charge of creating, deleting and replicating blocks according to the instructions received from the NameNode [5]. Both the DataNode and the NameNode are software components designed to run on commodity computers running a Linux operating system. HDFS is written in Java, and any machine supporting Java can run the NameNode and DataNode software. The architecture allows more than one DataNode to run on a machine, but in reality this is rarely used. The existence of a single NameNode simplifies the architecture. It retains all HDFS metadata, and the system is constructed so that user data never passes through the NameNode. Decisions related to the replication of blocks are always taken by the NameNode, which regularly receives from each DataNode in the cluster Heartbeat information and a report on its blocks. The information about a file has the form [5]:

NameNode (File_name, replica_numbers, Id_blocks, ...)

For example, the following information:

/path/part-0,r:2,{1,3},...
/path/part-0,r:3,{2,6,34},...

means that the first entry is stored with 2 replicas in blocks 1 and 3, and the second with 3 replicas in blocks 2, 6 and 34. Replica placement strategies are critical for the performance and availability of HDFS.

The usual replication policy is to place two replicas on machines in the same rack and one replica on a node located in another rack. This policy limits the write traffic between racks, and it relies on the fact that the chance of failure of an entire rack is much lower than the chance of failure of a node. To minimize read latency, HDFS tries to read data from the nearest replica, so if there is a replica hosted in the same rack, it will be preferred. A snapshot of the entire file system namespace and block map is kept in memory. The format is compact enough that a machine with 4 GB of RAM supports a large number of files and directories. Even in applications with a large volume of data, the volume of metadata is not very large, so performance can be very high.

A major drawback of the implementation is that the NameNode is a single point of failure in the cluster structure, and if the machine running the NameNode breaks down, data recovery is difficult. Upon creation, a file stores its data locally until its size exceeds the size of a block. At this point,
NameNode is contacted to insert the file into the file system hierarchy and to allocate data blocks for it. The NameNode answers the client's request with the identity of the DataNode and the destination of the data block, and the client sends the data to the specified DataNode. When the file is closed, the data not yet transferred from the local temporary file is sent to the destination node, and the client informs the NameNode that the file is closed, which completes the file creation transaction.

A possible improvement is to include in the cluster a secondary NameNode that takes over when the primary node has failed. The basic principle is that the secondary node captures a snapshot of the directory structure information, which it can use together with the EditLog file to restore the data structure.

IV. TEST SYSTEM ARCHITECTURE

Tests were performed on Hbase 0.20.2 and on Hadoop 0.20.0 using Java 1.6.0.x, with ssh used to remotely manage the Hadoop daemons. All tables are stored using HDFS. For the tests, a cluster composed of 4 slaves and 1 master was established, to keep compatibility with the tests described in [7]. Each machine has 4 CPU cores at 2 GHz, 2 x 300 GB 7200 RPM SATA drives, 4 GB of RAM and a 1 Gbps network connection, with all nodes under the same switch. Tests were performed on tables with more than 4 million rows, in which keys are represented on 10 digits and the values are randomly generated with a length of 1000 bytes each. The total volume of data depends on the number of column families. MapReduce was used for the assessment.

From the implementation point of view, the Hbase architecture has the following major components [6, 10]:

1. HBaseMaster. HBaseMaster is responsible for assigning regions to HRegionServers. The first region assigned is the ROOT region, which locates all the META regions to be assigned. Each META region maps a number of user regions, which contain the multiple tables that a particular Hbase instance serves. A row in the ROOT and META tables has a size of about 1 KB. By default, a region size is 256 MB, so the ROOT region can map 2.6 x 10^5 META regions, which map a total of 6.9 x 10^10 user regions, approximately 1.8 x 10^19 (2^64) bytes of data. Once all META regions have been assigned, HBaseMaster assigns user regions to HRegionServers, balancing the number of regions served by each.

2. HRegionServer. HRegionServer is responsible for handling client read and write requests.

3. Hbase Client. The Hbase client is responsible for finding the particular HRegionServers that serve the sets of rows of interest. Upon initialization, the Hbase client communicates with HBaseMaster to find the location of the ROOT region. This is the only communication between the client and the master. After the ROOT region is located, the client contacts that region server and scans the ROOT region to find the META region that contains the location of the user region holding the desired range of rows. After locating the user region, the client contacts the region server serving that particular region and issues read or write requests. The client caches this information, so that subsequent requests do not have to go through the entire process. When a region is reassigned, as a result of server failure or for load balancing, the client rescans the META table to determine the new location of the user region. If the META region was reassigned, the client rescans the ROOT region to determine the new location of the META region. If the ROOT region was reassigned, the client contacts the master to determine the new location of the ROOT region and locates the user region by repeating the process described for initialization.

V. RESULTS

First, tests similar to those presented in [7] were performed. In the performance analysis for Hbase version 0.20.2, it seems that a single column family was used for one row. We perform this test as well. The performance obtained is slightly higher, probably due to improvements in the 0.20.2 version compared to the 0.20.0 version used by Zhang [7]. In addition, we tested with a single region server and with 4 region servers. Table I presents the comparison between our results and the results of the tests conducted by Zhang, first for one region server, then for 4 region servers.

The analysis of the experimental results presented in Table I reveals that, under the same test conditions, the performance of Hbase-0.20.0 and Hbase-0.20.2 is roughly similar. A comparison between Hbase-0.20.0 and BigTable is made in [7, 9]. The numbers of random reads for one region server and for 4 region servers are very close when we refer to values per node.

The number of rows written per second in random writes increases insignificantly as the number of nodes increases. The first question is why random reads and random writes do not scale in a similar way as the number of nodes increases. Note that random reads increase approximately proportionally to the number of nodes, while random writes remain approximately the same with a single node as with 4 nodes.

A possible explanation is that by increasing the number of region servers, we have more RAM for the block cache, which makes each region server read blocks from disk more rarely.

TABLE I. SINGLE COLUMN FAMILIES

Type of experiment      One region server   Four region servers   Four region servers       Zhang
                        (rows/s per node)   (rows/s)              (average rows/s per node) (rows/s per node)
Random reads            1296                4608                  1152                      1106
Random writes           9570                11696                 2924                      2834
Initial random writes   8480                10884                 2721                      2689
Scan                    48270               62520                 15630                     15420
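The per-node averages and the scaling behaviour discussed around Table I can be re-derived from the table itself. The short Python sketch below is an illustration only: the dictionary layout and the names TABLE_I, NODES and scaling_summary are assumptions made for this example and are not part of the test harness used in the paper. The numbers are copied from Table I.

```python
# Throughput figures (rows/s) copied from Table I.
# The structure and names below are illustrative assumptions.
TABLE_I = {
    "random reads":          {"one_server": 1296,  "four_servers": 4608},
    "random writes":         {"one_server": 9570,  "four_servers": 11696},
    "initial random writes": {"one_server": 8480,  "four_servers": 10884},
    "scan":                  {"one_server": 48270, "four_servers": 62520},
}
NODES = 4  # number of region servers in the larger configuration


def scaling_summary(table=TABLE_I, nodes=NODES):
    """For each test, return the per-node average on `nodes` servers,
    the speedup over a single server, and the parallel efficiency."""
    summary = {}
    for name, row in table.items():
        per_node = row["four_servers"] / nodes
        speedup = row["four_servers"] / row["one_server"]
        summary[name] = {
            "per_node": per_node,
            "speedup": round(speedup, 2),
            "efficiency": round(speedup / nodes, 2),
        }
    return summary


if __name__ == "__main__":
    for name, s in scaling_summary().items():
        print(f"{name:22s} per-node={s['per_node']:8.0f} "
              f"speedup={s['speedup']:.2f}x efficiency={s['efficiency']:.0%}")
```

With these figures, random reads on four servers come out at roughly 3.6 times the single-server rate, while the write and scan tests gain only about 1.2 to 1.3 times, which is the asymmetry the text sets out to explain.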







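Looking back at the catalog sizing quoted for HBaseMaster in Section IV, the figures for the ROOT and META regions can be checked with a few lines of arithmetic. The Python sketch below is only a worked check under the assumptions stated in the text (catalog rows of about 1 KB, regions of 256 MB, a single ROOT region); the names ROW_SIZE, REGION_SIZE and catalog_capacity are hypothetical and do not come from the Hbase code base.

```python
ROW_SIZE = 1024             # about 1 KB per ROOT/META row (from the text)
REGION_SIZE = 256 * 2**20   # 256 MB default region size (from the text)


def catalog_capacity(row_size=ROW_SIZE, region_size=REGION_SIZE):
    """Return (META regions, user regions, addressable bytes) for a
    two-level catalog with a single ROOT region."""
    rows_per_region = region_size // row_size        # entries one catalog region can hold
    meta_regions = rows_per_region                   # ROOT holds one row per META region
    user_regions = meta_regions * rows_per_region    # each META region holds one row per user region
    total_bytes = user_regions * region_size         # every user region stores up to region_size bytes
    return meta_regions, user_regions, total_bytes


if __name__ == "__main__":
    meta, user, total = catalog_capacity()
    print(f"META regions: {meta:.1e}")
    print(f"user regions: {user:.1e}")
    print(f"addressable bytes: {total:.1e}")
```

Under these assumptions the counts are exact powers of two (2^18 META regions, 2^36 user regions, 2^64 bytes), which matches the rounded values 2.6 x 10^5, 6.9 x 10^10 and 1.8 x 10^19 given in the text.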

The average read time for a row is better than what the disk hardware alone would offer, which is explained by the Block Cache design and the HFile implementation. In contrast, every write operation must access the disk to append to the WAL (write-ahead log). This is why random write performance increases only slightly with the number of servers, compared to random reads.

It should also be noted that each random read operation requires transferring an HFile block from HDFS to a region server, even if only a small part of the transferred information is used [7].

Performance depends heavily on the value of the configuration variable "rows per fetch" [6]. Performance increases as the value of the "rows per fetch" parameter increases. This value should be correlated with the internal memory capacity. The increase in performance comes from significantly reducing the number of RPC calls.

Another question we raised is how performance is affected by changes in the number of "column families" for Hbase-0.20.2, together with the effect of changing the number of region servers (Table II). A similar test was made by Dana [8], without specifying the Hbase version. As is known, starting with Hbase-0.20.0 performance increased through the introduction of HFile (a new file format similar to SSTable), new scanners, a new Block Cache and new compression methods [7].

The tests were performed with a reasonable number of column families. Although the documentation specifies that Hbase can manage a large number of column families, the test conducted by Dana [8] shows that a number of column families close to 1000 leads to an RPC timeout before the operation is performed. On the other hand, in most practical cases, having a few hundred column families is reasonable. Experimentally, we observed that performance does not change significantly as the number of region servers increases. This can be explained by the fact that a single row is read at a time, so increasing the number of servers does not affect performance.

TABLE II. MULTIPLE COLUMN FAMILIES

                        1 Region Server                4 Region Servers
                        (number of column families)    (number of column families)
Type of experiment      1        10       100          1        10       100
Random reads            1296     1628     1676         4608     4569     4621
Random writes           9570     14272    16452        11696    17429    19782
Initial random writes   8480     6428     4572         10884    7846     5674
Scan                    48270    52736    53274        62520    67420    68742

A test that measures the number of rows read per second as the number of rows in the stored table increases shows that performance decreases as the number of rows per table increases.

This behavior can be explained by the fact that, in tables with many rows, there is a smaller chance of finding an arbitrary record in memory without accessing the disk. The performance degradation is less pronounced when the number of servers is increased, because more internal memory is available.

VI. CONCLUSIONS

Hbase-0.20.2 performance is substantially improved over previous versions. Random reads have the worst performance, because each operation requires an HFile block transfer to a region server and only a small part of that information is used. Thus, if each randomly read row requires an HFile block transfer, the ratio between relevant and read information is given by the ratio no_bytes_value / no_bytes_block. Consequently, random read performance will increase when the number of bytes per value is higher. Another way to increase read performance is to increase the file system cache. Similarly, sequential writes are faster than random writes, because fewer RPC packets are used.

Last, but not least, configuration issues are extremely important in terms of performance [7]. We can conclude that Hbase, as an open source alternative to BigTable, is designed for operating clusters with a reasonable number of servers and reasonable data volumes. Performance does not change substantially when increasing the number of servers. In the future, we would like to test on a larger number of servers. At this point, we did not have the possibility to perform the tests with dozens of servers.

It is hard to predict the future of distributed databases at this time, but we believe that research will focus on guaranteeing consistency, improving data distribution strategies, maturing failover and recovery algorithms and optimizing data storage.

REFERENCES

[1] R. Jones, Anti-RDBMS: A list of distributed key-value stores, http://www.metabrew.com/article/anti-rdbms-a-list-of-distributed-key-value-stores/.
[2] J. Sobel, Needle in a Haystack: Efficient Storage of Billions of Photos, http://perspectives.mvdirona.com/2008/06/30/FacebookNeedleInAHaystackEfficientStorageOfBillionsOfPhotos.aspx.
[3] F. Chang, J. Dean, S. Ghemawat et al., Bigtable: A Distributed Storage System for Structured Data, OSDI 2006.
[4] R. Rawson, HBase committer, Hbase, http://docs.thinkfree.com/docs/view.php?dsn=858186
[5] Hbase, www.apache.org/hadoop/Hbase/HbaseArchitecture
[6] A. Khetrapal, V. Ganesh, HBase and Hypertable for large scale distributed storage systems: A Performance evaluation for Open Source BigTable Implementations, http://www.ankurkhetrapal.com/downloads/HypertableHBaseEval2.pdf
[7] A. Rao, S. Zhang, Hbase-0.20.0 Performance Evaluation, http://cloudepr.blogspot.com/2009_08_01_archive.html
[8] K. Dana, Hadoop HBase Performance Evaluation, http://www.cs.duke.edu/~kcd/hadoop/kcd-hadoop-report.pdf
[9] J. Gray, J.-D. Cryans, Hbase goes Realtime, The Hbase presentation at Hadoop Summit 2009.
[10] Hbase-0.20.2 Documentation, http://hadoop.apache.org/hbase/docs/r0.20.2.


