HUG7 HBase 0.20 Intro

Description

Slides from the seventh HBase User Group which took place August 7, 2009 at StumbleUpon in San Francisco

Shared by: Jonathan Gray
posted:
8/11/2009
language:
English
HBase 0.20 Primary Goal
» First ever Performance Release
  1. Random Access Time
  2. Scan Time
  3. Insert Time
» As a random-access store, HBase is well suited to storing and serving data for Web applications
  › But high latency and variability (hundreds of milliseconds to seconds) reduced the usefulness of HBase and required external caching in the past

HBase 0.20 Architecture
» The Guiding Philosophy – Unjavafy Everything!
  › Zero-copy reads
  › Block-based storage, reading, and indexing
  › Drastically reduce Object instantiation
  › Eliminate widespread usage of Trees
  › Sorted merges using Heap structures
  › Fast and intelligent caching with memory-awareness
» Effort Led By…
  › Jonathan Gray and Erik Holstad, Streamy.com
  › Michael Stack, Powerset/Microsoft
  › Ryan Rawson, StumbleUpon

HBase 0.20 Architecture – Storage
» New Key Format – KeyValue
  › Contains only (byte [] buf, int offset, int length)
  › Compact binary format with binary comparators
  › Our “pointer” to keys inside blocks
» New File Format – HFile
  › Originally based on TFile (HADOOP-3315) and BigTable
  › Block-based binary format with a block index
  › Contains any number of Meta blocks
  › Persisted storage of List

HBase 0.20 Architecture – API
» New Query API
  › Put, Get, Scan, Delete operations
  › Extended support for versioning
  › Drastically reduces API size and complexity
  › An API that more closely mirrors implementation
» New Result API and optimized Serialization
  › Result is just a wrapper for KeyValue[]
  › User-friendly Trees are built on-demand, client-side
  › Deserialization allocates a single byte[] for all KVs
  › Zero-copy building, single allocation receiving

HBase 0.20 Architecture – Algorithms
» New Scanners – KeyValueScanner / KeyValueHeap
  › Replace linear sort logic with an encapsulated Heap
  › Abstract the handling of versions, deletes, query params
  › Now capable of processing individual rows with millions of columns and versions
  › Linear (or worse) to Logarithmic, Logarithmic to Constant
» New Block Cache – Concurrent LRU
  › Backed by ConcurrentHashMap
  › LRU eviction with scan-resistance and block priorities
  › Memory-bound using HeapSize interface
  › Non-blocking and unsynchronized LRU map

HBase 0.20 By The Numbers (Uncached)
» Tall Table: 1 Million Rows with a single Column
  › Sequential insert – 24 seconds (0.024 ms/row)
  › Random reads – 1.42 ms/row (average)
  › Full scan – 11 seconds (117 ms/10,000 rows, 0.011 ms/row)
» Wide Table: 1000 Rows with 20,000 Columns each
  › Sequential insert – 312 seconds (312 ms/row)
  › Random reads – 121 ms/row (average)
  › Full scan – 146 seconds (14.6 seconds/100 rows, 146 ms/row)
» Fat Table: 1000 Rows with 10 Columns, 1 MB values
  › Sequential insert – 68 seconds (68 ms/row)
  › Random reads – 56.92 ms/row (average)
  › Full scan – 35 seconds (3.53 seconds/100 rows, 35 ms/row)
Each test yielded more than one region; additional rows have no impact on performance

HBase 0.20 Performance Conclusion
» We surprised even ourselves
  › Random read times similar to those of an RDBMS
    • 20-100 times faster with far less variability
  › Scan times reduced
    • 30 times faster than previous versions
  › Insert times reduced
    • 2-10 times faster with less than half the memory usage
» We improved our performance by more than an order of magnitude in most cases
  › While drastically improving our memory usage and code readability

ZooKeeper Integration

Why?
» It takes 2 minutes to detect a RegionServer’s death
» Clients have to ask the Master for the -ROOT- address
» Managing shared state in HBase is a zoo ;)
» And... the Master is a SPOF!

ZooKeeper?
• A project under Hadoop, started by Yahoo!
• Centralized service for maintaining configuration information, naming, providing distributed synchronization, and group services.
• Highly available when used on an ensemble of machines, typically 5 or more.
• ZK’s data model is a simple namespace with permanent and ephemeral nodes.
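The difference between permanent and ephemeral nodes is easiest to see in a toy model. The sketch below is an in-memory stand-in written for illustration only; it is not the ZooKeeper client API, and the class name `ToyNamespace`, its methods, and the paths are all hypothetical. It mimics two behaviors ZooKeeper does have: creating a node fails if the path already exists, and an ephemeral node disappears when the session that created it ends, which is what makes ephemeral nodes useful for tracking which servers are alive.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Toy in-memory namespace for illustration; NOT the ZooKeeper client API. */
public class ToyNamespace {
    // Marker for nodes that outlive any session (session ids are assumed non-negative).
    private static final long PERMANENT = -1L;

    // path -> owning session id, or PERMANENT
    private final Map<String, Long> nodes = new TreeMap<>();

    /** Creates a node; returns false if the path already exists. */
    public synchronized boolean create(String path, long sessionId, boolean ephemeral) {
        if (nodes.containsKey(path)) {
            return false;
        }
        nodes.put(path, ephemeral ? sessionId : PERMANENT);
        return true;
    }

    public synchronized boolean exists(String path) {
        return nodes.containsKey(path);
    }

    /** Ends a session: every ephemeral node it owns is removed and reported. */
    public synchronized List<String> closeSession(long sessionId) {
        List<String> removed = new ArrayList<>();
        nodes.entrySet().removeIf(e -> {
            if (e.getValue() == sessionId) {
                removed.add(e.getKey());
                return true;
            }
            return false;
        });
        return removed;
    }

    public static void main(String[] args) {
        ToyNamespace ns = new ToyNamespace();
        ns.create("/hbase", 0L, false);             // permanent node (hypothetical path)
        ns.create("/hbase/rs/server-a", 1L, true);  // ephemeral, owned by session 1
        ns.create("/hbase/rs/server-b", 2L, true);  // ephemeral, owned by session 2

        // Session 1 ends (for example, the server died): its node goes away.
        System.out.println(ns.closeSession(1L));             // [/hbase/rs/server-a]
        System.out.println(ns.exists("/hbase/rs/server-a")); // false
        System.out.println(ns.exists("/hbase"));             // true
    }
}
```

In a real ensemble the removal is triggered by session expiry on the server side rather than by an explicit call, and interested clients learn about it through watches; the toy model leaves both out.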
Major Integration Points
» Master address is stored in ZK
» Master election is a race for that lock
» -ROOT- address is also stored in ZK
» Region Servers are all registered in ZK
» The RSs watch the Master’s node
» Backup Masters watch both the Master’s node and a “cluster state” node

What it Changes for You
» Standalone and pseudo-distributed setups:
  › A ZK server that listens on localhost is started for you. It starts and stops with the rest of the cluster.
» Fully-distributed setup:
  › It is possible to keep the managed ZK server, but it must be pointed at a non-local IP/hostname.
  › Better is to run a ZK quorum for higher availability; it can also be used for other purposes.

Fully-distributed setup
» What you have to do with ZK:
  › hbase-env.sh: export HBASE_MANAGES_ZK=true/false
  › hbase-site.xml: set hbase.cluster.distributed to true; also note that hbase.master is deprecated.
  › hbase-site.xml or zoo.cfg: set the ZooKeeper configuration
» You want backup masters?
  › ${HBASE_HOME}/bin/hbase-daemon.sh start master
  › It’s also a good idea to set hbase.master.dns.nameserver and hbase.master.dns.interface so that they bind to the right interface.

New Features from ZK integration in 0.20
» No more SPOF
  › Automatic Master failover
» Rolling upgrades of point releases
» Modify some cluster configuration without full cluster restart

Other 0.20 Goodies
» Binary pretty-print in shell/logs/web
» Increment Column Value
  › Fast, atomic increments
» New REST Server, Stargate
» New MapReduce API
  › Much cleaner and easier to use
  › Uses new Hadoop 0.20 API
  › Accepts a Scan object

What’s next?
» More performance and reliability
  › 0.20 was mostly a RegionServer rewrite
    • But there are still more known bottlenecks left to tackle for 0.21
  › 0.21 will rewrite Master with better ZK integration
» HBase 0.21 Roadmap
  › Decentralized Master responsibilities + More ZK
    • Further capability to modify configurations at run time
    • State sharing via ZK nodes
    • Ephemeral nodes for region ownership
    • Distributed queue for region assignment
  › Language-agnostic, binary RPC
  › Native C/C++ client library
  › Multi-DC Replication
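As a closing illustration of the "sorted merges using Heap structures" idea from the architecture and scanner slides, here is a minimal sketch of a k-way merge over already-sorted inputs. It is not HBase's KeyValueHeap: the class name `HeapMergeSketch` is hypothetical, plain strings stand in for KeyValues, and versions, deletes, and query parameters are ignored. What it shows is why a heap helps: picking the next smallest entry costs O(log k) for k sources, instead of comparing the heads of all k sources on every step.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.PriorityQueue;

/** Hypothetical sketch of a heap-based merge; not HBase's KeyValueHeap. */
public class HeapMergeSketch {

    /** One sorted input together with its current head element. */
    private static final class Source {
        final Iterator<String> it;
        String head;

        Source(Iterator<String> it) {
            this.it = it;
            this.head = it.next(); // callers only pass non-empty inputs
        }
    }

    /** Merges already-sorted lists into one sorted list using a min-heap. */
    public static List<String> merge(List<List<String>> sortedInputs) {
        // The heap orders sources by their current head, smallest first.
        PriorityQueue<Source> heap =
            new PriorityQueue<>((a, b) -> a.head.compareTo(b.head));
        for (List<String> input : sortedInputs) {
            if (!input.isEmpty()) {
                heap.add(new Source(input.iterator()));
            }
        }

        List<String> out = new ArrayList<>();
        while (!heap.isEmpty()) {
            Source smallest = heap.poll();     // O(log k)
            out.add(smallest.head);
            if (smallest.it.hasNext()) {
                smallest.head = smallest.it.next();
                heap.add(smallest);            // re-insert with its new head
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> merged = merge(Arrays.asList(
            Arrays.asList("row1", "row4"),
            Arrays.asList("row2", "row3"),
            Arrays.asList("row5")));
        System.out.println(merged); // [row1, row2, row3, row4, row5]
    }
}
```

A real scanner would produce entries lazily rather than collecting them into a list, and would compare keys with a binary comparator rather than String ordering.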
