HBase at Hadoop World NYC

Shared by: ryanobjc
Posted: 10/2/2009
HBase, Hadoop World NYC
Ryan Rawson, Stumbleupon.com, su.pr
Jonathan Gray, Streamy.com
Friday, October 2, 2009

A presentation in 2 parts

Part 1

About Me
• Ryan Rawson
• Senior Software Developer @ Stumbleupon
• HBase committer, core contributor

Stumbleupon
• Uses HBase in production
• Behind features of our su.pr service
• More later

Adventures with MySQL
• Scaling MySQL hard, Oracle expensive (and hard)
• Machine cost goes up faster than speed
• Turn off all relational features to scale
• Turn off secondary (!) indexes too! (!!)

MySQL problems cont.
• Tables can be a problem at sizes as low as 500GB
• Hard to read data quickly at these sizes
• Future doesn’t look so bright as we contemplate 10x sizes
• MySQL master becomes a problem...

Limitations of masters
• What if your write load is greater than a single machine can handle?
• All slaves must have same write capacity as master (can’t cheap out on slaves)
• Single point of failure, no easy failover
• Can (sort of) solve this with sharding...

Sharding

Sharding problems
• Requires either a hashing function or mapping table to determine shard
• Data access code becomes complex
• What if shard sizes become too large...

Resharding!

What about schema changes?
• What about schema changes or migrations?
• MySQL not your friend here
• Only gets harder with more data

HBase to the rescue
• Clustered, commodity(ish) hardware
• Mostly schema-less
• Dynamic distribution
• Spreads writes out over the cluster

What is HBase?
• HBase is an open-source distributed database, inspired by Google’s Bigtable
• Part of the Hadoop ecosystem
• Layers on HDFS for storage
• Native connections to map reduce

HBase storage model
• Column-oriented database
• Column name is arbitrary data, can have a large, variable number of columns per row
• Rows stored in sorted order
• Can random read and write

Tables
• Table is split into roughly equal sized “regions”
• Each region is a contiguous range of keys, [start, end)
• Regions split as they grow, thus dynamically adjusting to your data set

Server architecture
• Similar to HDFS:
  ‣ Master = Namenode (ish)
  ‣ Regionserver = Datanode (ish)
• Often run these alongside each other!

Server Architecture 2
• But not quite the same, HBase stores state in HDFS
• HDFS provides robust, machine-independent data storage across machines, insulating against failure
• Master and Regionserver fairly stateless

Region assignment
• Each region from every table is assigned to a Regionserver
• The master is responsible for assignment and noticing if (when!) regionservers go down

Master Duties
• When machines fail, move regions from affected machines to others
• When regions split, move regions to balance cluster
• Could move regions to respond to load
• Can run multiple backup masters

What Master does NOT do
• Does not handle any write requests (not a DB master!)
• Does not handle location finding requests
• Not involved in the read/write path!
• Generally does very little most of the time

Distributed coordination
• To manage master election and server availability we use ZooKeeper
• Set up as a cluster, provides distributed coordination primitives
• An excellent tool for building cluster management systems

Scaling HBase
• Add more machines to scale
• Base model (Bigtable) scales past 1000TB
• No inherent reason why HBase couldn’t

What to store in HBase?
• Maybe not your raw log data...
• ... but the results of processing it with Hadoop!
• By storing the refined version in HBase, can keep up with huge data demands and serve to your website

HBase & Hadoop
• Provides a real time, structured storage layer that integrates on your existing Hadoop clusters
• Provides “out of the box” hookups to map reduce
• Uses the same loved (or hated) management model as Hadoop

HBase @

Stumbleupon & HBase
• Started investigating the field in Jan ’09
• Looked at 3 top (at the time) choices:
  ‣ Cassandra
  ‣ Hypertable
  ‣ HBase
Notes: Cassandra didn’t work, didn’t like data model; Hypertable fast, but community and project viability a concern (no major users beyond Zvents); HBase local and good community

Stumbleupon & HBase
• Picked HBase:
  ‣ Community
  ‣ Features
  ‣ Map-reduce, cascading, etc
• Now highly involved and invested

su.pr marketing
• “Su.pr is the only URL shortener that also helps your content get discovered!
  Every Su.pr URL exposes your content to StumbleUpon's nearly 8 million users!”

su.pr tech features
• Real time stats
  ‣ Done directly in HBase
• In depth stats
  ‣ Use cascading, map reduce and put results in HBase

su.pr web access
• Using thrift gateway, php code accesses HBase
• No additional caching other than what HBase provides

Large data storage
• Over 9 billion rows and 1300 GB in HBase
• Can map reduce a 700GB table in ~ 20 min
• That is about 6 million rows/sec
• Scales to 2x that speed on 2x the hardware

Micro read benches
• Single reads are 1-10ms depending on disk seeks and caching
• Scans can return hundreds of rows in dozens of ms

Serial read speeds
• A small table
• A bigger table
• (removed printlns from the code)

Deployment considerations
• ZooKeeper requires IO to complete ops
• Consider hosting on dedicated machines
• Namenode and HBase master can co-exist

What to put on your nodes
• Regionserver requires 2-4 cores and 3GB+
• Can’t run HDFS, HBase, maps, reduces on a 2 core system
• On my 8 core systems I run datanode, regionserver, 2 maps, 2 reduces

Garbage collection
• GC tuning becomes important.
• Quick tip: use CMS, use -Xmx4000m
• Interested in G1 (if it ever stops crashing)

Batch and interactive
• These may not be compatible
• Latency goes up with heavy batch load
• May need to use 2 clusters to ensure responsive website

Part 2

HBase @ Streamy
• History of Data
• RDBMS Issues
• HBase to the Rescue
• Streamy Today and Tomorrow
• Future of HBase

About Me
• Co-Founder and CTO of Streamy.com
• HBase Committer
• Migrated Streamy from RDBMS to HBase and Hadoop in June 2008

History of Data: The Prototype
• Streamy 1.0 built on PostgreSQL
  ‣ All of the bells and whistles
• Powered by single low-spec node
  ‣ 8 core / 8 GB / 2TB / $4k
Functionally powerful, woefully slow

History of Data: The Alpha
• Streamy 1.5 built on optimized PostgreSQL
  ‣ Remove bells and whistles, add partitioning
• Powered by high-powered master node
  ‣ 16 core / 64 GB / 15x146GB 15k RPM / $40k
Less powerful, still slow...
Insanely expensive

History of Data: The Beta
• Streamy 2.0 built entirely on HBase
  ‣ Custom caches, query engines, and API
• Powered by 10 low-spec nodes
  ‣ 4 core / 4GB / 1TB / $10k for entire cluster
Less functional but fast, scalable, and cheap

RDBMS Issues
• Poor disk usage patterns
• Black box query engine
• Write speed degrades with table size
• Transactions/MVCC unnecessary overhead
• Expensive

The Read Problem
• View 30 newest unread stories from blogs
  ‣ Not RDBMS friendly, no early-out
  ‣ PL/Python heap-merge hack helped
  ‣ We knew what to do but DB didn’t listen

The Write Problem
• Rapidly growing items table
  ‣ Crawl index from 1k to 100k feeds
  ‣ Indexes, static content, dynamic statistics
  ‣ Solutions are imperfect

RDBMS Conclusions
• Enormous functionality and flexibility
  ‣ But you throw it out the door at scale
• Stripped down RDBMS still not attractive
• Turned entire team into DBAs
• Gets in the way of domain-specific optimizations

What We Wanted
• Transparent partitioning
• Transparent distribution
• Fast random writes
• Good data locality
• Fast random reads

What We Got
• Transparent partitioning: Regions
• Transparent distribution: RegionServers
• Fast random writes: MemStore
• Good data locality: Column Families
• Fast random reads: HBase 0.20

What Else We Got
• Transparent replication: HDFS
• High availability: No SPOF
• MapReduce: Input/OutputFormats
• Versioning: Column Versions
• Fast Sequential Reads: Scanners

HBase @ Streamy Today
• All data stored in HBase
• Additional caching of hot data
• Query and indexing engines
• MapReduce crawling and analytics
• ZooKeeper/Katta/Lucene

HBase @ Streamy Tomorrow
• Thumbnail media server
• Slave replication for Backup/DR
• More Cascading
• Better Katta integration
• Realtime MapReduce

HBase on a Budget
• HBase works on cheap nodes
  ‣ But you need a cluster (5+ nodes)
  ‣ $10k cluster has 10X capacity of $40k node
• Multiple instances on a single cluster
• 24/7 clusters + bandwidth != EC2

Lessons Learned
• Layer of abstraction helps tremendously
  ‣ Internal Streamy Data API
  ‣ Storage of serialized types
• Schema design is about reads not writes
• What’s good for HBase is good for Streamy

What’s Next for HBase
• Inter-cluster / Inter-DC replication
  ‣ Slave and Multi-Master
• Master rewrite, more ZooKeeper
• Batch operations, HDFS uploader
• No more data loss
  ‣ Need HDFS appends

HBase Information
• Home Page http://hbase.org
• Wiki http://wiki.apache.org/hadoop/Hbase
• Twitter http://twitter.com/hbase
• Freenode IRC #hbase
• Mailing List hbase-user@hadoop.apache.org
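
The "Sharding problems" and "Resharding!" slides in Part 1 can be illustrated with a small sketch. It assumes the simplest hashing scheme (stable hash of the key, modulo the shard count), which the slides do not specify. The names `shard_for` and `fraction_moved`, the key format, and the shard counts are hypothetical. It counts how many keys land on a different shard when a fifth shard is added to a four-shard setup.

```python
import hashlib


def shard_for(key: str, num_shards: int) -> int:
    """Map a row key to a shard (assumed scheme: stable hash mod shard count)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def fraction_moved(keys, old_shards: int, new_shards: int) -> float:
    """Fraction of keys whose shard changes when the shard count changes."""
    moved = sum(
        1 for k in keys if shard_for(k, old_shards) != shard_for(k, new_shards)
    )
    return moved / len(keys)


# Hypothetical keys; any stable string identifier would do.
keys = [f"user:{i}" for i in range(10_000)]
moved = fraction_moved(keys, old_shards=4, new_shards=5)
print(f"{moved:.0%} of keys change shard when going from 4 to 5 shards")
```

Under this scheme a key stays put only when its hash gives the same remainder for both shard counts, so most of the data has to be copied during a reshard. A mapping table avoids that, at the cost of the extra lookup and bookkeeping the slide mentions.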
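
The "Tables" slide describes regions as contiguous, half-open key ranges [start, end) over rows kept in sorted order. A minimal sketch of how a row key could be matched to its region under that description follows. It is not HBase's client code: the `Region` class, `find_region`, the split points, and the server names are all invented for illustration, and an empty string is assumed to mean "open-ended".

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    start: str   # inclusive start key; "" means the beginning of the table
    end: str     # exclusive end key; "" means the end of the table
    server: str  # hypothetical regionserver name


def find_region(regions, row_key: str) -> Region:
    """Return the region whose [start, end) range contains row_key.

    Assumes `regions` is sorted by start key and covers the whole key space.
    """
    starts = [r.start for r in regions]
    # Pick the last region whose start key is <= row_key.
    return regions[bisect_right(starts, row_key) - 1]


regions = [
    Region("", "g", "rs1"),
    Region("g", "p", "rs2"),
    Region("p", "", "rs3"),
]
print(find_region(regions, "hbase").server)
```

Because the ranges are sorted and non-overlapping, a binary search over the start keys is enough, and splitting a region only inserts one new boundary rather than remapping existing keys.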
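
"The Read Problem" slide mentions a heap-merge with early-out for fetching the 30 newest unread stories. The sketch below shows the general technique using Python's `heapq.merge`; it is not Streamy's implementation. The data layout (one list per feed, newest first, as `(timestamp, story_id)` tuples), the function name `newest_unread`, and the sample data are assumptions.

```python
import heapq
from itertools import islice


def newest_unread(feeds, read_ids, limit=30):
    """Lazily merge per-feed story lists and stop after `limit` unread stories.

    Each feed is assumed to be a list of (timestamp, story_id) tuples already
    sorted newest first, so the merge never reads further than it has to.
    """
    merged = heapq.merge(*feeds, key=lambda story: story[0], reverse=True)
    unread = (story for story in merged if story[1] not in read_ids)
    return list(islice(unread, limit))


# Hypothetical data: two blogs, timestamps as plain integers.
feeds = [
    [(105, "a3"), (101, "a2"), (90, "a1")],
    [(103, "b2"), (95, "b1")],
]
print(newest_unread(feeds, read_ids={"a2"}, limit=3))
```

The early-out comes from the merge being lazy: once `limit` unread stories have been produced, the remaining entries in every feed are never touched, which is the behavior the slide says the relational query planner would not provide.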
