C-Store: An Introduction to Berkeley DB
School of Software
SUN YAT-SEN UNIVERSITY
Mar. 13, 2009
Overview of Berkeley DB
Short for the Berkeley Database
An open-source, embedded transactional data management system
A key/value store
As a library that is linked with an application
Hides data management from end-user
Scales from Bytes to Petabytes
Runs on everything from cell phones to large servers
Berkeley DB : Examples of Applications
Store all user and service account information and
Berkeley DB has high reliability and high performance
Berkeley DB: A Brief History (1)
Began life in 1991 as a dynamic linear hashing implementation
designed to replace the historic UNIX database libraries: dbm, ndbm and hsearch
Released as a library in 4.4BSD in 1992.
db-1.85 == Hash + B-Tree
The package LIBTP
Transactional Implementation of db-1.85
A research prototype that was never released.
Berkeley DB: A Brief History (2)
In 1996, Seltzer and Bostic started Sleepycat Software
for use in the Netscape browser
Berkeley DB 2.0, Released in 1997
the first commercial release
Berkeley DB 3.0, Released in 1999
Transformed into an object-oriented, handle-and-method
style API.
Berkeley DB: A Brief History (3)
Berkeley DB 4.0, Released in 2001
Single-Master, Multiple-Reader Replication
Replicas can take over for a failed master
Read-only replicas can reduce master load
Similar ideas are adopted in C-Store.
In Feb. 2006, Oracle acquired Sleepycat.
Sleepycat Public License:
a Dual License
Is open source
And may be downloaded and used freely
However, redistribution requires that
Either the package using Berkeley DB be
released as open source
Or the distributors obtain a commercial
license from Sleepycat (now Oracle, which acquired
Sleepycat in Feb. 2006).
Berkeley DB: Product Family Today
The original Berkeley DB library
Berkeley DB XML
Atop the library
Berkeley DB Java Edition
100% pure Java implementation
Berkeley DB :
Product Family Architecture
Berkeley DB: The Design Philosophy
Provide mechanisms without specifying policies
For example, Berkeley DB is abstracted as a
store of <key, value> pairs.
Both keys and values are opaque byte-strings.
i.e., Berkeley DB has no schema,
And the application that embeds Berkeley DB is
responsible for imposing its own schema on the data.
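A minimal sketch of what "the application imposes its own schema" can look like. It is written in Python with a plain dict standing in for the key/value store; the record layout, the function names, and the key format are all assumptions made for illustration, not part of Berkeley DB.

```python
import struct

# Hypothetical application-level schema: a fixed-layout account record.
# The store itself sees only bytes; this layout is an illustrative assumption.
ACCOUNT = struct.Struct("<I16sd")  # user id, 16-byte name field, balance


def encode_account(user_id, name, balance):
    """Turn application fields into an opaque byte-string value."""
    return ACCOUNT.pack(user_id, name.encode("utf-8"), balance)


def decode_account(raw):
    """Interpret an opaque value using the application's own schema."""
    user_id, name, balance = ACCOUNT.unpack(raw)
    return user_id, name.rstrip(b"\x00").decode("utf-8"), balance


store = {}  # stand-in for a key/value database: bytes -> bytes
store[b"user:42"] = encode_account(42, "alice", 12.5)
record = decode_account(store[b"user:42"])
```

The store never learns that the value holds three fields; only `encode_account` and `decode_account` know the layout.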
Advantages of <key, value> pairs
An application is free to store data in
whatever form is most natural to it.
Objects (like structures in C language)
Rows in Oracle, SQL Server
Columns in C-Store
Different data formats can be stored in the same database
As long as the application understands how to
interpret the data items.
Indexing Key Values
A record-number-based index implemented atop the B-Tree index
Put, store key/value pairs
Get, retrieve key/value pairs
Delete, remove key/value pairs
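The three operations can be tried out with a small sketch. It uses Python's standard-library `dbm.dumb` module purely as a stand-in key/value store, so the calls below are Python's mapping interface, not Berkeley DB's API; the file name and keys are made up for the example.

```python
import dbm.dumb
import os
import tempfile

# dbm.dumb is Python's pure-Python key/value store. It is used here only as a
# stand-in to illustrate put/get/delete; it is not Berkeley DB's API.
workdir = tempfile.mkdtemp()
db = dbm.dumb.open(os.path.join(workdir, "example"), "c")

db[b"fruit"] = b"apple"      # put: store a key/value pair
value = db[b"fruit"]         # get: retrieve the value stored under a key
del db[b"fruit"]             # delete: remove the key/value pair
missing = db.get(b"fruit")   # the key is gone, so this yields None
db.close()
```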
How Do Applications Access Key/Value Pairs?
Through handles on databases
Similar to relational tables
Or through cursor handles
Representing a specific place within a database
Used for iteration, i.e., fetch a key/value pair each time
Databases are implemented atop the OS file system
A file may contain one or more databases.
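The difference between a database handle and a cursor handle can be sketched with a toy in-memory model. `ToyDatabase` and `ToyCursor` are hypothetical names invented for this illustration; the sketch only shows the idea of a cursor as a remembered position that yields one pair per call.

```python
import bisect


class ToyDatabase:
    """Hypothetical in-memory database handle that keeps keys sorted."""

    def __init__(self):
        self._keys = []
        self._values = {}

    def put(self, key, value):
        if key not in self._values:
            bisect.insort(self._keys, key)
        self._values[key] = value

    def cursor(self):
        return ToyCursor(self)


class ToyCursor:
    """Hypothetical cursor: remembers a position within one database."""

    def __init__(self, db):
        self._db = db
        self._pos = -1  # positioned before the first pair

    def next(self):
        """Advance one position; return (key, value), or None at the end."""
        if self._pos + 1 >= len(self._db._keys):
            return None
        self._pos += 1
        key = self._db._keys[self._pos]
        return key, self._db._values[key]


db = ToyDatabase()
for k, v in [(b"b", b"2"), (b"a", b"1"), (b"c", b"3")]:
    db.put(k, v)

cur = db.cursor()
pairs = []
while (pair := cur.next()) is not None:  # fetch one key/value pair each time
    pairs.append(pair)
```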
Berkeley DB Replication:
A Log-Shipping System
A Replication Group
A single Master
One or more Read-Only Replicas.
All write operations must be processed
transactionally by the Master
The Master sends log records to each of the Replicas.
The Replicas apply log records only when
they receive a transaction commit record.
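The log-shipping rule above (replicas buffer log records and apply them only on commit) can be sketched as a toy model. The `Master` and `Replica` classes and the tuple-shaped log records are assumptions made for this illustration; they do not reflect Berkeley DB's actual replication interfaces or log format.

```python
class Replica:
    """Hypothetical read-only replica that buffers log records per transaction."""

    def __init__(self):
        self.data = {}
        self._pending = {}  # txn id -> buffered (key, value) writes

    def receive(self, record):
        kind, txn = record[0], record[1]
        if kind == "write":
            _, _, key, value = record
            self._pending.setdefault(txn, []).append((key, value))
        elif kind == "commit":
            # Apply buffered writes only once the commit record arrives.
            for key, value in self._pending.pop(txn, []):
                self.data[key] = value


class Master:
    """Hypothetical master: handles all writes and ships its log to replicas."""

    def __init__(self, replicas):
        self.data = {}
        self.replicas = replicas
        self._pending = {}
        self._next_txn = 1

    def _ship(self, record):
        for replica in self.replicas:
            replica.receive(record)

    def begin(self):
        txn = self._next_txn
        self._next_txn += 1
        self._pending[txn] = []
        return txn

    def write(self, txn, key, value):
        self._pending[txn].append((key, value))
        self._ship(("write", txn, key, value))

    def commit(self, txn):
        for key, value in self._pending.pop(txn):
            self.data[key] = value
        self._ship(("commit", txn))


replica = Replica()
master = Master([replica])
txn = master.begin()
master.write(txn, b"k", b"v")
before_commit = dict(replica.data)  # nothing applied on the replica yet
master.commit(txn)
after_commit = dict(replica.data)   # the write becomes visible after commit
```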
Berkeley DB: Configuration Flexibility
Configuration flexibility is critical
Due to a wide range of applications
Compile Time Configuration
Feature Set Selection
Compile Time Configuration
Option 1: small footprint build
For use in a cell phone
The compiled library contains only the B-Tree index,
Omits replication, cryptography, statistics
collection, etc. The library is about 0.5 MB.
Option 2: higher concurrency locking
For use in a Data Center
Lock-Based Concurrency Control
Feature Set Selection
1. The Data Store (DS) feature set
Most similar to the original db-1.85 library
Good for temporary data storage
2. The Concurrent Data Store (CDS) feature set
Acquires a single lock per API invocation
Good for read-mostly applications
3. The Transactional Data Store (TDS) feature set
Currently the most widely used feature set
Acquires a single lock per page
4. The High Availability (HA) feature set
Can continue running even after a site fails.
Index Selection and Tuning
Applications can select the page size in an index
Trading off Durability and Performance
No-force log write
Extreme case: applications can run completely in memory
Trading off Two-Phase Locking and
Multiversion Concurrency Control.
Note: C-Store adopts similar ideas for high
Challenges of Berkeley DB’s Flexibility
Demands flexibility from Berkeley DB's designers
Demands flexibility from application developers
Any Dream? Any Idea?
Some Research with Me?