NoSQL Overview

Relational databases can be used to solve all kinds of problems.
But they are maybe not the right solution to all problems.
New applications (often web-centric) have new requirements:
  Huge amounts of data (terabytes or petabytes)
  Simple data structure (often)
  Must scale well
NoSQL = "Not only SQL". A better name would be "Not only relational".
A mixture of ideas, concepts, tools, products, ...

Examples, Lots of Data

Twitter: 95 million tweets per day (1100 per second) must be stored. Only simple queries (based on primary key, no joins). Used MySQL earlier, now Cassandra (and more).
Facebook: 500 million active users, half of them log in every day. Each user has 130 friends (on average). 30 billion pieces of content (links, texts, blog posts, photo albums) accessed every day. (Cassandra)
LinkedIn: More than 90 million members, one new member every second. Two billion people searches per year. (Voldemort)

Per Holm (Per.Holm@cs.lth.se), Database Technology, 2010/11
Buy a Bigger Computer Instead?

Big computers can store lots of data ...
Big computers are expensive
And you have to pay big license fees for a big Oracle installation
Even big computers can fail
Better to use a lot of cheap commodity PCs
And replicate data so one or a few failing nodes don't matter
Design the storage system so it can be expanded (during uptime) by adding more machines

Is It New?

Yes: the term NoSQL is from 2009.
But NoSQL databases have been around longer than that.
And before anything NoSQL there were object-oriented databases, hierarchical databases, network databases, ...
Different Types of Data Stores

Key–Value: A distributed hash table. Arbitrary key type; the value is a "blob". The application program must be aware of the structure of the value. (Amazon Dynamo)
Document: As key–value, but the value is a document, and the DBMS knows that. (MongoDB, CouchDB)
Columns: The value is a set of columns, like in a relational database, but they do not necessarily follow a schema. (Google BigTable, Cassandra)
Graph: The database is a set of nodes with properties, and a set of connections between the nodes (with properties). (Neo4J)

The CAP Theorem

The CAP Theorem says that you cannot have all three of Consistency, Availability, and Partition tolerance.
Strong Consistency: all clients see the same version of the data, even on updates to the dataset, e.g. by means of the two-phase commit protocol.
High Availability: all clients can always find at least one copy of the requested data, even if some of the machines in a cluster are down.
Partition tolerance: the total system keeps its characteristics even when being deployed on different servers, transparent to the client.
Many NoSQL systems sacrifice consistency and go for BASE (next slide).
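The four data-store types listed above differ mainly in how much structure the DBMS can see in the value. A minimal sketch (illustrative Python literals, not any specific product's API) of the same user record shaped for each kind of store:

```python
# Key-value: the value is an opaque blob; only the application knows its layout.
kv_value = b'\x00\x01alice\x00\x02...'  # serialized bytes, opaque to the DBMS

# Document: the value is a structured document the DBMS can inspect and index.
doc_value = {"name": "alice", "friends": ["bob", "carol"], "logins": 42}

# Columns: the value is a set of named columns, with no fixed schema;
# another row may carry a different set of columns.
col_value = {"name": "alice", "email": "alice@example.com"}

# Graph: nodes with properties, plus connections (edges) with properties.
nodes = {1: {"name": "alice"}, 2: {"name": "bob"}}
edges = [(1, 2, {"type": "friend", "since": 2009})]
```

The field names (`name`, `friends`, and so on) are made up for the illustration; the point is only where the structure lives: in the application (key–value), in the DBMS (document, columns), or in the topology itself (graph).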
No ACID, BASE Instead

Transactions are no longer guaranteed to be ACID (atomic, consistent, isolated, durable). BASE is almost the opposite: basically available, soft state, eventually consistent.
BASE is optimistic and accepts that the database consistency is in a state of flux. "Eventual consistency" (actually more like durability) means that inaccurate reads are permitted as long as the data is synchronized "eventually." (Compare with DNS: it takes time for changes to propagate.)

Amazon Dynamo

Dynamo was developed by Amazon.
First used for the shopping cart, now also for other applications.
Goal: always available, writes never fail.
Key–value store. Records are replicated on several computers.
Read & write: only single records.
Operations:
  get(key) returns a value or a list of several versions of a value. The application must solve problems with inconsistencies.
  put(key, value) writes a value. The key is hashed, and the hash code determines on which nodes the value should be stored ("consistent hashing").
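The "consistent hashing" step behind put(key, value) can be sketched in a few lines. This is a minimal single-machine illustration of the idea, not Dynamo's published implementation: nodes and keys are hashed onto a ring, and a key is stored on the next few distinct nodes clockwise from its position.

```python
import hashlib
from bisect import bisect_right

class ConsistentHashRing:
    """Minimal consistent-hashing sketch: each node sits at one point on a
    ring, and a key is replicated on the first `replicas` nodes clockwise
    from the key's own hash position."""

    def __init__(self, nodes, replicas=3):
        self.replicas = replicas
        # Place each node on the ring at a point determined by its hash.
        self.ring = sorted((self._hash(n), n) for n in nodes)

    @staticmethod
    def _hash(key):
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def nodes_for(self, key):
        """The nodes responsible for a key: its clockwise successors."""
        points = [p for p, _ in self.ring]
        i = bisect_right(points, self._hash(key)) % len(self.ring)
        return [self.ring[(i + k) % len(self.ring)][1]
                for k in range(self.replicas)]

ring = ConsistentHashRing(["node-a", "node-b", "node-c", "node-d"])
owners = ring.nodes_for("cart:12345")  # the nodes storing this key's replicas
```

The benefit over `hash(key) % number_of_nodes` is that adding or removing one node moves only the keys adjacent to it on the ring, which is what allows expansion during uptime. (Dynamo additionally uses virtual nodes and versioning, which this sketch omits.)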
Cassandra

First developed by Facebook, now a top-level Apache project.
Key–value & replication like in Dynamo.
But the value has structure: it contains columns (which are stored in column families, which may be stored in super columns). A column has a name, a value, and a timestamp. Columns may be sorted on value or on timestamp.
Inbox search at Facebook: 50+ TB of data stored on 150 machines.
  Term search: the user id is the key. Words in messages are the super columns, message ids become the columns.
  Interaction search: the user id is the key. Recipient ids are the super columns, message ids become the columns.

Computing Model

Not only storage should be distributed, but also computing. It is difficult to write parallel programs ... MapReduce is a new programming model.
All data is treated as sets of key–value pairs. The key is a string, the value is a blob.
All programs are sequences of alternating map and reduce functions.
  The map function processes a key–value pair and generates one or more intermediate key–value pairs.
  The reduce function merges all intermediate values associated with the same intermediate key.
Map functions run in parallel on many computers, as do reduce functions.
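The term-search layout described in the Cassandra slide above can be modeled with plain nested dicts. This is a sketch of the data model only, not Cassandra's actual API; the row key, word, and message-id values are invented for the illustration.

```python
# Row key = user id; super columns = words; columns = message ids.
# Each column carries a name, a value, and a timestamp.

def column(name, value, timestamp):
    return {"name": name, "value": value, "timestamp": timestamp}

inbox_search = {
    "user:42": {                          # row key: the user id
        "hello": [                        # super column: a word in a message
            column("msg:1001", "", 1.0),  # one column per message id
            column("msg:1007", "", 2.0),
        ],
        "nosql": [
            column("msg:1007", "", 2.0),
        ],
    }
}

def messages_containing(user_id, word):
    """Message ids in the user's inbox that contain the given word."""
    return [c["name"] for c in inbox_search.get(user_id, {}).get(word, [])]
```

Note that a term lookup is a single key–value read on the user id, followed by picking one super column: exactly the "simple queries, no joins" access pattern the earlier slides describe.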
MapReduce Example

Compute word counts within a set of documents.

map(String key, String value):
    // key: document name
    // value: document text
    for each word w in value:
        EmitIntermediate(w, 1)

reduce(String key, Iterator values):
    // key: a word
    // values: a list of counts
    result = 0
    for each v in values:
        result += v
    Emit(result)
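The pseudocode above can be run as a small single-machine simulation. The sketch below (illustrative names, no real framework) adds the one piece the pseudocode hides: the "shuffle" step, where the framework groups all intermediate pairs by key before handing them to reduce.

```python
from collections import defaultdict

def map_fn(doc_name, doc_text):
    """Map: emit an intermediate (word, 1) pair for each word."""
    for word in doc_text.split():
        yield (word, 1)

def reduce_fn(word, counts):
    """Reduce: sum all intermediate counts for one word."""
    return (word, sum(counts))

def map_reduce(documents):
    # Shuffle: group intermediate values by key, as the framework would
    # (in a real cluster this grouping happens across machines).
    groups = defaultdict(list)
    for name, text in documents.items():
        for key, value in map_fn(name, text):
            groups[key].append(value)
    return dict(reduce_fn(k, vs) for k, vs in groups.items())

counts = map_reduce({"d1": "to be or not to be", "d2": "not to worry"})
# counts == {"to": 3, "be": 2, "or": 1, "not": 2, "worry": 1}
```

In a real deployment, map_fn instances run in parallel over input splits and reduce_fn instances run in parallel per key group; the program's logic is unchanged.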
MapReduce Data Flow
MapReduce Figures (From Google)

Execution on a cluster of 1800 machines: 2 × 2 GHz processors, 4 GB memory, 320 GB disk, Gigabit Ethernet. The figures are from the original MapReduce paper, 2004.
Grep: Scan through 10^10 100-byte records, searching for a three-character pattern. 150 seconds, including 60 seconds startup overhead.
Sort: Sort 10^10 100-byte records. 15 minutes.
Google: Google web search uses an index which is created with MapReduce.

MapReduce vs Traditional Databases

Data has no explicit schema.
  The map and reduce functions must "understand" the data format.
  Users have to write procedural code to interpret and process the data.
  A step backwards?
  Higher-level programming languages for MapReduce: Pig, Hive.
Data is stored in files in a distributed file system.
All processing is sort based: this makes the programming easier, but may be a performance concern.