VVLDB: The Concept
VVLDB = Very Very Large Database:
New concept, or a change to the VLDB concept?
Petabyte tables with hundreds of billions of rows
Complex table structures
Non-uniform physical data representation of petabyte tables
Well-defined subsets (index and/or partition) on tables: small (~10,000) -> medium (~300,000) -> large (~1,000,000)
Undefined subsets: very large (~1,000,000,000) -> very very large (~100,000,000,000)
Complex GROUP BYs and sorts
Multiple categories of queries running concurrently (transaction research, analytics, data
Concurrent inserts and selects against the same tables
24x7 operation with very limited maintenance windows
SLAs are very strict
VVLDB: Problems
Smart partitioning: hash, expression, … -> hybrid multi-level partitioning
Smart partition manipulation: detach / attach partition online
Hash join on petabyte tables?
Performance tuning does not work:
Adaptive and buffer-pool-aware query optimization?
System-category-aware query optimization?
Optimizer efficiency?
Backup/Restore does not work:
Data replication is not a substitute for backup: data corruption, application errors,
Smart backup/restore tied to smart data partitioning!
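The partitioning bullets above can be illustrated with a small in-memory sketch. It is not tied to any database product: the `PartitionedTable` class, the month-plus-hash scheme, and all names are assumptions made for illustration. Level 1 is an expression partition (the month of the transaction date) and level 2 is a hash of the account id; detaching a month moves all of its hash buckets out of the live table as a unit, which is also the kind of unit a partition-aware backup could work on.

```python
from __future__ import annotations

import zlib
from dataclasses import dataclass, field
from datetime import date


def partition_key(txn_date: date, account_id: str, hash_buckets: int) -> tuple[str, int]:
    """Level 1: expression on the date (its month). Level 2: hash of the account id."""
    month = f"{txn_date.year:04d}-{txn_date.month:02d}"
    bucket = zlib.crc32(account_id.encode("utf-8")) % hash_buckets
    return month, bucket


@dataclass
class PartitionedTable:
    """Hypothetical two-level partitioned table held in memory."""

    hash_buckets: int = 4
    attached: dict = field(default_factory=dict)  # partitions visible to queries
    detached: dict = field(default_factory=dict)  # partitions taken offline

    def insert(self, txn_date: date, account_id: str, amount: float) -> None:
        key = partition_key(txn_date, account_id, self.hash_buckets)
        self.attached.setdefault(key, []).append((txn_date, account_id, amount))

    def detach_month(self, month: str) -> list:
        """Move every hash bucket of one month out of the live table."""
        keys = [k for k in self.attached if k[0] == month]
        for k in keys:
            self.detached.setdefault(k, []).extend(self.attached.pop(k))
        return keys

    def attach_month(self, month: str) -> list:
        """Bring a previously detached month back into the live table."""
        keys = [k for k in self.detached if k[0] == month]
        for k in keys:
            self.attached.setdefault(k, []).extend(self.detached.pop(k))
        return keys

    def scan(self, month: str, account_id: str | None = None) -> list:
        """Read only the partitions that can contain matching rows."""
        if account_id is None:
            keys = [k for k in self.attached if k[0] == month]
        else:
            bucket = zlib.crc32(account_id.encode("utf-8")) % self.hash_buckets
            keys = [(month, bucket)]
        return [
            row
            for k in keys
            for row in self.attached.get(k, [])
            if account_id is None or row[1] == account_id
        ]
```

A query for one account in one month touches a single partition, while other months stay untouched when one month is detached for maintenance.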
VVLDB: Problems
A single database system cannot hold a combination of an ODS (> 1 PB) and a cross-functional multi-subject DW (> 200 TB): it is impractical
Data Abstraction Layer: federated tables partitioned across multiple database systems!
A federated database is easier to maintain and back up, and availability is higher!
Federated Database Performance = Single Database System Performance!!!
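A minimal sketch of the data abstraction layer idea, under stated assumptions: three in-memory SQLite databases stand in for separate database systems, and the `FederatedTable` class, the `txn` table, and the hash routing rule are all hypothetical. It shows only how one logical table can be routed and queried across members; it says nothing about the performance claim above.

```python
import sqlite3
import zlib


class FederatedTable:
    """Hypothetical abstraction layer: one logical table spread over several databases."""

    def __init__(self, members):
        self.members = members
        for db in members:
            db.execute("CREATE TABLE IF NOT EXISTS txn (account_id TEXT, amount REAL)")

    def _member_for(self, account_id):
        # Route by hash of the key so each account lives on exactly one member.
        index = zlib.crc32(account_id.encode("utf-8")) % len(self.members)
        return self.members[index]

    def insert(self, account_id, amount):
        db = self._member_for(account_id)
        db.execute("INSERT INTO txn VALUES (?, ?)", (account_id, amount))
        db.commit()

    def total_for(self, account_id):
        # Single-key query: only the owning member is contacted.
        db = self._member_for(account_id)
        row = db.execute(
            "SELECT COALESCE(SUM(amount), 0) FROM txn WHERE account_id = ?",
            (account_id,),
        ).fetchone()
        return row[0]

    def grand_total(self):
        # Cross-member query: run the same aggregate everywhere, then combine.
        return sum(
            db.execute("SELECT COALESCE(SUM(amount), 0) FROM txn").fetchone()[0]
            for db in self.members
        )


# Stand-ins for independent database systems.
members = [sqlite3.connect(":memory:") for _ in range(3)]
table = FederatedTable(members)
for account, amount in [("A1", 10.0), ("B2", 20.0), ("A1", 5.0), ("C3", 1.5)]:
    table.insert(account, amount)
```

Because each member holds a disjoint slice of the table, a member can be backed up or taken offline without touching the others, which is the maintenance argument the slide makes.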