Enterprise development (in .NET), Day 7
Databases and
Data Access
Andrey Shchekin
http://ashmind.com
St. Petersburg, 2009
Images © Mattahan Productions, Inc | http://mattahan.deviantart.com/ | http://mattahan.insocada.com/
Magic words
RDBMS
Relational
database
management
system
Database features
1. Consistency
2.Performance
3. Scalability
4.Portability
1. Consistency: Constraints
What
A rich system of built-in rules.
Data is required to conform.
Why
Incorrect data is hard to debug.
Fixing live systems can get messy.
Types of Constraints
Data Types
Checks
Unique Keys
Foreign Keys
Triggers
Assertions
Sample Constraints
CREATE TABLE Users (
Id int NOT NULL PRIMARY KEY,
Login varchar(32) NOT NULL UNIQUE,
Age int NOT NULL CHECK (Age > 17),
InvitedById int NOT NULL REFERENCES Users(Id)
)
Transactions
ACID
Atomicity
Consistency
Isolation
Durability
Transaction workflow
Step 1 Step 2 ?
Transactions and debugging
BEGIN TRANSACTION
-- Your debug code here
ROLLBACK TRANSACTION
2. Performance: Indexes
Indexes are quite similar to keys in hash tables.
Clustered
Data in table is ordered by the index.
Non-clustered
3. Scalability
Size
~O(1) selection over millions of rows
~O(1) inserts to large tables
Users
Heavy scaling is mostly outside of our scope.
Suppose it just works with lots of users.
4. Portability
SQL = structured query language
Version Basic Features Advanced Features
SQL-86
SQL-89
JOIN, ALTER TABLE, UNION, Deferrable constraints,
SQL-92
EXCEPT, INTERSECT assertions
SQL:1999 MERGE, CTEs OOP
SQL:2003 MULTISET XML
Most databases support basic features of SQL-92.
5 rules for fast queries
1. Think in sets
2. Know your needs
3. Watch the indexes
4. Experiment
2. Know your needs
Less important
Do not use SELECT *.
More important
Do not ask for 100000 rows if you need 10.
Storing 10000 items in a combo-box is not
a technical question.
Large data –
Do not aggregate in business logic.
Do not sort in business logic.
4. Watch the indexes
SELECT
Login
FROM
dbo.Users
Table scan (very slow)
WHERE
LEFT(Login, 1) = 'a'
SELECT
Login
FROM Index scan (fast)
dbo.Users
WHERE
Login LIKE 'a%'
5. Experiment
Even small changes can affect performance.
Different engines have different optimizations.
Engines
MS SQL Server
Oracle
DB2
MySQL
PostgreSQL
VistaDB
SQLite
MS SQL Server
A Microsoft RDBMS.
Fast, simple, useful.
Should be the first RDBMS to consider under
Windows.
Pro Contra
Great tool support Requires Windows
Great performance Costs money
Great usability (Express is free)
Native XML support Missing useful SQL
Native XQuery support features
MySQL
A GPL/Commercial licensed database server.
Has pluggable engine support.
Has some useful extensions to standard SQL.
http://www.mysql.com/
MySQL: Engines
MyISAM InnoDB
No transactions Transactions
No foreign keys Foreign keys
Table-level locks Fine-grained locks
Pro Contra
Cross-platform Missing critical tools
Free (backup/restore)
Small footprint Lot of configuration
required for optimal
performance
Useful SQL extensions:
REPLACE, INSERT
IGNORE, LIMIT Very slow on index
changes
IBM DB2
An enterprise DB server from IBM.
Cross-platform.
Free version is almost not limited.
http://www.ibm.com/db2
Pro Contra
Cross-platform Terrible tool UI
Useful free version Bad documentation
Fast and powerful Extremely expensive
advanced features
Beyond relational
Full-text search
Supported by a lot of RDBMS.
Implemented as an index text column.
For better full-text in MySQL look at Sphinx.
Full-text search in SQL Server
SELECT
Id,
Title,
Body
FROM
Posts
WHERE
CONTAINS(Body, '"install*" NEAR mysql')
Data warehouses
Large bodies of denormalized data.
Database Reporting Server
OLAP Cube
(Data Warehouse)
Application Server
Other approaches
Scalability
1. Sharding
2. Google BigTable
3. Yahoo Hbase
OODB: db4o
Data Access
Levels of data access
1. Direct
ADO.NET, JDBC
2. O/R: Active Record
Castle ActiveRecord, Rails
LINQ to SQL
3. O/R: Data Mappers
(N)Hibernate,
ADO.NET Entity Framework
Direct Access
string commandText =
"SELECT Login FROM dbo.Users WHERE Id = @Id";
using (var connection = new SqlConnection(connectionString))
using (var command = new SqlCommand(commandText, connection)) {
command.Parameters.AddWithValue("@Id", 5);
using (var reader = command.ExecuteReader()) {
while (reader.Read()) {
// do something with data
}
}
}
O|R
Object Relational
Common features
Reference-collection mapping
Inheritance-table mapping
Table per hierarchy
Table per class
Table per concrete class
Common terms
Lazy Load
Unit of Work
All changes to your objects are
remembered and replayed against the
database.
Identity Map
Same Id == same entity.
Active Record
Database-driven design:
Object ~ database row.
Base class inheritance (ActiveRecordBase).
‘Classic’ (Ruby-like) Active Record requires
explicit lifecycle.
LINQ to SQL
LINQ + ActiveRecord-like O/R mapper.
var ashminds = from user in db.Users
where user.Login == "ashmind"
select new { user.Id, user.Name };
var ashmind = ashminds.Single();
Good database-to-object model generator.
Nearly not testable. Nice for small projects.
Magic words
POCO
Plain
old
CLR
objects
NHibernate
Supports mapping of any POCO.
Completely abstracts storage structure.
Possible mappings:
XML
Attributes
Code
Nhibernate: Sample mapping
ADO.NET Entity Framerwork
Data Mapper (NHibernate-like).
Written by Microsoft.
Good tools, but
1. complex to maintain
2. hard to test
3. missing important features
Questions