equivalent in power to relational algebra will later allow us to consider queries

Document Sample

```					                           Datalog
Another formalism for expressing queries:

- cleaner
- closer to a “logic” notation
- more convenient for analysis
- equivalent in power to relational algebra
- will later allow us to consider queries with recursion
Predicates and Atoms
- relations are represented by predicates
- tuples are represented by atoms.

Purchase( “joe”, “bob”, “Nike Town”, “Nike Air”, 2/2/98)

- arithmetic atoms:

X < 100,   X+Y+5 > Z/2

- negated atoms:

NOT Product(“Brooklyn Bridge”, \$100, “Microsoft”)
Datalog Rules and Queries
A datalog rule has the following form:

head :- atom1, atom2, …., atom,…

ExpensiveProduct(X) :- Product(X,Y,P) & P > \$100

BritishProduct(X) :- Product(X,Y,P) & Company(P, “UK”, SP)

P(X,Y) :- Between(X,Y,Z) & NOT Direct(X,Z)
The Meaning of Datalog Rules
ExpensiveProduct(X) :- Product(X,Y,P) & P > \$100

Consider every assignment from the variables in the body
to the constants in the database.

If each of the atoms in the body is made true by the assignment,

then

Rule Safety
Every variable that appears anywhere in the query must appear
also in a relational, nonnegated atom in the query.

Q(X,Y,Z) :- R1(X,Y) & X < Z              not safe

Q(X,Y,Z) :- R1(X,Y) & NOT R2(X,Y,Z)          not safe
Composing Datalog Rules
Extensional predicates: represent relations appearing in the database.

Intentional predicates: defined by rules. These can be thought of as
being views.

Datalog rules may be composed in order to express more complex
queries.
An Example Query
Find employees participating in projects that don’t involve

EmpInvolve ( X, P, H) :- Project(P,X, S, E, B, D) &
Employee( X, N ) &
Department( N, H)

DHInvolve ( X, P, H)    :- Project( P, H, S, E, B, D) &
Department( N, H) &
Employee( P, N )

Answer (X) :-         EmpInvolve(X, P, H) &
NOT DHInvolve( X, P, H).
From Relational Algebra to
Datalog
We can translate any relational algebra operation to datalog:

- projection

- selection

- union

- intersection

- join
Architecture of a DBMS
Query optimization and
execution

Relational operators

Files and access methods

Buffer management

Disk space management
The Memory Hierarchy
Main Memory           Disks                     Tapes

•Volatile          • 5-10 MB/S              • 1.5 MB/S transfer rate
•limited address     transmission rates     • 280 GB typical
spaces             • 2-10 GB storage        capacity
• expensive        • average time to        • Only sequential access
• average access    access a block:         • Not for operational
time:            10-15 msecs.             data
5 microseconds   • Need to consider
seek, rotation,
transfer times.
• Keep records “close”
to each other.
Disk Space Manager
Task: manage the location of pages on disk (page = block)

Provides commands for:

• allocating and deallocating a page on disk

Why not use the operating system for this task?

• Portability
• Limited size of address space
• May need to span several disk devices.
Buffer Manager
Manages buffer pool: the pool provides space for a limited
number of pages from disk.

Needs to decide on page replacement policy.

Enables the higher levels of the DBMS to assume that the
needed data is in main memory.

Why not use the Operating System for the task??

- DBMS may be able to anticipate access patterns
- Hence, may also be able to perform prefetching
- DBMS needs the ability to force pages to disk.
Managing Files
The abstraction used by the higher levels of the DBMS is of files.

For example, a relation is stored in a file.

A file will typically consist of many pages.

Main issue: how to organize the records in a file?

Approaches:

•   heap files
•   ordered files
•   hashed files

```
DOCUMENT INFO
Shared By:
Categories:
Tags:
Stats:
 views: 6 posted: 2/28/2012 language: pages: 13
How are you planning on using Docstoc?