# Datalog Deductive Databases by mercy2beans108

VIEWS: 22 PAGES: 14

• pg 1
```									    Datalog & Deductive Databases

Chapter 24: Sections 24.1 – 24.4

Databases: The Continuing Saga
When last we left databases…
We had decided they were great things
We knew how to conceptually model them in ER
diagrams
We knew how to logically model them in the relational
model
We could normalize them
We could formally model queries using
Relational algebra – relational approach
Relational calculus – logic based approach
We could query them using sql – the relational approach
What about an actual usable logic based language?

1
Datalog
Our final relational query language
Based on logic notation (Prolog)
Can express queries that are not
expressible in relational algebra or
standard SQL (recursion).
Uses sets
No grouping and aggregation, orderby.
Cleaner     convenient for analysis

Predicates and Atoms
- Relations are represented by predicates
- Tuples are represented by atoms.
Purchase( “joe”, “bob”, “Nike Town”, “Nike Air”)

-Arithmetic comparison atoms:
X < 100, X+Y+5 > Z/2

- Negated atoms:
NOT Product(“Brooklyn Bridge”, \$100, “Microsoft”)

2
Datalog Definitions
A datalog rule:

atom :- atom1 , … , atomn

body              preceded by NOT
E.g.:
S(n) :- Person(s,n,p,t,c), Purchase (b,s,“Gizmo Store”,i)

Datalog program = a collection of rules
A single rule can express exactly select-from-where queries.

The Meaning of Datalog Rules

Consider every assignment from the variables in the
body to the constants in the database. (same variable
name means requires same value)

If each of the atoms in the body is in the database,
then the tuple for the head is in the result.

3
Datalog Example 1
Product ( pid, name, price, category, maker-cid)
Company (cid, name, stock price, country)
Person(sin, name, phone number, city)

S(i,n,p,c,m) :- Product(i,n,p,c,m), p>99.99

a selection: σprice>99.99(Product)

Datalog Example 2
Product ( pid, name, price, category, maker-cid)
Company (cid, name, stock price, country)
Person(sin, name, phone number, city)

S(n) :- Product(i,n,p,c,m), p>99.99

a selection-projection: Πname(σprice>99.99(Product))

4
Datalog Example 3
Product ( pid, name, price, category, maker-cid)
Company (cid, name, stock price, country)
Person(sin, name, phone number, city)

Find store names where Fred bought:
S(n) :- Person(s, “Fred”,t,c), Purchase(s,l,n,p)

a selection-projection-join:

Exercise part 1
Product ( pid, name, price, category, maker-cid)
Company (cid, name, stock price, country)
Person(sin, name, phone number, city)

Ex #1: Find SINs of people who bought products
in the telephony category.

Ex #2: Find the sin of people who bought Canadian products

5
Anonymous Variables
Product ( pid, name, price, category, maker-cid)
Company (cid, name, stock price, country)
Person(sin, name, phone number, city)

Find names of people who bought from “Gizmo Store”

E.g.:
S(n) :- Person(s, n, _, _, _) AND Purchase (_, s, “Gizmo Store”, _)

Each _ means a fresh, new variable
Very useful: makes Datalog even easier to read

Exercise part 2
Product ( pid, name, price, category, maker-cid)
Company (cid, name, stock price, country)
Person(sin, name, phone number, city)

Ex #3: Find names of people who bought Canadian products that
cost under 50

6
Multiple Datalog Rules
Product ( pid, name, price, category, maker-cid)
Company (cid, name, stock price, country)
Person(sin, name, phone number, city)

Find names of buyers and sellers:

A(n) :- Person(s,n,a,b), Purchase(s,c,d,e)
A(n) :- Person(s,n,a,b), Purchase(c,s,d,e)

Multiple rules correspond to union

Exercise part 3
Product ( pid, name, price, category, maker-cid)
Company (cid, name, stock price, country)
Person(sin, name, phone number, city)

Ex #4: Find sins of people who bought stuff from Joe or bought
products from a company whose stock prices is more
than \$50.

7
More Examples (negation, union)
not Purchase(buyer, seller, product, “The Bay”)

Product(prod, price, cat, maker)
Company(maker, sp, country),
sp > 50.

Rule Safety

Every variable that appears anywhere in the
query must appear also in a relational, non-negated
atom in the query.

Q(X,Y,Z) :- R1(X,Y), X < Z              not safe

Q(X,Y,Z) :- R1(X,Y), NOT R2(X,Y,Z)         not safe

The same rule is in Domain Relational Calculus

8
Exercise part 4
Product ( pid, name, price, category, maker-cid)
Company (cid, name, stock price, country)
Person(sin, name, phone number, city)

Ex #5: Find the sins of people who bought Canadian products
and did not buy French products

Stratification

T depends on S if some rule with T in the
head contains S or (recursively) some atom
that depends on S, in the body.
Stratified program: If T depends on not S,
then S cannot depend on T (or not T).
If a program is stratified, the tables in the
program can be partitioned into strata:
Stratum 0: All database tables.
Stratum I: Tables defined positively in terms of
tables in Stratum I and positively or negatively
over lower strata.
If T depends on not S, S is in lower stratum than
T.

9
Stratification Examples
Given: Assembly(Part, Subpart, Quantity)
Find set of parts that use at least 3 copies of some subpart
Stratified:
Big(Part) :- Assembly(Part, Subpart, Qty), Qty > 2
Small(Part):-Assembly(Part, Subpart, Qty), NOT Big(Part)

Not stratified:
Big(Part) :- Assembly(Part, Subpart, Qty), Qty > 2, NOT
Small(Part)
Small(Part):-Assembly(Part, Subpart, Qty), NOT Big(Part)

Defining Views
not Purchase(buyer, seller, product, “The Bay”)

Product(prod, price, cat, maker)
Company(maker, sp, country),
sp > 50.

10
Taking it to the next level
Say you’re planning a beach vacation
And you wanted to find if it’s possible to get
from YVR to OGG (that’s on Maui)

Flight(airline,num,origin,destination)

Now what?

Another Recursion Example:
Transitive Closure
Suppose we represent a graph w/ relation Edge(X,Y):

Edge(a,b), Edge (a,c), Edge(b,d), Edge(c,d), Edge(d,e)

b

a                   d          e
c

How can I express the query:
Find all nodes reachable from a.

11
Transitive Closure Solved
Path(X, Y) :- Edge(X, Y)
Path(X, Y) :- Path(X, Z), Path(Z, Y).

A query is recursive if there is a cycle in the
dependency graph of the atoms.

Evaluating Recursive Queries
Path(X, Y) :- Edge(X, Y)
Path(X, Y) :- Path(X, Z), Path(Z, Y).
Semantics: evaluate the rules until a fixed point:
Iteration #0: Edge: {(a,b), (a,c), (b,d), (c,d), (d,e)}
Path: {}
Iteration #1: Path: {(a,b), (a,c), (b,d), (c,d), (d,e)}
Iteration #2: Path gets the new tuples:
(a,d), (b,e), (c,e)
Iteration #3: Path gets the new tuple:
(a,e)
Iteration #4: Nothing changes        Stop.
Note: # of iterations depends on the data. Cannot be
anticipated by only looking at the query!

12
More examples
Given:
Movie(id, title)
Actor(id, name)
Role(movie-id, actor-id, character)
Find names of actors who have “Bacon
numbers” (assume there’s only one “Kevin
Bacon”)

Recursive SQL? Sometimes…
Given: Assembly(Part, Subpart, Quantity)
Find: all recursive components of an item
Datalog:
Comp(Part, Subpt) :- Assembly(Part, Subpt, Qty).
Comp(Part, Subpt) :- Assembly(Part, Part2, Qty), Comp(Part2, Subpt).

SQL:
WITH RECURSIVE Comp(Part, Subpt) AS
(SELECT A1.Part, A1.Subpt FROM Assembly A1)
UNION
(SELECT A2.Part, C1.Subpt
FROM Assembly A2, Comp C1
WHERE A2.Subpt=C1.Part)

SELECT * FROM Comp C2

13
Skip the stuff on Magic Sets
That’s Datalog
It’s simple
It’s based on logic
It’s easy to see the join patterns
(especially with anonymous variables)

14

```
To top