# Relational Algebra and Calculus Lecture

Document Sample

```					       Relational Calculus

Zachary G. Ives
University of Pennsylvania
CIS 550 – Database & Information Systems

September 17, 2007
Some slide content courtesy of Susan Davidson & Raghu Ramakrishnan
 Reminder: Homework 1 due 9/26 (next Wednesday)

 Change to office hours due to conflict:
 Mondays 3:30-4:30 instead of Wednesdays

2
A Set of Logical Operations: The
Relational Algebra
 Six basic operations:
   Projection      (R)
   Selection       (R)
   Union          R1 [ R2      STUDENT
   Difference     R1 – R2
   Product        R1 £ R2
   (Rename)       ->b (R)           Takes     COURSE
 And some other useful ones:
 Join             R1 ⋈  R2
 Intersection     R1 Å R2                 Calculus

SELECT *
FROM STUDENT, Takes, COURSE
WHERE STUDENT.sid = Takes.sID
AND Takes.cID = cid
3
Our Example Data Instance
STUDENT             Takes                               COURSE
sid   name          sid    exp-grade         cid           cid     subj   sem
1   Jill          1            A     550-0105         550-0105   DB     F05
2   Qun           1            A     700-1005         700-1005   AI     S05
3   Nitin         3            C     501-0105         501-0105   Arch   F05

PROFESSOR                Teaches
fid         name         fid   cid
1           Ives         1     550-0105
2           Saul         2     700-1005
8           Roth         8     501-0105

4
Some Examples
 Faculty ids

 Subjects for courses with students expecting a “C”

 All course numbers for which there exists a smaller
course number

5
Domain Relational Calculus
Queries have form:
domain variables
{<x1,x2, …, xn>| p}
predicate
Predicate: Boolean expression over x1,x2, …, xn
 Precise operations depend on the domain and query
language – may include special functions, etc.
 Assume the following at minimum:
<xi,xj,…>  R   X op Y    X op const   const op X

where op is , , >, , , 
xi,xj,… are domain variables
6
Complex Predicates in the Calculus
Starting with these atomic predicates, build up new
predicates by the following rules:
 Logical connectives: If p and q are predicates, then so are
p  q, p  q, p, and p  q
 (x>2)  (x<4)
 (x>2)  (x>0)
 Existential quantification: If p is a predicate, then so is
x.p
 x. (x>2) (x<4)
 Universal quantification: If p is a predicate, then so is x.p
 x.x>2
 x. y.y>x
7
Some Examples
 Faculty ids

 Subjects for courses with students expecting a “C”

 All course numbers for which there exists a smaller
course number

8
Logical Equivalences
 There are two logical equivalences that will be
heavily used:
 p  q  p  q
(Whenever p is true, q must also be true.)
 x. p(x)  x. p(x)
(p is true for all x)
 The second can be a lot easier to check!

 Example:
 The highest course number offered

9
Terminology:
Free and Bound Variables
 A variable v is bound in a predicate p when p is of
the form v… or v…
 A variable occurs free in p if it occurs in a position
where it is not bound by an enclosing  or 
 Examples:
 x is free in x > 2
 x is bound in x. x > y

10
Can Rename Bound Variables Only
 When a variable is bound one can replace it with
some other variable without altering the
meaning of the expression, providing there are
no name clashes
 Example: x. x > 2 is equivalent to y. y > 2

 Otherwise, the variable is defined outside our
“scope”…

11
Safety
 Pitfall in what we have done so far – how do we interpret:
{<sid,name>| <sid,name>  STUDENT}

 Set of all binary tuples that are not students: an infinite set (and
unsafe query)

 A query is safe if no matter how we instantiate the
relations, it always produces a finite answer
 Domain independent: answer is the same regardless of the
domain in which it is evaluated
 Unfortunately, both this definition of safety and domain
independence are semantic conditions, and are undecidable

12
Safety and Termination Guarantees

 There are syntactic conditions that are used to
guarantee “safe” formulas
 The definition is complicated, and we won’t discuss it; you
can find it in Ullman’s Principles of Database and Knowledge-
Base Systems
 The formulas that are expressible in real query languages
based on relational calculus are all “safe”
 Many DB languages include additional features, like
recursion, that must be restricted in certain ways to

13
Mini-Quiz
How do you write:
 Which students have taken more than one course from
the same professor?

14
Translating from RA to DRC
 Core of relational algebra: , , , x, -
 We need to work our way through the structure of
an RA expression, translating each possible form.
 Let TR[e] be the translation of RA expression e into
DRC.

 Relation names: For the RA expression R, the DRC
expression is
{<x1,x2, …, xn>| <x1,x2, …, xn>  R}

15
Selection: TR[ R]
 Suppose we have (e’), where e’ is another RA
expression that translates as:
TR[e’]= {<x1,x2, …, xn>| p}
 Then the translation of c(e’) is
{<x1,x2, …, xn>| p’}
where ’ is obtained from  by replacing each
attribute with the corresponding variable
 Example: TR[#1=#2 #4>2.5R] (if R has arity 4) is
{<x1,x2, x3, x4>|
< x1,x2, x3, x4>  R  x1=x2  x4>2.5}

16
Projection: TR[i1,…,im(e)]
 If TR[e]= {<x1,x2, …, xn>| p} then
TR[i1,i2,…,im(e)]=
{<x i1,x i2, …, x im >|  xj1,xj2, …, xjk.p},
where xj1,xj2, …, xjk are variables in x1,x2, …, xn
that are not in x i1,x i2, …, x im

 Example: With R as before,
#1,#3 (R)={<x1,x3>| x2,x4. <x1,x2, x3,x4> R}

17
Union: TR[R1  R2]
 R1 and R2 must have the same arity
 For e1  e2, where e1, e2 are algebra expressions
TR[e1]={<x1,…,xn>|p} and TR[e2]={<y1,…yn>|q}
 Relabel the variables in the second:
TR[e2]={< x1,…,xn>|q’}
 This may involve relabeling bound variables in q to
avoid clashes
TR[e1e2]={<x1,…,xn>|pq’}.
 Example: TR[R1  R2] = {< x1,x2, x3,x4>|
<x1,x2, x3,x4>R1  <x1,x2, x3,x4>R2

18
Other Binary Operators
 Difference: The same conditions hold as for union
If TR[e1]={<x1,…,xn>|p} and TR[e2]={< x1,…,xn>|q}
Then TR[e1- e2]= {<x1,…,xn>|pq}
 Product:
If TR[e1]={<x1,…,xn>|p} and TR[e2]={< y1,…,ym>|q}
Then TR[e1 e2]= {<x1,…,xn, y1,…,ym >| pq}

 Example:
TR[RS]= {<x1,…,xn, y1,…,ym >|
<x1,…,xn> R  <y1,…,ym > S }

19
Calculus?

 We’ve been looking at the Domain Relational
Calculus
 The Tuple Relational Calculus is nearly the same, but
variables are at the level of a tuple, not an attribute

 {Q | 9 S  COURSES, 9 T 2 Takes (S.cid = T.cid Æ Q.cid

20
Limitations of the
Relational Algebra / Calculus
Can’t do:
 Aggregate operations
 Recursive queries
 Complex (non-tabular) structures

 Most of these are expressible in SQL, OQL, XQuery
– using other special operators
 Sometimes we even need the power of a Turing-
complete programming language

21
Summary
 Can translate relational algebra into relational calculus
 DRC and TRC are slightly different syntaxes but equivalent
 Given syntactic restrictions that guarantee safety of DRC
query, can translate back to relational algebra

 These are the principles behind initial development of
relational databases
 SQL is close to calculus; query plan is close to algebra
 Great example of theory leading to practice!

22

```
DOCUMENT INFO
Shared By:
Categories:
Tags:
Stats:
 views: 22 posted: 5/25/2012 language: English pages: 22