VIEWS: 11 PAGES: 38 POSTED ON: 7/31/2012
The Relational Algebra Dr. Yousry Taha IS230 (This is the instructor’s notes and student has to read the textbook for complete material.) Basic Relational Algebra Operations A data model must include a set of operations to manipulate the data. A basic set of relational model operations constitute the Relational Algebra. These operations enable the user to specify basic retrieval requests which is also a relation. A relational algebra expression is a sequence of relational algebra operations. IS 230 (Relational Algebra) - Dr. Yousry Taha 2 Basic Relational Algebra Operations (cont) The relational algebra operations are divided into two groups: Set operations (∪, ∩ , – , ) Operations developed specifically for the relational DB (Select () , Project (P), Join ( ) and others). • Some additional operations were suggested to perform some common DB requests (aggregations (), ….). IS 230 (Relational Algebra) - Dr. Yousry Taha 3 The SELECT Operation () It is used to select a subset from R that satisfies a selection condition. The select operation is denoted by: <selection condition> (R) The <selection condition> is on the form: <attribute name> <comparison op> <constant> or <attribute name> <comparison op> <attribute name> Where <comparison op> { , , , , , } • We can use the logical operations (AND, OR, NOT) to form the relational algebra expressions. IS 230 (Relational Algebra) - Dr. Yousry Taha 4 The SELECT Operation () (cont) Ex: Select all employees who work on department 4. DNO = 4 (EMPLOYEE) Ex: Select all employees who work on department 4 and their salaries are more than 25000 or who work on department 5 and their salaries are more than 30000. (DNO = 4 AND SALARY>25000) OR (DNO=5 AND SALARY>30000) (EMPLOYEE) IS 230 (Relational Algebra) - Dr. Yousry Taha 5 The SELECT Operation () (cont) The select operation is unary operation.. deg(c (R)) = deg(R), and all attributes of the resulting relation are of the same names and domains as in R. | c (R) | | R | Selectivity Ratio = | c (R) | | R | The select operation is commutative: c1 (c2 (R)) = c2 (c1 (R)) c1 (c2 (… (cn(R)) …)) = c1 AND c2 AND … AND cn (R) IS 230 (Relational Algebra) - Dr. Yousry Taha 6 The PROJECT Operation (P) It is used to select certain attributes from the relation and discard the other attributes. The project operation is denoted by: P<attribute list> (R) deg(P<list> (R)) = the number of attributes in <list>. If the <list> includes only nonkey attributes of R, duplicate tuples are likely to occur (duplicate elimination), and the project operation removes any duplicate tuples. | P<list> (R) | | R | IS 230 (Relational Algebra) - Dr. Yousry Taha 7 The PROJECT Operation (P) (cont) P<list1> (P<list2> (R)) = P<list1> (R) as long as <list2> contains the attributes in <list1>. The project operation is not commutative. Ex: List name and salary for each employee. PLNAME,FNAME,SALARY (EMPLOYEE) Ex: List sex and salary for each employee. PSEX,SALARY (EMPLOYEE) IS 230 (Relational Algebra) - Dr. Yousry Taha 8 Sequences of Operations Relational algebra expression can contain any number of operations by any sequence to from the required result. The expression can be formulated by nesting the operations in one expression or creating an intermediate relations. Ex: P ( FNAME,LNAME,SALARY(EMPLOYEE)) DNO = 5 Can be rewritten as follows: DEPT5_EMPS ← DNO=5 (EMPLOYEE) RESULT ← P FNAME,LNAME,SALARY (DEPT5_EMPS) IS 230 (Relational Algebra) - Dr. Yousry Taha 9 Rename Operation Rename operation is used to rename the relation or the attributes. It is very useful in formulating the relational algebra expressions and is necessary for some operations. The general Rename operation is denoted by: rS(B1,B2,..BN)(R) » R is the original relation name and Ai is the original attribute names. » S is the new relation name and Bi is the new attribute names. rS(R) » This operation renames the relation only. r (B1,B2,..BN)(R) » This operation renames the attributes only. IS 230 (Relational Algebra) - Dr. Yousry Taha 10 Set Theoretic Operations Several Set Operations re used to merge the elements of two relations in many ways (Union, Intersection, Difference, Cartesian Product [∪, ∩, –, ). The two relations around the operation must be union compatible (they must have the same type of tuples). Formally, if the two relations are R(A1,A2,..,An) and S(B1,B2,…,Bn) then: dom(Ai) = dom(Bi) for 1 i n. The two relations must have the same number of attributes and each corresponding pair must be of the same domain. The resulting relation relation has the same attribute names as the first relation. IS 230 (Relational Algebra) - Dr. Yousry Taha 11 Set Theoretic Operations (cont) Union: R ∪ S, is a relation that includes all tuples that are either in R or in S or in both (duplicates are eliminated). Intersection: R ∩ S, is a relation that includes all tuples that are in both R and S. Set Difference: R – S, is a relation that includes all tuples that are in R but not in S. Union and Intersection are commutative and associative. R ∪ S ≡ S ∪ R and R∪(S∪T) ≡ (R∪S)∪T R ∩ S ≡ S ∩ R and R ∩(S ∩ T) ≡ (R ∩ S) ∩ T R – S ≠ S – R (Difference is not commutative). IS 230 (Relational Algebra) - Dr. Yousry Taha 12 Set Operations (Example) STUDENT FN LN Susan Yao Ramesh Shah FN LN Johnny Kohler Susan Yao Barbara Jones Ramesh Shah Amy Ford Johnny Kohler Jimmy Wang Barbara Jones Emest Gilbert Amy Ford Two union compatible relations Student ∪ Instructor Jimmy Wang Emest Gilbert John Smith INSTRUCTOR FNAME LNAME Ricardo Browne John Smith Francis Johnson Ricardo Browne Susan Yao Francis Johnson Ramesh Shah IS 230 (Relational Algebra) - Dr. Yousry Taha 13 Set Operations (Example) STUDENT FN LN Susan Yao Ramesh Shah Johnny Kohler Barbara Jones Amy Ford Jimmy Wang Emest Gilbert FN LN Two union compatible relations Student ∩ Instructor Susan Yao Ramesh Shah INSTRUCTOR FNAME LNAME John Smith Ricardo Browne Susan Yao Francis Johnson Ramesh Shah IS 230 (Relational Algebra) - Dr. Yousry Taha 14 Set Operations (Example) STUDENT FN LN FN LN Susan Yao Johnny Kohler Ramesh Shah Barbara Jones Johnny Kohler Amy Ford Barbara Jones STUDENT – INSTRUCTOR Jimmy Wang Amy Ford Emest Gilbert Jimmy Wang Emest Gilbert Two union compatible relations INSTRUCTOR FNAME LNAME FNAME LNAME John Smith John Smith Ricardo Browne INSTRUCTOR – STUDENT Ricardo Browne Susan Yao Francis Johnson Francis Johnson Ramesh Shah IS 230 (Relational Algebra) - Dr. Yousry Taha 15 Set Theoretic Operations (cont) Cartesian Product () It is called also Cross Product or Cross Join. It is denoted by: R S. It is a binary operation and R and S do not have to be union compatible. If Q = R(A1,A2,..,An) S(B1,B2,…,Bm) then: Q has n+m attributes which are A1,A2,..,An,B1,B2,…,Bm. |Q| = |R| * |S| IS 230 (Relational Algebra) - Dr. Yousry Taha 16 Cartesian Product (cont) Q has one tuple for each combination of tuples, one from R and one from S. The operation applied by itself is generally meaningless unless it is followed by a selection. It is common to use Select operation with Cartesian Product, so a special operation (Join) was developed to specify this sequence in a single operation. IS 230 (Relational Algebra) - Dr. Yousry Taha 17 Cartesian Product (cont) Ex: Retrieve for each female employee a list of her dependents. R1 ← SEX = “F” (EMPLOYEE) R2 ← P FNAME,LNAME,SSN (R1) R3 ← R2 DEPENDENT R4 ← SSN = ESSN (R3) RESULT ← P FNAME,LNAME,DEPENDENT_NAME (R4) IS 230 (Relational Algebra) - Dr. Yousry Taha 18 Join Operation ( ) It is used to combine related tuples from two relations into single relation. It is a binary operation of the form R <join condition> S. If Q = R <c> S, then deg(Q) = deg(R) + deg(S). Q has only the combination that satisfies the join condition. The general form of the condition is: <cond1> AND <cond2> AND ……AND <cond n>. IS 230 (Relational Algebra) - Dr. Yousry Taha 19 Join Operation (cont) Each condition is of the form Ai Bj ,where Ai and Bj are attributes and { , , , , , }. This general form of the Join called THETA JOIN. The most common form is the Equijoin where is . Ex: Retrieve manager’s name of each department. DEPT_MGR ← (DEPARTMENT MGRSSN =SSN EMPLOYEE) RESULT ← P DNAME,LNAME,FNAME (DEPT_MGR ) IS 230 (Relational Algebra) - Dr. Yousry Taha 20 Join Operation (cont) Natural Join (*) was developed to get rid of the second (superfluous) attribute in the Equijoin condition. The two join attributes in Natural Join should have the same name in both relations. Ex1: RESEARCH_DEPT ← PROJECT * r (DNAME,DNUM,MGRSSN,MGRSTARTDATE)(DEPARTMENT) Ex2: DEPT_LOCS ← DEPARTMENT * DEPT_LOCATIONS IS 230 (Relational Algebra) - Dr. Yousry Taha 21 A Complete Set of Operations The complete set of relation algebra is: { , P ,∪, – , } So, other operations can be expressed as a sequence of operations from the complete set. Ex: R ∩ S ≡ (R∪S) – ((R – S) ∪ (S – R)) R C S ≡ C (R S) IS 230 (Relational Algebra) - Dr. Yousry Taha 22 The Division Operation It is a binary operation denoted by ÷ . It is very useful for a special kind of query as: “Retrieve the name of employees who work on all the projects that ‘John Smith’ works on. R1 ← FNAME = “John” AND LNAME = “Smith” (EMPLOYEE) R2 ← P PNO(WORKS_ON ESSN = SSN R1) R3 ← P ESSN,PNO (WORKS_ON) R4(SSN) ← R3 ÷ R2 RESULT ← P FNAME,LNAME (R4 * EMPLOYEE) IS 230 (Relational Algebra) - Dr. Yousry Taha 23 The Division Operation (cont) The general form of the operation is: T(Y) = R(Z) ÷ S(X) where: X ⊆ Z and Y = Z – X and for each tuple t in T(Y): tR[Y] = t and tR[X] = tS for every tuple tS in S. This means that, for a tuple t to appear in the result T of the Division, the values in t must appear in R in combination with every tuple in S. IS 230 (Relational Algebra) - Dr. Yousry Taha 24 The Division Operation (cont) The Division operator can be expressed as a sequence of P , , – as follows: T(Y) = R(Z) ÷ S(X) is equivalent to: T1 ← P Y (R) T2 ← P Y (( S T1 ) – R) T ← T1 – T2 IS 230 (Relational Algebra) - Dr. Yousry Taha 25 The Division Operation (cont) EX R A B S A a1 b1 a1 a2 b1 a2 a3 b1 a3 a4 b1 a1 b2 T = R ÷ S a3 b2 a2 b3 a3 b3 T B a4 b3 b1 a1 b4 b4 a2 b4 a3 b4 IS 230 (Relational Algebra) - Dr. Yousry Taha 26 Additional Relation Operations Aggregation Functions: – It contains common functions as SUM, AVERAGE, MAXIMUM, MINIMUM, COUNT. – Grouping can be used to group tuples by the values of some attributes. – The general form is : <grouping attribute> <function list> (R) – Ex: DNO COUNT SSN, AVERAGE SALARY (EMPLOYEE) DNO COUNT_SSN AVERAGE_SALARY 5 4 33250 4 3 31000 1 1 55000 IS 230 (Relational Algebra) - Dr. Yousry Taha 27 Additional Relation Operations (cont) Recursive Closure Operations: – This operation is applied to a recursive relationship between tuples of the same type. – EX: Retrieve SSNs of all employees directly supervised by “James Borg”. R1 ← P SSN ( FNAME = “James” AND LNAME=“Borg” (EMPLOYEE)) R2(SSN1,SSN2) ← P SSN,SUPERSSN (EMPLOYEE) R3(SSN) ← P SSN1 (R2 SSN2=SSN R1) Then retrieve all employees supervised by Borg at level 2. R4(SSN) ← P SSN1 (R2 SSN2=SSN R3) RESULT ← R4 ∪ R3 IS 230 (Relational Algebra) - Dr. Yousry Taha 28 Additional Relation Operations (cont) Outer Join: – In Join operation R C S, tuples without a matching tuple are eliminated from the join result. – Also, tuples with null in the join attributes are also eliminated. – When we want to keep all tuples in R or those in S, or those in both relations in the result of Join operation, we can use Right Outer Join ( ), Left Outer Join ( ), or Full Outer Join ( ). IS 230 (Relational Algebra) - Dr. Yousry Taha 29 Additional Relation Operations (cont) Outer Union: – It was developed to take the Union of tuples from two relations which are not Union compatible (partially compatible). – It is expected that the list of compatible attributes includes a key for both relations. – The result contains the compatible attributes and all that are not Union compatible. – Null values are used to pad the non-compatible attributes in tuples. IS 230 (Relational Algebra) - Dr. Yousry Taha 30 Additional Relation Operations (cont) Outer Union: – Ex: If we have the following relations: STUDENT (Name, SSN, Dept, Advisor) FACULTY (Name, SSN, Dept, Rank) If, R = STUDENT OUTER UNION FACULTY Then the resulting relation is: R (Name, SSN, Dept, Advisor, Rank) with null in the attribute Rank for tuples from STUDENT and null in the attribute Advisor for the tuples from FACULTY. IS 230 (Relational Algebra) - Dr. Yousry Taha 31 Examples of Queries in Relational Algebra Query 1: Retrieve the name and address of all employees who work for the “Research” department. R1 ← DNAME = “Research” (DEPARTMENT) R2 ← (R1 DNUMBER = DNO EMPLOYEE) RESULT ← P FNAME,LNAME,ADDRESS (R2) (This Query could be in other order, e.g., the order of JOIN and SELECT could be reversed, or the JOIN could be replaced by NATURAL JOIN with renaming.) IS 230 (Relational Algebra) - Dr. Yousry Taha 32 Examples of Queries in Relational Algebra (cont) Query 2: For every project located in “Stafford”, list the project number, the controlling department number, and the department manager’s last name, address, and birth date. R1 ← PLOCATION = “Stafford” (PROJECT) R2 ← (R1 DNUM = DNUMBER DEPARTMENT) R3 ← (R2 MGRSSN = SSN EMPLOYEE) RESULT ← P PNUMBER,DNUM,LNAME,ADDRESS,BDATE (R3) IS 230 (Relational Algebra) - Dr. Yousry Taha 33 Examples of Queries in Relational Algebra (cont) Query 3: Find the names of employees who work on all projects controlled by department number 5. R1(PNO) ← P PNUMBER ( DNUM = 5 (PROJECT)) R2(SSN,PNO) ← P ESSN,PNO (WORKS_ON ) R3 ← R2 ÷ R1 RESULT ← P LNAME,FNAME (R3 * EMPLOYEE) IS 230 (Relational Algebra) - Dr. Yousry Taha 34 Examples of Queries in Relational Algebra (cont) Query 4: Make a list of project numbers for projects that involve an employee whose last name is “Smith”, either as a worker or as a manager of the department that controls the project. R1(ESSN) ← P SSN ( LNAME = “Smith” (EMPLOYEE)) R2 ← P PNO (WORKS_ON * R1 ) R3 ← P LNAME,DNUMBER ( EMPLOYEE SSN = MGRSSN DEPARTMENT) R4(DNUM) ← P DNUMBER ( LNAME = “Smith” (R3)) R5(PNO) ← P PNUMBER (R4 * PROJECT) RESULT ← (R2 ∪ R5) IS 230 (Relational Algebra) - Dr. Yousry Taha 35 Examples of Queries in Relational Algebra (cont) Query 5: List the names of all employees with two or more dependents. R1(SSN,NO_OF_DEPTS) ← ESSN COUNT DEPENDENT_NAME (DEPENDENT) R2 ← NO_OF_DEPTS ≥ 2 (R1) RESULT ← P LNAME,FNAME (R2 * EMPLOYEE) IS 230 (Relational Algebra) - Dr. Yousry Taha 36 Examples of Queries in Relational Algebra (cont) Query 6: Retrieve the names of employees who have no dependents. R1 ← P SSN (EMPLOYEE) R2(SSN) ← P ESSN (DEPENDENT) R3 ← (R1 – R2) RESULT ← P LNAME,FNAME (R3 * EMPLOYEE) IS 230 (Relational Algebra) - Dr. Yousry Taha 37 Examples of Queries in Relational Algebra (cont) Query 7: List the names of managers have at least one dependent. R1(SSN) ← P MGRSSN (DEPARTMENT) R2(SSN) ← P ESSN (DEPENDENT) R3 ← (R1 ∩ R2) RESULT ← P LNAME,FNAME (R3 * EMPLOYEE) IS 230 (Relational Algebra) - Dr. Yousry Taha 38