VIEWS: 78 PAGES: 56 POSTED ON: 7/27/2012 Public Domain
RELATIONAL ALGEBRA Prof. Sin-Min LEE Department of Computer Science Database Scheme A relational database scheme, or schema, corresponds to a set of table definitions. Eg: product(p_id, name, category, description) supply(p_id, s_id, qnty_per_month) supplier(s_id, name, address, ph#) * remember the difference between a DB instance and a DB scheme. SAMPLE SCHEMAS AND INSTANCES The Schemas: Sailors(sid: integer, sname: string, rating: integer, age: real) Boats(bid: integer, bname: string, color: string) Reserves(sid: integer, bid: integer, day: date) The Instances: What is Relational Algebra? Relational algebra is a procedural query language. It consists of the select, project, union, set difference, Cartesian product, and rename operations. Set intersection, division, natural join, and assignment combine the fundamental operations. SQL is based on relational algebra What are the query languages ? It is an abstract language. We use it to express the set of operations that any relational query language must perform. Two types of operations: 1.set-theoretic operations: tables are essentially sets of rows 2.native relational operations: focus on the structure of the rows Query languages are specialized languages for asking questions,or queries,that involve the data in database. Query languages procedural vs. non-procedural commercial languages have some of both we will study: relational algebra (which is procedural, i.e. tells you how to process a query) relational calculus (which is non-procedural i.e. tells what you want) SEMANTICS OF THE SAMPLE RELATIONS Sailors: Entity set; lists the relevant properties of sailors. Boats: Entity set; lists the relevant properties of boats. Reserves: Relationship set: links sailors and boats by describing the boat number and date for which a sailor made a reservation. Example of the declarative sentences for which rows stand: Row 1: “Sailor ’22’ reserved boat number ‘101’ on 10/10/98”. Selection and Projection Selection Operator: σrating>8 (S2) condition Retrieves from the current instance of relation named S2 those rows where the value of the attribute ‘rating’ is greater than 8. Applying the above selection operator to the sample instance of S2 shown in figure 4.2 yields the relational instance on figure 4.4 as shown below: π Projection Operator πsname,rating(S2) Retrieves from the current instance of the relation named S2 those columns whose names are ‘sname’ and ‘rating’. Applying the above operator to the sample instance of S2 shown in figure 4.2 yields the relational instance on figure 4.5 as shown below: N. B.: Note that the projection operator can produce duplicate rows in the resulting instance. - Projection Operator (cont’d) Similarly πage(S2) yields the following relational instance Note here the elimination of duplicates SQL would yield age For πage (S2): 35.0 55.0 35.0 35.0 Introduction one of the two formal query languages of the relational model collection of operators for manipulating relations Operators: two types of operators Set Operators: Union(),Intersection(), Difference(-), Cartesian Product (x) New Operators: Select (), Project (), Join (⋈) Introduction – cont’d A Relational Algebra Expression: a sequence of relational algebra operators and operands (relations), formed according to a set of rules. The result of evaluating a relational algebra expression is a relation. Selection Denoted by c(R) Selects the tuples (rows) from a relation R that satisfy a certain selection condition c. It is a unary operator The resulting relation has the same attributes as those in R. Example 1: S: SNO SNAME AGE STATE S1 MIKE 21 IL S2 STEVE 20 LA S3 MARY 18 CA S4 MING 19 NY S5 OLGA 21 NY state=‘IL’(S) Example 2: CREDIT 3(C) C: CNO CNAME CREDIT DEPT Databas C1 3 CS e Statistic C2 3 MATH s C3 Tennis 1 SPORTS C4 Violin 4 MUSIC C5 Golf 2 SPORTS C6 Piano 5 MUSIC Example 3 E: SNO CNO Grade SNO=‘S1’and CNO=‘C1’(E) S1 C1 90 S1 C2 80 S1 C3 75 S1 C4 70 S1 C5 100 S1 C6 60 S2 C1 90 S2 C2 80 S3 C2 90 S4 C2 80 S4 C4 85 S4 C5 100 Selection - Properties Selection Operator is commutative C1(C2 (R)) = C2(C1 (R)) The Selection is an unary operator, it cannot be used to select tuples from more than one relations. Projection Denoted by L(R), where L is list of attribute names and R is a relation name or some other relational algebra expression. The resulting relation has only those attributes of R specified in L. The projection is also an unary operation. Duplicate rows are not permitted in relational algebra. Duplication is removed from the result. Duplicate rows can occur in SQL, though they may be controlled by explicit keywords. Projection - Example Example 1: STATE (S) SNO SNAME AGE STATE STATE S1 MIKE 21 IL IL S2 STEVE 20 LA LA S3 MARY 18 CA CA S4 MING 19 NY NY S5 OLGA 21 NY Projection - Example Example 2: CNAME, DEPT(C) CNO CNAME CREDIT DEPT CNAME DEPT Databas Databas C1 3 CS CS e e Statistic Statistic C2 3 MATH MATH s s C3 Tennis 1 SPORTS Tennis SPORTS C4 Violin 4 MUSIC Violin MUSIC C5 Golf 2 SPORTS Golf SPORTS C6 Piano 5 MUSIC Piano MUSIC Projection - Example Example 3: S#(STATE=‘NY'(S)) SNO SNAME AGE STATE S1 MIKE 21 IL SNO S2 STEVE 20 LA S3 MARY 18 CA S4 S4 MING 19 NY S5 S5 OLGA 21 NY SET Operations UNION: R1 R2 INTERSECTION: R1 R2 DIFFERENCE: R1 - R2 CARTESIAN PRODUCT: R1 R2 Union Compatibility For operators , , -, the operand relations R1(A1, A2, ..., An) and R2(B1, B2, ..., Bn) must have the same number of attributes, and the domains of the corresponding attributes must be compatible; that is, dom(Ai)=dom(Bi) for i=1,2,...,n. The resulting relation for , , or - has the same attribute names as the first operand relation R1 (by convention). Union Compatibility - Examples Are S(SNO, SNAME, AGE, STATE) and C(CNO, CNAME, CREDIT, DEPT) union compatible? Are S(S#, SNAME, AGE, STATE) and C(CNO, CNAME, CREDIT_HOURS, DEPT_NAME) union compatible? FUNCTIONAL APPLICATION OF SELECTION AND PROJECTION OPERATORS The selection and projection operators can be applied successively as many times as desired in the usual functional denotation as illustrated below. Thus the expression πsname,rating(σrating>8(S2)) yields the following relational instance SET OPERATIONS The following set operations are also available in relational algebra: Union* Intersection* Set-difference* Cross-product N.B. (1) The asterisks indicate operations whose operand relations must be union-compatible. Two relation instances are said to be union- compatible if: - they have the same number of fields, - corresponding fields have the same domains. - they have the same semantics. (2) The results of set operations on sets and multisets are different, therefore we shall examine both of these separately. EXAMPLES OF SET OPERATIONS ON RELATIONS Union: Given the sample instances S1 and S2 The union of S1 and S2, i.e. S1 ∪ S2 is shown below Given the two sample relational instances We can form: Intersection: S1∩S2 Set-Difference: S1 – S2 Given the two relational samples S1 and S2 We can form the Cross-product S1 R1: # of col. = (# col. S1) +( # col. R1) # of rows = (# rows S1)(# rowsR1) SPECIAL RELATIONAL OPERATORS The following operators are peculiar to relations: - Join operators There are several kind of join operators. We only consider three of these here (others will be considered when we discuss null values): - (1) Condition Joins - (2) Equijoins - (3) Natural Joins - Division JOIN OPERATORS Condition Joins: - Defined as a cross-product followed by a selection: R ⋈c S = σc(R S) (⋈ is called the bow-tie) where c is the condition. - Example: Given the sample relational instances S1 and R1 The condition join S ⋈S1.sid<R1.sid R1 yields Equijoin: Special case of the condition join where the join condition consists solely of equalities between two fields in R and S connected by the logical AND operator (∧). Example: Given the two sample relational instances S1 and R1 The operator S1 R.sid=Ssid R1 yields Natural Join - Special case of equijoin where equalities are implicitly specified on all fields having the same name in R and S. - The condition c is now left out, so that the “bow tie” operator by itself signifies a natural join. - N. B. If the two relations have no attributes in common, the natural join is simply the cross-product. DIVISION - The division operator is used for queries which involve the ‘all’ qualifier such as “Find the names of sailors who have reserved all boats”. - The division operator is a bit tricky to explain, and perhaps best approached through examples as will be done here. EXAMPLES OF DIVISION DIVISION Interpretation of the division operation A/B: - Divide the attributes of A into 2 sets: A1 and A2. - Divide the attributes of B into 2 sets: B2 and B3. - Where the sets A2 and B2 have the same attributes. - For each set of values in B2: - Search in A2 for the sets of rows (having the same A1 values) whose A2 values (taken together) form a set which is the same as the set of B2’s. - For all the set of rows in A which satisfy the above search, pick out their A1 values and put them in the answer. DIVISION Example: Find the names of sailors who have reserved all boats: (1) A = sid,bid(Reserves). A1 = sid(Reserves) A2 = bid(Reserves) (2) B2 = bid(Boats) B3 is the rest of B. Thus, B2 ={101, 102, 103, 104} (3) Find the rows of A such that their A.sid is the same and their combined A.bid is the set B2. Thus we find A1 = {22} (4) Get the set of A2 corresponding to A1: A2 = {Dustin} FORMAL DEFINITION OF DIVISION The formal definition of division is as follows: A/B = x(A) - x((x(A) B) – A) EXAMPLES OF ALGEBRA QUERIES In the rest of this chapter we shall illustrate queries using the following new instances S3 of sailors, R2 of Reserves and B1 of boats. QUERY Q1 Given the relational instances: (Q1) Find the names of sailors who have reserved boat 103 sname((σbid=103 Reserves) ⋈ Sailors) The answer is thus the following relational instance {<Dustin>, <Lubber>, <Horatio>} QUERY Q1 (cont’d) There are of course several ways to express Q1 in relational algebra. Here is another: sname(σbid=103(Reserves⋈ Sailors)) Which of these expressions should we use? That is a question of optimization. Indeed, when we describe how to state queries in SQL, we can leave it to the optimizer in the DBMS to select the nest approach. QUERY Q2 (Q2) Find the names of sailors who have reserved a red boat. sname((σcolor=‘red’Boats) ⋈ Reserves ⋈ Sailors) QUERY Q3 (Q3) Find the colors of boats reserved by Lubber. color((σsname=‘Lubber’Sailors)Sailors ⋈ Reserves ⋈ Boats) QUERY Q4 (Q4) Find the names of Sailors who have reserved at least one boat sname(Sailors ⋈ Reserves) QUERY Q5 (Q5) Find the names of sailors who have reserved a red or a green boat. (Tempboats, (σcolor=‘red’Boats) ∪ (σcolor=‘green’Boats)) sname(Tempboats ⋈ Reserves ⋈ Sailors) QUERY Q6 (Q6) Find the names of Sailors who have reserved a red and a green boat. It seems tempting to use the expression used in Q5, replacing simply ∪ by ∩. However, this won’t work, for such an expression is requesting the names of sailors who have requested a boat that is both red and green! The correct expression is as follows: (Tempred, sid((σcolor=‘red’Boats) ⋈ Reserves)) (Tempgreen, sid((σcolor=‘green’Boats) ⋈ Reserves)) sname ((Tempred ∩ Tempgreen) ⋈ Sailors) QUERY Q7 (Q7) Find the names of sailors who have reserved at least two boats. (Reservations, sid,sname,bid(Sailors ⋈ Reserves)) (Reservationpairs(1sid1, 2sname, 3bid1, 4sid2, 5sname, 6bid2), ReservationsReservations) sname1σ(sid1=sid2)(bid1bid2)Reservationpairs) QUERY 8 (Q8) Find the sids of sailors with age over 20 who have not reserved a red boat. sid(σage>20Sailors) - sid((σcolor=‘red’Boats) ⋈ Reserves ⋈ Sailors) QUERY 9 (Q) Find the names of sailors who have reserved all boats. (Tempsids, (sid,bidReserves) / (bidBoats)) sname(Tempsids ⋈ Sailors QUERY Q10 (Q10) Find the names of sailors who have reserved all boats called Interlake. (Tempsids, (sid,bidReserves)/(bid(σbname=‘Interlake’Boats))) sname(Tempsids ⋈ Sailors) Union, Intersection, Difference T= R U S : A tuple t is in relation T if and only if t is in relation R or t is in relation S T = R S: A tuple t is in relation T if and only if t is in both relations R and S T= R - S :A tuple t is in relation T if and only if t is in R but not in S Set-Intersection Denoted by the symbol . Results in a relation that contains only the tuples that appear in both relations. R S = R – (R – S) Since set-intersection can be written in terms of set-difference, it is not a fundamental operation. Examples R S A1 A2 B1 B2 1 Red 3 White 3 White 2 Blue 4 green Examples RS R S A1 A2 1 Red A1 A2 3 White 3 White 4 Green R-S 2 Blue S-R A1 A2 B1 B2 1 Red 2 Blue 4 Green