VIEWS: 6 PAGES: 94 POSTED ON: 6/26/2012
Closing and Projecting FDs Anomalies BCNF Properties of BCNF Decompositions Design of Relational Database Schemas T. M. Murali October 27, November 1, 2010 T. M. Murali October 27, November 1, 2010 CS 4504: Design of Relational Database Closing and Projecting FDs Anomalies BCNF Properties of BCNF Decompositions Plan Till Thanksgiving What are the typical problems or “anomalies” in relational designs? Introduce the idea of decomposing a relation schema into two smaller schemas. Introduce “Boyce-Codd normal form” (BCNF), a condition on relational schemas that eliminates anomalies. BCNF stated using the concept of FDs. Use decomposition of schemas to bring them to BCNF. Deﬁne another type of constraint called Multivalued Dependencies (MDs). Deﬁne normal forms that eliminate MDs. T. M. Murali October 27, November 1, 2010 CS 4504: Design of Relational Database Closing and Projecting FDs Anomalies BCNF Properties of BCNF Decompositions Closures of FDs Given a relation R and a set F of FDs that hold in R the closure {F}+ is the set of all FDs that follow from R. T. M. Murali October 27, November 1, 2010 CS 4504: Design of Relational Database Closing and Projecting FDs Anomalies BCNF Properties of BCNF Decompositions Closures of FDs Given a relation R and a set F of FDs that hold in R the closure {F}+ is the set of all FDs that follow from R. Recall: An FD S follows from a set of FDs T if every relation instance that satisﬁes all the FDs in T also satisﬁes S. S = {A → C } follows from T = {A → B, B → C }. T. M. Murali October 27, November 1, 2010 CS 4504: Design of Relational Database Closing and Projecting FDs Anomalies BCNF Properties of BCNF Decompositions Computing Closures of FDs To compute the closure of a set of FDs, repeatedly apply Armstrong’s Axioms until you cannot ﬁnd any new FDs: T. M. Murali October 27, November 1, 2010 CS 4504: Design of Relational Database Closing and Projecting FDs Anomalies BCNF Properties of BCNF Decompositions Computing Closures of FDs To compute the closure of a set of FDs, repeatedly apply Armstrong’s Axioms until you cannot ﬁnd any new FDs: Reﬂexivity: If Y ⊆ X , then X → Y Augmentation: If X → Y then XZ → YZ for any attribute Z . Transitivity: If X → Y and Y → Z then X → Z . T. M. Murali October 27, November 1, 2010 CS 4504: Design of Relational Database Closing and Projecting FDs Anomalies BCNF Properties of BCNF Decompositions Examples of Computing Closures of FDs Let us include only completely non-trivial FDs in these examples, with a single attribute on the right. Assume that there are no attributes other than those mentioned in the FDs. F = {A → B, B → C }. {F}+ is T. M. Murali October 27, November 1, 2010 CS 4504: Design of Relational Database Closing and Projecting FDs Anomalies BCNF Properties of BCNF Decompositions Examples of Computing Closures of FDs Let us include only completely non-trivial FDs in these examples, with a single attribute on the right. Assume that there are no attributes other than those mentioned in the FDs. F = {A → B, B → C }. {F}+ is {A → B, B → C , A → C , AC → B, AB → C } T. M. Murali October 27, November 1, 2010 CS 4504: Design of Relational Database Closing and Projecting FDs Anomalies BCNF Properties of BCNF Decompositions Examples of Computing Closures of FDs Let us include only completely non-trivial FDs in these examples, with a single attribute on the right. Assume that there are no attributes other than those mentioned in the FDs. F = {A → B, B → C }. {F}+ is {A → B, B → C , A → C , AC → B, AB → C } F = {AB → C , BC → A, AC → B}. {F}+ is T. M. Murali October 27, November 1, 2010 CS 4504: Design of Relational Database Closing and Projecting FDs Anomalies BCNF Properties of BCNF Decompositions Examples of Computing Closures of FDs Let us include only completely non-trivial FDs in these examples, with a single attribute on the right. Assume that there are no attributes other than those mentioned in the FDs. F = {A → B, B → C }. {F}+ is {A → B, B → C , A → C , AC → B, AB → C } F = {AB → C , BC → A, AC → B}. {F}+ is {AB → C , BC → A, AC → B} T. M. Murali October 27, November 1, 2010 CS 4504: Design of Relational Database Closing and Projecting FDs Anomalies BCNF Properties of BCNF Decompositions Examples of Computing Closures of FDs Let us include only completely non-trivial FDs in these examples, with a single attribute on the right. Assume that there are no attributes other than those mentioned in the FDs. F = {A → B, B → C }. {F}+ is {A → B, B → C , A → C , AC → B, AB → C } F = {AB → C , BC → A, AC → B}. {F}+ is {AB → C , BC → A, AC → B} F = {A → B, B → C , C → D}. {F}+ is T. M. Murali October 27, November 1, 2010 CS 4504: Design of Relational Database Closing and Projecting FDs Anomalies BCNF Properties of BCNF Decompositions Examples of Computing Closures of FDs Let us include only completely non-trivial FDs in these examples, with a single attribute on the right. Assume that there are no attributes other than those mentioned in the FDs. F = {A → B, B → C }. {F}+ is {A → B, B → C , A → C , AC → B, AB → C } F = {AB → C , BC → A, AC → B}. {F}+ is {AB → C , BC → A, AC → B} F = {A → B, B → C , C → D}. {F}+ is {A → B, B → C , C → D, A → C , A → D, B → D, . . .} T. M. Murali October 27, November 1, 2010 CS 4504: Design of Relational Database Closing and Projecting FDs Anomalies BCNF Properties of BCNF Decompositions Closures of FDs vs. Closures of Attributes Both algorithms take as input a relation R and a set of FDs F. Closure of FDs: + Computes {F} , the set of all FDs that follow from F. Output is a set of FDs. Output may contain an exponential number of FDs. Closure of attributes: In addition, takes a set {A1 , A2 , . . . , An } of attributes as input. + Computes {A1 , A2 , . . . , An } , the set of all attributes B such that the A1 A2 . . . An → B follows from F. Output is a set of attributes. Output may contain at most the number of attributes in R. T. M. Murali October 27, November 1, 2010 CS 4504: Design of Relational Database Closing and Projecting FDs Anomalies BCNF Properties of BCNF Decompositions Projecting Sets of FDs Suppose we have a relation R and a set of FDs F. Let S be a relation obtained by projecting R into a subset of the attributes of R, i.e., S = πAttributes (R). The projection FS of F is the set of FDs that follow from F and hold in S (involve only attributes of S). Algorithm for computing FS : + 1. Compute {F} . + 2. FS is the set of all FDs in {F} that involve only the attributes in S. Book describes a diﬀerent algorithm on page 82 (Chapter 3.2.8). Book’s algorithm also shows how to compute a minimal basis of FS . T. M. Murali October 27, November 1, 2010 CS 4504: Design of Relational Database Closing and Projecting FDs Anomalies BCNF Properties of BCNF Decompositions Projecting Sets of FDs Suppose we have a relation R and a set of FDs F. Let S be a relation obtained by projecting R into a subset of the attributes of R, i.e., S = πAttributes (R). The projection FS of F is the set of FDs that follow from F and hold in S (involve only attributes of S). Algorithm for computing FS : + 1. Compute {F} . + 2. FS is the set of all FDs in {F} that involve only the attributes in S. Book describes a diﬀerent algorithm on page 82 (Chapter 3.2.8). Book’s algorithm also shows how to compute a minimal basis of FS . R(A, B, C , D), F = {A → B, B → C , C → D}. Which FDs hold in S(A, C , D)? T. M. Murali October 27, November 1, 2010 CS 4504: Design of Relational Database Closing and Projecting FDs Anomalies BCNF Properties of BCNF Decompositions Projecting Sets of FDs Suppose we have a relation R and a set of FDs F. Let S be a relation obtained by projecting R into a subset of the attributes of R, i.e., S = πAttributes (R). The projection FS of F is the set of FDs that follow from F and hold in S (involve only attributes of S). Algorithm for computing FS : + 1. Compute {F} . + 2. FS is the set of all FDs in {F} that involve only the attributes in S. Book describes a diﬀerent algorithm on page 82 (Chapter 3.2.8). Book’s algorithm also shows how to compute a minimal basis of FS . R(A, B, C , D), F = {A → B, B → C , C → D}. Which FDs hold in S(A, C , D)? {F}+ is T. M. Murali October 27, November 1, 2010 CS 4504: Design of Relational Database Closing and Projecting FDs Anomalies BCNF Properties of BCNF Decompositions Projecting Sets of FDs Suppose we have a relation R and a set of FDs F. Let S be a relation obtained by projecting R into a subset of the attributes of R, i.e., S = πAttributes (R). The projection FS of F is the set of FDs that follow from F and hold in S (involve only attributes of S). Algorithm for computing FS : + 1. Compute {F} . + 2. FS is the set of all FDs in {F} that involve only the attributes in S. Book describes a diﬀerent algorithm on page 82 (Chapter 3.2.8). Book’s algorithm also shows how to compute a minimal basis of FS . R(A, B, C , D), F = {A → B, B → C , C → D}. Which FDs hold in S(A, C , D)? {F}+ is {A → B, B → C , C → D, A → C , A → D, B → D} T. M. Murali October 27, November 1, 2010 CS 4504: Design of Relational Database Closing and Projecting FDs Anomalies BCNF Properties of BCNF Decompositions Projecting Sets of FDs Suppose we have a relation R and a set of FDs F. Let S be a relation obtained by projecting R into a subset of the attributes of R, i.e., S = πAttributes (R). The projection FS of F is the set of FDs that follow from F and hold in S (involve only attributes of S). Algorithm for computing FS : + 1. Compute {F} . + 2. FS is the set of all FDs in {F} that involve only the attributes in S. Book describes a diﬀerent algorithm on page 82 (Chapter 3.2.8). Book’s algorithm also shows how to compute a minimal basis of FS . R(A, B, C , D), F = {A → B, B → C , C → D}. Which FDs hold in S(A, C , D)? {F}+ is {A → B, B → C , C → D, A → C , A → D, B → D} FS is {C → D, A → C , A → D}. T. M. Murali October 27, November 1, 2010 CS 4504: Design of Relational Database Closing and Projecting FDs Anomalies BCNF Properties of BCNF Decompositions Design of Relational Database Schemas Careless design of relational schemas can cause problems. Example: Combining the relation for a many-many relationship with the relation for one of its entity sets causes redundancy. T. M. Murali October 27, November 1, 2010 CS 4504: Design of Relational Database Closing and Projecting FDs Anomalies BCNF Properties of BCNF Decompositions Design of Relational Database Schemas Careless design of relational schemas can cause problems. Example: Combining the relation for a many-many relationship with the relation for one of its entity sets causes redundancy. Suppose we combine the schemas Courses(Number, DepartmentName, CourseName, Classroom, Enrollment) and Take(StudentName, Address, Number, DepartmentName) into one relation Courses(Number, DepartmentName, CourseName, Classroom, Enrollment, StudentName, Address). T. M. Murali October 27, November 1, 2010 CS 4504: Design of Relational Database Closing and Projecting FDs Anomalies BCNF Properties of BCNF Decompositions Anomalies in Relational Schemas Courses(Number, DepartmentName, CourseName, Classroom, Enrollment, StudentName, Address) T. M. Murali October 27, November 1, 2010 CS 4504: Design of Relational Database Closing and Projecting FDs Anomalies BCNF Properties of BCNF Decompositions Anomalies in Relational Schemas Courses(Number, DepartmentName, CourseName, Classroom, Enrollment, StudentName, Address) Redundancy: information is repeated unnecessarily in several tuples. Update anomalies: We change information in one tuple but leave the old information in another tuple. Insertion anomalies: It is not possible to store some information unless some other, unrelated information is stored as well. Deletion anomalies: If a set of values becomes empty, we may lose other information as a side eﬀect. T. M. Murali October 27, November 1, 2010 CS 4504: Design of Relational Database Closing and Projecting FDs Anomalies BCNF Properties of BCNF Decompositions Decomposing Relations Accepted way to eliminate anomalies is to “decompose” relations. Given a relation R(A1 , A2 , . . . , An ), two relations S(B1 , B2 , . . . , Bm ) and T (C1 , C2 , . . . , Ck ) form a decomposition of R if 1. the attributes of S and T together make up the attributes of R, i.e., {A1 , A2 , . . . , An } = {B1 , B2 , . . . , Bm } ∪ {C1 , C2 , . . . , Ck }. 2. the tuples in S are the projections into {B1 , B2 , . . . , Bm } of the tuples in R, i.e. S ≡ πB1 ,B2 ,...,Bm (R). 3. the tuples in T are the projections into {C1 , C2 , . . . , Ck } of the tuples in R, i.e., T ≡ πC1 ,C2 ,...,Ck (R). T. M. Murali October 27, November 1, 2010 CS 4504: Design of Relational Database Closing and Projecting FDs Anomalies BCNF Properties of BCNF Decompositions Example of Decomposition Decompose Courses into Courses1(Number, DepartmentName, CourseName, Classroom, Enrollment) and Courses2(Number, DepartmentName, StudentName, Address). Are the anomalies removed? Redundancy Update Insertion Deletion T. M. Murali October 27, November 1, 2010 CS 4504: Design of Relational Database Closing and Projecting FDs Anomalies BCNF Properties of BCNF Decompositions Boyce-Codd Normal Form Condition on the FDs in a relation that guarantees that anomalies do not exist. T. M. Murali October 27, November 1, 2010 CS 4504: Design of Relational Database Closing and Projecting FDs Anomalies BCNF Properties of BCNF Decompositions Boyce-Codd Normal Form Condition on the FDs in a relation that guarantees that anomalies do not exist. A relation R is in Boyce-Codd Normal Form (BCNF) if and only if for every non-trivial FD A1 A2 . . . An → B for R, {A1 , A2 , . . . , An } is a superkey for R. Informally, the left side of every non-trivial FD must be a superkey. A relation R violates BCNF if it has an FD such that the attributes of the left side of an FD do not form a superkey. In other words, there is some key of R such that only some (not all) of the attributes in this key appear on the left side of the FD. T. M. Murali October 27, November 1, 2010 CS 4504: Design of Relational Database Closing and Projecting FDs Anomalies BCNF Properties of BCNF Decompositions Checking for BCNF Violations 1. List all FDs. 2. Ensure that left hand side of each FD is a superkey. T. M. Murali October 27, November 1, 2010 CS 4504: Design of Relational Database Closing and Projecting FDs Anomalies BCNF Properties of BCNF Decompositions Checking for BCNF Violations 1. List all FDs. 2. Ensure that left hand side of each FD is a superkey. We have to ﬁrst ﬁnd all the keys! T. M. Murali October 27, November 1, 2010 CS 4504: Design of Relational Database Closing and Projecting FDs Anomalies BCNF Properties of BCNF Decompositions Checking for BCNF Violations 1. List all FDs. 2. Ensure that left hand side of each FD is a superkey. We have to ﬁrst ﬁnd all the keys! Is Courses(Number, DepartmentName, CourseName, Classroom, Enrollment, StudentName, Address) in BCNF? T. M. Murali October 27, November 1, 2010 CS 4504: Design of Relational Database Closing and Projecting FDs Anomalies BCNF Properties of BCNF Decompositions Checking for BCNF Violations 1. List all FDs. 2. Ensure that left hand side of each FD is a superkey. We have to ﬁrst ﬁnd all the keys! Is Courses(Number, DepartmentName, CourseName, Classroom, Enrollment, StudentName, Address) in BCNF? FDs are T. M. Murali October 27, November 1, 2010 CS 4504: Design of Relational Database Closing and Projecting FDs Anomalies BCNF Properties of BCNF Decompositions Checking for BCNF Violations 1. List all FDs. 2. Ensure that left hand side of each FD is a superkey. We have to ﬁrst ﬁnd all the keys! Is Courses(Number, DepartmentName, CourseName, Classroom, Enrollment, StudentName, Address) in BCNF? FDs are Number DepartmentName → CourseName Number DepartmentName → Classroom Number DepartmentName → Enrollment T. M. Murali October 27, November 1, 2010 CS 4504: Design of Relational Database Closing and Projecting FDs Anomalies BCNF Properties of BCNF Decompositions Checking for BCNF Violations 1. List all FDs. 2. Ensure that left hand side of each FD is a superkey. We have to ﬁrst ﬁnd all the keys! Is Courses(Number, DepartmentName, CourseName, Classroom, Enrollment, StudentName, Address) in BCNF? FDs are Number DepartmentName → CourseName Number DepartmentName → Classroom Number DepartmentName → Enrollment What is {Number, DepartmentName}+ ? T. M. Murali October 27, November 1, 2010 CS 4504: Design of Relational Database Closing and Projecting FDs Anomalies BCNF Properties of BCNF Decompositions Checking for BCNF Violations 1. List all FDs. 2. Ensure that left hand side of each FD is a superkey. We have to ﬁrst ﬁnd all the keys! Is Courses(Number, DepartmentName, CourseName, Classroom, Enrollment, StudentName, Address) in BCNF? FDs are Number DepartmentName → CourseName Number DepartmentName → Classroom Number DepartmentName → Enrollment What is {Number, DepartmentName}+ ? {Number, DepartmentName, Coursename, Classroom, Enrollment}. T. M. Murali October 27, November 1, 2010 CS 4504: Design of Relational Database Closing and Projecting FDs Anomalies BCNF Properties of BCNF Decompositions Checking for BCNF Violations 1. List all FDs. 2. Ensure that left hand side of each FD is a superkey. We have to ﬁrst ﬁnd all the keys! Is Courses(Number, DepartmentName, CourseName, Classroom, Enrollment, StudentName, Address) in BCNF? FDs are Number DepartmentName → CourseName Number DepartmentName → Classroom Number DepartmentName → Enrollment What is {Number, DepartmentName}+ ? {Number, DepartmentName, Coursename, Classroom, Enrollment}. Therefore, the key is {Number, DepartmentName, StudentName, Address}. T. M. Murali October 27, November 1, 2010 CS 4504: Design of Relational Database Closing and Projecting FDs Anomalies BCNF Properties of BCNF Decompositions Checking for BCNF Violations 1. List all FDs. 2. Ensure that left hand side of each FD is a superkey. We have to ﬁrst ﬁnd all the keys! Is Courses(Number, DepartmentName, CourseName, Classroom, Enrollment, StudentName, Address) in BCNF? FDs are Number DepartmentName → CourseName Number DepartmentName → Classroom Number DepartmentName → Enrollment What is {Number, DepartmentName}+ ? {Number, DepartmentName, Coursename, Classroom, Enrollment}. Therefore, the key is {Number, DepartmentName, StudentName, Address}. The relation is not in BCNF. T. M. Murali October 27, November 1, 2010 CS 4504: Design of Relational Database Closing and Projecting FDs Anomalies BCNF Properties of BCNF Decompositions Decomposition into BCNF Suppose R is a relation schema that violates BCNF. We can decompose R into a set S of new relations such that 1. each relation in S is in BCNF and 2. we can “recover” R from the relations in S, i.e., we can reconstruct R exactly from the relations in S. T. M. Murali October 27, November 1, 2010 CS 4504: Design of Relational Database Closing and Projecting FDs Anomalies BCNF Properties of BCNF Decompositions BCNF Normalisation Algorithm Let A be the set of all attributes of R. Let F be the set of all FDs of R. Suppose the FD X1 X2 . . . Xm → Y violates BCNF. T. M. Murali October 27, November 1, 2010 CS 4504: Design of Relational Database Closing and Projecting FDs Anomalies BCNF Properties of BCNF Decompositions BCNF Normalisation Algorithm Let A be the set of all attributes of R. Let F be the set of all FDs of R. Suppose the FD X1 X2 . . . Xm → Y violates BCNF. 1. Compute {X1 X2 . . . , Xm }+ . T. M. Murali October 27, November 1, 2010 CS 4504: Design of Relational Database Closing and Projecting FDs Anomalies BCNF Properties of BCNF Decompositions BCNF Normalisation Algorithm Let A be the set of all attributes of R. Let F be the set of all FDs of R. Suppose the FD X1 X2 . . . Xm → Y violates BCNF. 1. Compute {X1 X2 . . . , Xm }+ . 2. Decompose R into two relations R1 and R2 with schemas R1 : all the attributes in {X1 , X2 . . . , Xm }+ R2 : all the attributes on the left side of the violating FD and all the attributes of R not in {X1 , X2 , . . . , Xm }+ , i.e., A − {X1 , X2 . . . , Xm }+ ∪ {X1 , X2 . . . , Xm }. T. M. Murali October 27, November 1, 2010 CS 4504: Design of Relational Database Closing and Projecting FDs Anomalies BCNF Properties of BCNF Decompositions BCNF Normalisation Algorithm Let A be the set of all attributes of R. Let F be the set of all FDs of R. Suppose the FD X1 X2 . . . Xm → Y violates BCNF. 1. Compute {X1 X2 . . . , Xm }+ . 2. Decompose R into two relations R1 and R2 with schemas R1 : all the attributes in {X1 , X2 . . . , Xm }+ R2 : all the attributes on the left side of the violating FD and all the attributes of R not in {X1 , X2 , . . . , Xm }+ , i.e., A − {X1 , X2 . . . , Xm }+ ∪ {X1 , X2 . . . , Xm }. 3. Find FDs in R1 and R2 and decompose them if they are not in BCNF 4. For i = 1, 2 Compute FRi the projection of the FDs in F into Ri . If any of the FDs in FRi violates BCNF, decompose Ri recursively. T. M. Murali October 27, November 1, 2010 CS 4504: Design of Relational Database Closing and Projecting FDs Anomalies BCNF Properties of BCNF Decompositions Decomposing Courses Schema is Courses(Number, DepartmentName, CourseName, Classroom, Enrollment, StudentName, Address). BCNF-violating FD is Number DepartmentName → CourseName Classroom Enrollment. T. M. Murali October 27, November 1, 2010 CS 4504: Design of Relational Database Closing and Projecting FDs Anomalies BCNF Properties of BCNF Decompositions Decomposing Courses Schema is Courses(Number, DepartmentName, CourseName, Classroom, Enrollment, StudentName, Address). BCNF-violating FD is Number DepartmentName → CourseName Classroom Enrollment. What is {Number, DepartmentName}+ ? T. M. Murali October 27, November 1, 2010 CS 4504: Design of Relational Database Closing and Projecting FDs Anomalies BCNF Properties of BCNF Decompositions Decomposing Courses Schema is Courses(Number, DepartmentName, CourseName, Classroom, Enrollment, StudentName, Address). BCNF-violating FD is Number DepartmentName → CourseName Classroom Enrollment. What is {Number, DepartmentName}+ ? {Number, DepartmentName, CourseName, Classroom, Enrollment T. M. Murali October 27, November 1, 2010 CS 4504: Design of Relational Database Closing and Projecting FDs Anomalies BCNF Properties of BCNF Decompositions Decomposing Courses Schema is Courses(Number, DepartmentName, CourseName, Classroom, Enrollment, StudentName, Address). BCNF-violating FD is Number DepartmentName → CourseName Classroom Enrollment. What is {Number, DepartmentName}+ ? {Number, DepartmentName, CourseName, Classroom, Enrollment Decompose Courses into Courses1(Number, DepartmentName, CourseName, Classroom, Enrollment) and Courses2(Number, DepartmentName, StudentName, Address). T. M. Murali October 27, November 1, 2010 CS 4504: Design of Relational Database Closing and Projecting FDs Anomalies BCNF Properties of BCNF Decompositions Decomposing Courses Schema is Courses(Number, DepartmentName, CourseName, Classroom, Enrollment, StudentName, Address). BCNF-violating FD is Number DepartmentName → CourseName Classroom Enrollment. What is {Number, DepartmentName}+ ? {Number, DepartmentName, CourseName, Classroom, Enrollment Decompose Courses into Courses1(Number, DepartmentName, CourseName, Classroom, Enrollment) and Courses2(Number, DepartmentName, StudentName, Address). Are there any BCNF violations in the two new relations? T. M. Murali October 27, November 1, 2010 CS 4504: Design of Relational Database Closing and Projecting FDs Anomalies BCNF Properties of BCNF Decompositions Another Example of Decomposition (1) Schema is Students(Id, Name, AdvisorId, AdvisorName, FavouriteAdvisorId) T. M. Murali October 27, November 1, 2010 CS 4504: Design of Relational Database Closing and Projecting FDs Anomalies BCNF Properties of BCNF Decompositions Another Example of Decomposition (1) Schema is Students(Id, Name, AdvisorId, AdvisorName, FavouriteAdvisorId) What are the FDs? ID → Name FavouriteAdvisorId AdvisorId → AdvisorName T. M. Murali October 27, November 1, 2010 CS 4504: Design of Relational Database Closing and Projecting FDs Anomalies BCNF Properties of BCNF Decompositions Another Example of Decomposition (1) Schema is Students(Id, Name, AdvisorId, AdvisorName, FavouriteAdvisorId) What are the FDs? ID → Name FavouriteAdvisorId AdvisorId → AdvisorName What is the key? T. M. Murali October 27, November 1, 2010 CS 4504: Design of Relational Database Closing and Projecting FDs Anomalies BCNF Properties of BCNF Decompositions Another Example of Decomposition (1) Schema is Students(Id, Name, AdvisorId, AdvisorName, FavouriteAdvisorId) What are the FDs? ID → Name FavouriteAdvisorId AdvisorId → AdvisorName What is the key? {ID, AdvisorId}. T. M. Murali October 27, November 1, 2010 CS 4504: Design of Relational Database Closing and Projecting FDs Anomalies BCNF Properties of BCNF Decompositions Another Example of Decomposition (1) Schema is Students(Id, Name, AdvisorId, AdvisorName, FavouriteAdvisorId) What are the FDs? ID → Name FavouriteAdvisorId AdvisorId → AdvisorName What is the key? {ID, AdvisorId}. Is there a BCNF violation? T. M. Murali October 27, November 1, 2010 CS 4504: Design of Relational Database Closing and Projecting FDs Anomalies BCNF Properties of BCNF Decompositions Another Example of Decomposition (1) Schema is Students(Id, Name, AdvisorId, AdvisorName, FavouriteAdvisorId) What are the FDs? ID → Name FavouriteAdvisorId AdvisorId → AdvisorName What is the key? {ID, AdvisorId}. Is there a BCNF violation? Yes. T. M. Murali October 27, November 1, 2010 CS 4504: Design of Relational Database Closing and Projecting FDs Anomalies BCNF Properties of BCNF Decompositions Another Example of Decomposition (1) Schema is Students(Id, Name, AdvisorId, AdvisorName, FavouriteAdvisorId) What are the FDs? ID → Name FavouriteAdvisorId AdvisorId → AdvisorName What is the key? {ID, AdvisorId}. Is there a BCNF violation? Yes. Use ID → Name FavouriteAdvisorId to decompose. T. M. Murali October 27, November 1, 2010 CS 4504: Design of Relational Database Closing and Projecting FDs Anomalies BCNF Properties of BCNF Decompositions Another Example of Decomposition (1) Schema is Students(Id, Name, AdvisorId, AdvisorName, FavouriteAdvisorId) What are the FDs? ID → Name FavouriteAdvisorId AdvisorId → AdvisorName What is the key? {ID, AdvisorId}. Is there a BCNF violation? Yes. Use ID → Name FavouriteAdvisorId to decompose. + {ID} is T. M. Murali October 27, November 1, 2010 CS 4504: Design of Relational Database Closing and Projecting FDs Anomalies BCNF Properties of BCNF Decompositions Another Example of Decomposition (1) Schema is Students(Id, Name, AdvisorId, AdvisorName, FavouriteAdvisorId) What are the FDs? ID → Name FavouriteAdvisorId AdvisorId → AdvisorName What is the key? {ID, AdvisorId}. Is there a BCNF violation? Yes. Use ID → Name FavouriteAdvisorId to decompose. + {ID} is {ID, Name, FavouriteAdvisorId}. T. M. Murali October 27, November 1, 2010 CS 4504: Design of Relational Database Closing and Projecting FDs Anomalies BCNF Properties of BCNF Decompositions Another Example of Decomposition (1) Schema is Students(Id, Name, AdvisorId, AdvisorName, FavouriteAdvisorId) What are the FDs? ID → Name FavouriteAdvisorId AdvisorId → AdvisorName What is the key? {ID, AdvisorId}. Is there a BCNF violation? Yes. Use ID → Name FavouriteAdvisorId to decompose. + {ID} is {ID, Name, FavouriteAdvisorId}. Schemas for new relations are T. M. Murali October 27, November 1, 2010 CS 4504: Design of Relational Database Closing and Projecting FDs Anomalies BCNF Properties of BCNF Decompositions Another Example of Decomposition (1) Schema is Students(Id, Name, AdvisorId, AdvisorName, FavouriteAdvisorId) What are the FDs? ID → Name FavouriteAdvisorId AdvisorId → AdvisorName What is the key? {ID, AdvisorId}. Is there a BCNF violation? Yes. Use ID → Name FavouriteAdvisorId to decompose. + {ID} is {ID, Name, FavouriteAdvisorId}. Schemas for new relations are Students1(ID, Name, FavouriteAdvisorId) Students2(ID, AdvisorId, AdvisorName) T. M. Murali October 27, November 1, 2010 CS 4504: Design of Relational Database Closing and Projecting FDs Anomalies BCNF Properties of BCNF Decompositions Another Example of Decomposition (2) What are the FDs in Student1(ID, Name, FavouriteAdvisorId)? T. M. Murali October 27, November 1, 2010 CS 4504: Design of Relational Database Closing and Projecting FDs Anomalies BCNF Properties of BCNF Decompositions Another Example of Decomposition (2) What are the FDs in Student1(ID, Name, FavouriteAdvisorId)? There are none that violate BCNF. T. M. Murali October 27, November 1, 2010 CS 4504: Design of Relational Database Closing and Projecting FDs Anomalies BCNF Properties of BCNF Decompositions Another Example of Decomposition (2) What are the FDs in Student1(ID, Name, FavouriteAdvisorId)? There are none that violate BCNF. What are the FDs in Students2(ID, AdvisorId, AdvisorName)? T. M. Murali October 27, November 1, 2010 CS 4504: Design of Relational Database Closing and Projecting FDs Anomalies BCNF Properties of BCNF Decompositions Another Example of Decomposition (2) What are the FDs in Student1(ID, Name, FavouriteAdvisorId)? There are none that violate BCNF. What are the FDs in Students2(ID, AdvisorId, AdvisorName)? AdvisorId → AdvisorName Does it violate BCNF? T. M. Murali October 27, November 1, 2010 CS 4504: Design of Relational Database Closing and Projecting FDs Anomalies BCNF Properties of BCNF Decompositions Another Example of Decomposition (2) What are the FDs in Student1(ID, Name, FavouriteAdvisorId)? There are none that violate BCNF. What are the FDs in Students2(ID, AdvisorId, AdvisorName)? AdvisorId → AdvisorName Does it violate BCNF? Yes. Repeat the decomposition process. Use AdvisorId → AdvisorName to decompose. + {AdvisorId} is T. M. Murali October 27, November 1, 2010 CS 4504: Design of Relational Database Closing and Projecting FDs Anomalies BCNF Properties of BCNF Decompositions Another Example of Decomposition (2) What are the FDs in Student1(ID, Name, FavouriteAdvisorId)? There are none that violate BCNF. What are the FDs in Students2(ID, AdvisorId, AdvisorName)? AdvisorId → AdvisorName Does it violate BCNF? Yes. Repeat the decomposition process. Use AdvisorId → AdvisorName to decompose. + {AdvisorId} is {AdvisorId, AdvisorName}. T. M. Murali October 27, November 1, 2010 CS 4504: Design of Relational Database Closing and Projecting FDs Anomalies BCNF Properties of BCNF Decompositions Another Example of Decomposition (2) What are the FDs in Student1(ID, Name, FavouriteAdvisorId)? There are none that violate BCNF. What are the FDs in Students2(ID, AdvisorId, AdvisorName)? AdvisorId → AdvisorName Does it violate BCNF? Yes. Repeat the decomposition process. Use AdvisorId → AdvisorName to decompose. + {AdvisorId} is {AdvisorId, AdvisorName}. Schemas for new relations are T. M. Murali October 27, November 1, 2010 CS 4504: Design of Relational Database Closing and Projecting FDs Anomalies BCNF Properties of BCNF Decompositions Another Example of Decomposition (2) What are the FDs in Student1(ID, Name, FavouriteAdvisorId)? There are none that violate BCNF. What are the FDs in Students2(ID, AdvisorId, AdvisorName)? AdvisorId → AdvisorName Does it violate BCNF? Yes. Repeat the decomposition process. Use AdvisorId → AdvisorName to decompose. + {AdvisorId} is {AdvisorId, AdvisorName}. Schemas for new relations are Students2(ID, AdvisorId) Students3(AdvisorId, AdvisorName). T. M. Murali October 27, November 1, 2010 CS 4504: Design of Relational Database Closing and Projecting FDs Anomalies BCNF Properties of BCNF Decompositions Examples (Problem 1 of Handout 3) Apply the BCNF normalisation algorithm to Inventory relation. (Problem 2 of Handout 3) Apply the BCNF normalisation algorithm to Concerts relation. For both problems, try all possible choices of FDs to start the normalisation with. Compare the advantages and disadvantages of the choices. T. M. Murali October 27, November 1, 2010 CS 4504: Design of Relational Database Closing and Projecting FDs Anomalies BCNF Properties of BCNF Decompositions BCNFs and Two-Attribute Relations True or False: Every two-attribute relation R(A, B) is in BCNF. T. M. Murali October 27, November 1, 2010 CS 4504: Design of Relational Database Closing and Projecting FDs Anomalies BCNF Properties of BCNF Decompositions BCNFs and Two-Attribute Relations True or False: Every two-attribute relation R(A, B) is in BCNF. The statement is true. Why? T. M. Murali October 27, November 1, 2010 CS 4504: Design of Relational Database Closing and Projecting FDs Anomalies BCNF Properties of BCNF Decompositions BCNFs and Two-Attribute Relations True or False: Every two-attribute relation R(A, B) is in BCNF. The statement is true. Why? Consider four possible cases: 1. There are no non-trivial FDs. 2. A → B is the only non-trivial FD. 3. B → A is the only non-trivial FD. 4. Both A → B and B → A hold in R. T. M. Murali October 27, November 1, 2010 CS 4504: Design of Relational Database Closing and Projecting FDs Anomalies BCNF Properties of BCNF Decompositions Decomposition into BCNF Suppose R is a relation schema that violates BCNF. We can decompose R into a set S of two or more new relations such that 1. each relation in S is in BCNF and 2. we can “recover” R from the relations in S, i.e., we can reconstruct R from the relations in S. How does the normalisation algorithm guarantee the second condition? In general, what properties does the decomposition satisfy? T. M. Murali October 27, November 1, 2010 CS 4504: Design of Relational Database Closing and Projecting FDs Anomalies BCNF Properties of BCNF Decompositions Desirable Properties of a Decomposition 1. Eliminate anomalies. 2. Recover the original relation exactly from the relations it is decomposed into. 3. When we reconstruct the original relation, the result will satisfy the original FDs. T. M. Murali October 27, November 1, 2010 CS 4504: Design of Relational Database Closing and Projecting FDs Anomalies BCNF Properties of BCNF Decompositions Desirable Properties of a Decomposition 1. Eliminate anomalies. 2. Recover the original relation exactly from the relations it is decomposed into. 3. When we reconstruct the original relation, the result will satisfy the original FDs. BCNF decomposition algorithm: T. M. Murali October 27, November 1, 2010 CS 4504: Design of Relational Database Closing and Projecting FDs Anomalies BCNF Properties of BCNF Decompositions Desirable Properties of a Decomposition 1. Eliminate anomalies. 2. Recover the original relation exactly from the relations it is decomposed into. 3. When we reconstruct the original relation, the result will satisfy the original FDs. BCNF decomposition algorithm: gives us properties 1 and 2 but not 3. 3NF decomposition algorithm: gives us properties 2 and 3 but not 1. (Discuss in the next class.) T. M. Murali October 27, November 1, 2010 CS 4504: Design of Relational Database Closing and Projecting FDs Anomalies BCNF Properties of BCNF Decompositions Candidate Normalisation Algorithm Every two-attribute relation is in BCNF. T. M. Murali October 27, November 1, 2010 CS 4504: Design of Relational Database Closing and Projecting FDs Anomalies BCNF Properties of BCNF Decompositions Candidate Normalisation Algorithm Every two-attribute relation is in BCNF. Can we bring any relation R into BCNF by arbitrarily decomposing it into two-attribute relations? T. M. Murali October 27, November 1, 2010 CS 4504: Design of Relational Database Closing and Projecting FDs Anomalies BCNF Properties of BCNF Decompositions Candidate Normalisation Algorithm Every two-attribute relation is in BCNF. Can we bring any relation R into BCNF by arbitrarily decomposing it into two-attribute relations? No, since we may not be able to recover R correctly from the decomposition. T. M. Murali October 27, November 1, 2010 CS 4504: Design of Relational Database Closing and Projecting FDs Anomalies BCNF Properties of BCNF Decompositions Joining Relations T R S A B C A B B C a b1 c1 a1 b1 b1 c1 = 1 a2 b1 c1 a2 b1 b2 c2 a2 b2 c2 a2 b2 b2 c3 a2 b2 c3 Let R and S be two relations with one common attribute B. Relation T is the natural join of R and S, denoted R S if and only if T. M. Murali October 27, November 1, 2010 CS 4504: Design of Relational Database Closing and Projecting FDs Anomalies BCNF Properties of BCNF Decompositions Joining Relations T R S A B C A B B C a b1 c1 a1 b1 b1 c1 = 1 a2 b1 c1 a2 b1 b2 c2 a2 b2 c2 a2 b2 b2 c3 a2 b2 c3 Let R and S be two relations with one common attribute B. Relation T is the natural join of R and S, denoted R S if and only if the attributes of T are the union of the attributes of R and S, T. M. Murali October 27, November 1, 2010 CS 4504: Design of Relational Database Closing and Projecting FDs Anomalies BCNF Properties of BCNF Decompositions Joining Relations T R S A B C A B B C a b1 c1 a1 b1 b1 c1 = 1 a2 b1 c1 a2 b1 b2 c2 a2 b2 c2 a2 b2 b2 c3 a2 b2 c3 Let R and S be two relations with one common attribute B. Relation T is the natural join of R and S, denoted R S if and only if the attributes of T are the union of the attributes of R and S, every tuple t ∈ T is the join of two tuples r ∈ R and s ∈ S that agree on the attribute B, i.e., t agrees with r on all the attributes in R and with s on all attributes in S, T. M. Murali October 27, November 1, 2010 CS 4504: Design of Relational Database Closing and Projecting FDs Anomalies BCNF Properties of BCNF Decompositions Joining Relations T R S A B C A B B C a b1 c1 a1 b1 b1 c1 = 1 a2 b1 c1 a2 b1 b2 c2 a2 b2 c2 a2 b2 b2 c3 a2 b2 c3 Let R and S be two relations with one common attribute B. Relation T is the natural join of R and S, denoted R S if and only if the attributes of T are the union of the attributes of R and S, every tuple t ∈ T is the join of two tuples r ∈ R and s ∈ S that agree on the attribute B, i.e., t agrees with r on all the attributes in R and with s on all attributes in S, T contains all tuples formed in this manner. T. M. Murali October 27, November 1, 2010 CS 4504: Design of Relational Database Closing and Projecting FDs Anomalies BCNF Properties of BCNF Decompositions Recovering Information from a Decomposition Suppose R is a relation schema that violates BCNF. The BCNF decomposition algorithm decomposes R into a set {S1 , S2 , . . . Sk } of new relations such that 1. each relation Si , 1 ≤ i ≤ k is in BCNF and 2. the decomposition of R into {S1 , S2 , . . . Sk } is a lossless-join decomposition, i.e., R = S1 S2 . . . Sk . T. M. Murali October 27, November 1, 2010 CS 4504: Design of Relational Database Closing and Projecting FDs Anomalies BCNF Properties of BCNF Decompositions Recovering Information from a Decomposition Suppose R is a relation schema that violates BCNF. The BCNF decomposition algorithm decomposes R into a set {S1 , S2 , . . . Sk } of new relations such that 1. each relation Si , 1 ≤ i ≤ k is in BCNF and 2. the decomposition of R into {S1 , S2 , . . . Sk } is a lossless-join decomposition, i.e., R = S1 S2 . . . Sk . 2.1 Every tuple in R is a tuple in S1 S2 . . . Sk . 2.2 Every tuple in S1 S2 . . . Sk is in R. T. M. Murali October 27, November 1, 2010 CS 4504: Design of Relational Database Closing and Projecting FDs Anomalies BCNF Properties of BCNF Decompositions Example of Lossless-Join Decomposition Relation schema is R(A, B, C ). FD is B → C . T. M. Murali October 27, November 1, 2010 CS 4504: Design of Relational Database Closing and Projecting FDs Anomalies BCNF Properties of BCNF Decompositions Example of Lossless-Join Decomposition Relation schema is R(A, B, C ). FD is B → C . Relations in BCNF are S(A, B) and T (B, C ). T. M. Murali October 27, November 1, 2010 CS 4504: Design of Relational Database Closing and Projecting FDs Anomalies BCNF Properties of BCNF Decompositions Example of Lossless-Join Decomposition Relation schema is R(A, B, C ). FD is B → C . Relations in BCNF are S(A, B) and T (B, C ). Prove that R = S T 1. Every tuple in R is in S T . 2. Every tuple in S T is in R. T. M. Murali October 27, November 1, 2010 CS 4504: Design of Relational Database Closing and Projecting FDs Anomalies BCNF Properties of BCNF Decompositions Example of Lossless-Join Decomposition Relation schema is R(A, B, C ). FD is B → C . Relations in BCNF are S(A, B) and T (B, C ). Prove that R = S T 1. Every tuple in R is in S T . 2. Every tuple in S T is in R. What if FD were A → C and we decomposed R into S and T as above? T. M. Murali October 27, November 1, 2010 CS 4504: Design of Relational Database Closing and Projecting FDs Anomalies BCNF Properties of BCNF Decompositions Example of Lossless-Join Decomposition Relation schema is R(A, B, C ). FD is B → C . Relations in BCNF are S(A, B) and T (B, C ). Prove that R = S T 1. Every tuple in R is in S T . 2. Every tuple in S T is in R. What if FD were A → C and we decomposed R into S and T as above? S T contains tuples not in R! In general, if R’s attributes are X ∪ Y ∪ Z and Y → Z holds in R, then R = πX ∪Y (R) πY ∪Z (R). T. M. Murali October 27, November 1, 2010 CS 4504: Design of Relational Database Closing and Projecting FDs Anomalies BCNF Properties of BCNF Decompositions The Chase Test for Lossless Join Suppose we have a relation R, a set F of FDs that hold in R, and a decomposition of R into relations S1 , S2 , . . . , Sk . We have “forgotten” how we decomposed R. Is there a way to check R equals the natural join of S1 , S2 , . . . , Sk ? 1. Every tuple in R is a tuple in S1 S2 . . . Sk . 2. Every tuple in S1 S2 . . . Sk is in R. T. M. Murali October 27, November 1, 2010 CS 4504: Design of Relational Database Closing and Projecting FDs Anomalies BCNF Properties of BCNF Decompositions The Chase Test for Lossless Join Suppose we have a relation R, a set F of FDs that hold in R, and a decomposition of R into relations S1 , S2 , . . . , Sk . We have “forgotten” how we decomposed R. Is there a way to check R equals the natural join of S1 , S2 , . . . , Sk ? 1. Every tuple in R is a tuple in S1 S2 . . . Sk . 2. Every tuple in S1 S2 . . . Sk is in R. R S1 S2 S3 S1 ⋈S2⋈ S3 A B C D A D A C B C D A B C D T. M. Murali October 27, November 1, 2010 CS 4504: Design of Relational Database Closing and Projecting FDs Anomalies BCNF Properties of BCNF Decompositions The Chase Test for Lossless Join R S1 S2 S3 S1 ⋈S2⋈ S3 A B C D A D A C B C D A B C D t 1. Natural join is associate and commutative: For each tuple t in S1 S2 . . . Sk , projection of t into Si is a tuple in πSi (R), for every 1 ≤ i ≤ k. T. M. Murali October 27, November 1, 2010 CS 4504: Design of Relational Database Closing and Projecting FDs Anomalies BCNF Properties of BCNF Decompositions The Chase Test for Lossless Join R S1 S2 S3 S1 ⋈S2⋈ S3 A B C D A D A C B C D A B C D u u 1. Natural join is associate and commutative: For each tuple t in S1 S2 . . . Sk , projection of t into Si is a tuple in πSi (R), for every 1 ≤ i ≤ k. 2. Every tuple u in R is surely in πS1 (R) πS2 (R) ... πSk (R). T. M. Murali October 27, November 1, 2010 CS 4504: Design of Relational Database Closing and Projecting FDs Anomalies BCNF Properties of BCNF Decompositions The Chase Test for Lossless Join R S1 S2 S3 S1 ⋈S2⋈ S3 A B C D A D A C B C D A B C D t t 1. Natural join is associate and commutative: For each tuple t in S1 S2 . . . Sk , projection of t into Si is a tuple in πSi (R), for every 1 ≤ i ≤ k. 2. Every tuple u in R is surely in πS1 (R) πS2 (R) ... πSk (R). 3. How can we show that if the FDs in F hold in R, then every tuple in πS1 (R) πS2 (R) . . . πSk (R) is also a tuple in R? Use the Chase test. T. M. Murali October 27, November 1, 2010 CS 4504: Design of Relational Database Closing and Projecting FDs Anomalies BCNF Properties of BCNF Decompositions Steps in Chase Test R S1 S2 S3 S1 ⋈S2⋈ S3 A B C D A D A C B C D A B C D t1 t t2 t3 If a tuple t is in πS1 (R) πS2 (R) . . . πSk (R), then there must be tuples t1 , t2 , . . . tk in R such that t is the join of the projections of ti into Si , for every 1 ≤ i ≤ k. each ti agrees with t in the attributes in Si but has unknown values for the attributes not in Si . Using the FDs in F, we want to prove that t must be equal to some ti . T. M. Murali October 27, November 1, 2010 CS 4504: Design of Relational Database Closing and Projecting FDs Anomalies BCNF Properties of BCNF Decompositions Steps in Chase Test R S1 S2 S3 S1 ⋈S2⋈ S3 A B C D A D A C B C D A B C D t1 t t2 t3 If a tuple t is in πS1 (R) πS2 (R) . . . πSk (R), then there must be tuples t1 , t2 , . . . tk in R such that t is the join of the projections of ti into Si , for every 1 ≤ i ≤ k. each ti agrees with t in the attributes in Si but has unknown values for the attributes not in Si . Using the FDs in F, we want to prove that t must be equal to some ti . 1. Draw a tableau to indicate which attributes we know the values of in the tuples t1 , t2 , . . . tk . 2. Use FDs to equate unknown attributes of these tuples. 3. When no FD can be applied, check if t is one of the tuples in the tableau. T. M. Murali October 27, November 1, 2010 CS 4504: Design of Relational Database Closing and Projecting FDs Anomalies BCNF Properties of BCNF Decompositions Example of Chase Test Work out following two cases: 1. Decomposition of R(A, B, C , D) into S1 (A, D), S2 (A, C ) and S3 (B, C , D) with FDs A → B, B → C , and CD → A. 2. Same decomposition with FD B → AD Work out examples in Handout 3: 1. (Problem 1, part 6) Apply Chase test to decomposition of Inventory into Inventory1 and Inventory2. 2. (Problem 1, part 7) Modify one of the attributes in either Inventory1 or Inventory2 to obtain a lossless-join decomposition. Verify using the chase test. 3. (Problem 2, part 8(ii)) Apply Chase test to decomposition of Concerts into Concerts1 and Concerts2. 4. Apply Chase test to decomposition of Concerts into Concerts1 and Concerts3(City, Song, Album) and Concerts4(City, Year). T. M. Murali October 27, November 1, 2010 CS 4504: Design of Relational Database