Embed
Email

Relational Terminology

Document Sample

Shared by: peng xuebo
Categories
Tags
Stats
views:
0
posted:
10/20/2011
language:
English
pages:
51
Normal









Normal Forms



“Why be normal?”

- Author unknown









Copyright, Harris Corporation & Ophir Frieder, 1998 1

Objectives



• To define first, second, third, and Boyce-Codd normal forms.



• To discuss the motivation for normal forms, and the implications for

database design.









Copyright, Harris Corporation & Ophir Frieder, 1998 2

Normal Forms



• A relational scheme is said to be in first normal form (1NF) if and

only if each of it’s domains contains only scalar values.



– No “repeating groups” or unbounded lists.

– A field cannot itself be a table, as in Oracle8!





Question: Do CITY and STATE violate 1NF in relation scheme

R=(CITY,STATE,POPULATION)?









Copyright, Harris Corporation & Ophir Frieder, 1998 3

Motivation For 1NF



• Representation - with a repeating group, some method must be

devised for specifying the end or length of the list.

• Space Allocation - how is space allocated on a per record basis if

repeating groups are allowed?



SS# NAME HOBBIES

032446254 Mary Smith jogging, wrestling

045242453 Bob Jones cooking, cycling,

gardening

932415223 Sue Clark quilting,

hiking,

...









Copyright, Harris Corporation & Ophir Frieder, 1998 4

Motivation For 1NF



• Operations - without 1NF, all operations become more complex, and

this propagates throughout the database management system.

• Theory - 1NF simplifies the theoretical basis of the relational model

(e.g., proof of algorithmic correctness).



SS# NAME HOBBIES

032446254 Mary Smith jogging, wrestling

045242453 Bob Jones cooking, cycling,

gardening

932415223 Sue Clark quilting,

hiking,

...







Copyright, Harris Corporation & Ophir Frieder, 1998 5

Converting A Relational

Scheme To 1NF

• A repeating group is typically eliminated by “flattening” the table.

This makes most things simpler.



SS# NAME HOBBIES

032446254 Mary Smith jogging

032446254 Mary Smith wrestling

045242453 Bob Jones cooking

045242453 Bob Jones cycling

045242453 Bob Jones gardening

:







Copyright, Harris Corporation & Ophir Frieder, 1998 6

Single v.s. Multiple Keys



• For the sake of simplicity we will assume initially that each relational

scheme has exactly one key.



• Multiple keys do occur, but less frequently.



• In what follows, the formal definitions do not change in the case of

multiple keys.



• Given a relational scheme R, it will be helpful throughout the

following to divide the attributes up into two sets, those that are part of

the key, and those that are not.









Copyright, Harris Corporation & Ophir Frieder, 1998 7

Example #1:

Key And Non-Key Attributes

• Consider the following relational scheme for a department store chain

(e.g., Walmart):

– Attributes:

STORE_ID# - A store identification number.

CITY - The city in which the store is located.

STATE - The state in which the store is located.

ITEM - An item sold by the store.

PRICE - The price of the item.



– Functional Dependencies:

STORE_ID# => CITY

STORE_ID# => STATE

STORE_ID#, ITEM => PRICE





Copyright, Harris Corporation & Ophir Frieder, 1998 8

Example #1, Cont.





STORE_ID# CITY STATE ITEM PRICE

W001 Orlando FL Duck Tape 3.95

W001 Orlando FL Rope 3.75

W002 Savannah GA Paint 8.75

W002 Savannah GA Plywood 12.50

W002 Savannah GA Insulation 9.95









Copyright, Harris Corporation & Ophir Frieder, 1998 9

Example #1, Cont.



• The only key is: STORE_ID#,ITEM



Key Attributes Non-Key Attributes

STORE_ID# CITY

ITEM STATE

PRICE









Copyright, Harris Corporation & Ophir Frieder, 1998 10

Partial Dependency



• A functional dependency X=>A is a partial dependency if:

– X is a proper subset of the key attributes, and

– A is a non-key attribute.









Copyright, Harris Corporation & Ophir Frieder, 1998 11

Example #1, Cont.



Key Attributes Non-Key Attributes

STORE_ID# CITY

ITEM STATE

PRICE



• STORE_ID# => CITY is a partial dependency.



• Similarly, STORE_ID# => STATE is a partial dependency.









Copyright, Harris Corporation & Ophir Frieder, 1998 12

Normal Forms - 2NF



• A relational scheme is said to be in second normal form (2NF) if and

only if it is in 1NF and contains no partial dependencies.



Question: Why eliminate partial dependencies?









Copyright, Harris Corporation & Ophir Frieder, 1998 13

Example #1, Cont.



• Note the redundancy in the following legal relation:



STORE_ID# CITY STATE ITEM PRICE

W001 Orlando FL Duck Tape 3.95

W002 Charleston SC Duck Tape 3.75

W003 Savannah GA Paint 8.75

W003 Savannah GA Plywood 12.50

W003 Savannah GA Paint 9.95









Copyright, Harris Corporation & Ophir Frieder, 1998 14

Motivation for 2NF

Example #2

• Consider the following relational scheme for maintaining information

associated with students at a university:



– Attributes:

STUDENT_ID# - The social security number of a student.

NAME - The students last name.

COURSE_ID# - The ID # of a course the student is registered in.

DEPT_ID# - The ID # of the department that offers the course.



– Functional Dependencies:

STUDENT_ID# => NAME

COURSE_ID# => DEPT_ID#









Copyright, Harris Corporation & Ophir Frieder, 1998 15

Example #2, Cont.



STUDENT_ID# NAME COURSE_ID# DEPT_ID#

032-34-3455 Smith CS1001 8353

032-34-3455 Smith MS4025 9255

892-64-8372 Jones MS4025 9255

892-64-8372 Jones MS4145 9255

892-64-8372 Jones MS5023 9255









Copyright, Harris Corporation & Ophir Frieder, 1998 16

Example #2, Cont.



• The only key is: STUDENT_ID#, COURSE_ID#



Key Attributes Non-Key Attributes

STUDENT_ID# NAME

COURSE_ID# DEPT_ID#



• STUDENT_ID# => NAME is a partial dependency, so it is not in 2NF.



• Similarly, COURSE_ID# => DEPT_ID# is a partial dependency.









Copyright, Harris Corporation & Ophir Frieder, 1998 17

Example #2, Cont.



• Note the redundancy in the following legal relation:



STUDENT_ID# NAME COURSE_ID# DEPT_ID#

032-34-3455 Smith CS1001 8353

032-34-3455 Smith MS4025 9255

892-64-8372 Jones MS4025 9255

892-64-8372 Jones MS4145 9255

892-64-8372 Jones MS5023 9255









Copyright, Harris Corporation & Ophir Frieder, 1998 18

Anomalies Resulting From

Partial Dependencies



• Insertion Anomalies - A new student cannot be added unless they are

currently registered for at least one course.



• Deletion Anomalies - If a student drops their last, or only course, then

there is no record left of the student.



• Update Anomalies - Changing a students’ name requires all their

records to be updated. Similarly, for changing a course ID #, or for

assigning a course to a different department.



Note that the first two assume null values are not desirable, which is of

considerable debate in the database community.



Copyright, Harris Corporation & Ophir Frieder, 1998 19

Transitive Dependency



• A functional dependency X=>A is a transitive dependency if:

– X is a proper subset of the non-key attributes, and

– A is a non-key attribute.









Copyright, Harris Corporation & Ophir Frieder, 1998 20

Motivation For 3NF

Example #1



• Consider the following relational scheme for National Football League

(NFL) athletes:



– Attributes:

PLAYER_ID# - The social security number for an NFL athlete.

TEAM - The name of the team the athlete plays for.

STATE - The state in which the team is located.



– Functional Dependencies:

PLAYER_ID# => TEAM

TEAM => STATE









Copyright, Harris Corporation & Ophir Frieder, 1998 21

Example #1, Cont.



PLAYER_ID# TEAM STATE

024-34-3455 Bills NY

195-63-6775 Dolphins FL

892-64-8372 Vikings MN

934-84-7271 Vikings MN

783-83-9879 Vikings MN









Copyright, Harris Corporation & Ophir Frieder, 1998 22

Example #1, Cont.



Key Attributes Non-Key Attributes

PLAYER_ID# TEAM STATE







• TEAM => STATE is a transitive dependency.









Copyright, Harris Corporation & Ophir Frieder, 1998 23

Normal Forms - 3NF



• A relational scheme is said to be in third normal form (3NF) if and

only if it is in 2NF and contains no transitive dependencies.



Every non-key attribute depends on the key, the whole key and nothing

but the key.



Question: Why eliminate transitive dependencies?









Copyright, Harris Corporation & Ophir Frieder, 1998 24

Motivation For 3NF, Cont.



Key Attributes Non-Key Attributes

PLAYER_ID# TEAM

STATE



• There are no partial dependencies, so it is in 2NF.



• However, it is not in 3NF because of the transitive dependency

TEAM=>STATE.









Copyright, Harris Corporation & Ophir Frieder, 1998 25

Example #1, Cont.



• Note the redundancy in the following legal relation:



PLAYER_ID# TEAM STATE

024-34-3455 Bills NY

195-63-6775 Dolphins FL

892-64-8372 Vikings MN

934-84-7271 Vikings MN

783-83-9879 Vikings MN









Copyright, Harris Corporation & Ophir Frieder, 1998 26

Motivation for 3NF

Example #2

• Consider the following relational scheme for a university student

database:

– Attributes:

STUDENT_ID# - A students’ social security number.

CITY - The city of the students’ home address.

STATE - The state of the students’ home address.

ZIP - The zip code of the students’ home address.

– Functional Dependencies:

STUDENT_ID# => CITY

STUDENT_ID# => STATE

STUDENT_ID# => ZIP

ZIP => STATE

ZIP => CITY



Copyright, Harris Corporation & Ophir Frieder, 1998 27

Example #2, Cont.



STUDENT_ID# CITY STATE ZIP

024-34-3455 Melbourne FL 32934

195-63-6775 Melbourne FL 32934

892-64-8372 Orlando FL 32816

934-84-7271 Orlando FL 32816

783-83-9879 Oswego NY 13126









Copyright, Harris Corporation & Ophir Frieder, 1998 28

Example #2, Cont.



Key Attributes Non-Key Attributes

STUDENT_ID CITY

STATE

ZIP



• There are no partial dependencies, so it is in 2NF.



• ZIP => STATE and ZIP=>CITY are transitive dependencies, so it is

not in 3NF.







Copyright, Harris Corporation & Ophir Frieder, 1998 29

Example #2, Cont.



• Note the redundancy in the following legal relation:



STUDENT_ID# CITY STATE ZIP

024-34-3455 Melbourne FL 32934

195-63-6775 Melbourne FL 32934

892-64-8372 Orlando FL 32816

934-84-7271 Orlando FL 32816

783-83-9879 Oswego NY 13126









Copyright, Harris Corporation & Ophir Frieder, 1998 30

Motivation for 3NF

Example #3

• Consider a relational scheme for tracking software licenses:

– Attributes:

LICENSE_ID# - The license ID number for a piece of software.

MACHINE_ID# - The ID number of the machine on which the

software is installed.

EMPLOYEE_ID# - The social security number of the employee to

which the machine is assigned.

LOCATION - The location of the employee’s office.



– Functional Dependencies:

LICENSE_ID# => MACHINE_ID#

MACHINE_ID# => EMPLOYEE_ID#

EMPLOYEE_ID# => LOCATION





Copyright, Harris Corporation & Ophir Frieder, 1998 31

Example #3, Cont.



LICENSE_ID# MACHINE_ID# EMPLOYEE_ID# LOCATION

3243452 3357 023-45-7886 B1-234

4432451 5342 094-34-0033 B3-9867

4452312 4256 143-95-7453 B2-6774

5343231 4256 143-95-7453 B2-6774

5534242 1124 143-95-7453 B2-6774









Copyright, Harris Corporation & Ophir Frieder, 1998 32

Example #3, Cont.



Key Attributes Non-Key Attributes

LICENSE_ID# MACHINE_ID#

EMPLOYEE_ID#

LOCATION



• There are no partial dependencies, so the relational scheme is in 2NF.



• MACHINE_ID# => EMPLOYEE_ID# and EMPLOYEE_ID# =>

LOCATION are both transitive dependencies, so it is not in 3NF.







Copyright, Harris Corporation & Ophir Frieder, 1998 33

Example #3, Cont.



• Note the redundancy in the following legal relation:



LICENSE_ID# MACHINE_ID# EMPLOYEE_ID# LOCATION

3243452 3357 023-45-7886 B1-234

4432451 5342 094-34-0033 B3-9867

4452312 4256 143-95-7453 B2-6774

5343231 4256 143-95-7453 B2-6774

6634324 1124 143-95-7453 B2-6774

7534242 1124 143-95-7453 B2-6774









Copyright, Harris Corporation & Ophir Frieder, 1998 34

Anomalies Resulting From

Transitive Dependencies

• Insertion Anomalies - A license cannot be added until it is installed

on a machine, and until that machine is assigned to an employee.



• Deletion Anomalies - Deleting all of the records for a particular

employee would delete any record of the machines or licenses assigned

to that employee.



• Update Anomalies - Changing the employee assigned to a particular

machine requires multiple record updates. Similarly for changing an

employees’ location.



As with partial dependencies, the first two assume null values are not

desirable.



Copyright, Harris Corporation & Ophir Frieder, 1998 35

Normal Forms - BCNF



• A relation is said to be in Boyce/Codd normal form (BCNF) if every

attribute depends on the key, the whole key, and nothing but the key.









Copyright, Harris Corporation & Ophir Frieder, 1998 36

BCNF

Example #1

• Consider a relational scheme for tracking employee salary

adjustments:

– Attributes:

EMPLOYEE_ID# - An employee identification number

DATE - A date on which the employee’s salary was adjusted

AMOUNT - The amount of the salary adjustment

EXPLANATION - An explanation for the adjustment



– Functional Dependencies:

EMPLOYEE_ID#,DATE => AMOUNT

EMPLOYEE_ID#,DATE => EXPLANATION









Copyright, Harris Corporation & Ophir Frieder, 1998 37

Example #3, Cont.



Key Attributes Non-Key Attributes

EMPLOYEE_ID# AMOUNT

DATE EXPLANATION



• There are no partial dependencies, so the relational scheme is in 2NF.



• There are no transitive dependencies, so the relational scheme is in

3NF.

• Each of the non-key attributes depends on both of the key attributes

(the key, the whole key, and nothing but the key), so the relational

scheme is in BCNF.





Copyright, Harris Corporation & Ophir Frieder, 1998 38

Normal Forms - BCNF, Cont.



• Note that the definition of BCNF does not reference that for 3NF.



• This raises a couple of questions:

– If a relational scheme is in BCNF, is it also in 3NF?

– If a relational scheme is in 3NF, is it also in BCNF?



• The answer to the first is yes (proof is left as an exercise).



• The answer to the second question depends...









Copyright, Harris Corporation & Ophir Frieder, 1998 39

Normal Forms - BCNF, Cont.



• Relational Scheme Has Only One Key:

– A relational scheme is in 3NF if and only if it is in BCNF.



• Relational Scheme Has Multiple Keys:

– If the relational scheme is in BCNF, then it is in 3NF (already stated).

– If the relational scheme is in 3NF, however, it is not necessarily in BCNF.



• If the case of multiple keys, 1NF, 2NF, and 3NF definitions are still

the same.









Copyright, Harris Corporation & Ophir Frieder, 1998 40

Motivation For BCNF

Example #1

• Consider the following relational scheme:

– Attributes:

STUDENT_ID# - A student ID number.

COURSE_ID# - The ID# of a course being taken by the student.

FACULTY_ID# - The ID# of the faculty member who

teaches the course taken by the student.



– Functional Dependencies:

STUDENT_ID#,COURSE_ID# => FACULTY_ID#

FACULTY_ID# => COURSE_ID#









Copyright, Harris Corporation & Ophir Frieder, 1998 41

Example #1, Cont.



STUDENT_ID# COURSE_ID# FACULTY_ID#

023-45-7886 CSE1001 998-32-4124

094-34-0033 CSE4257 087-32-8797

143-95-7453 CSE2130 774-92-3889

034-42-8009 CSE2130 774-92-3889









Copyright, Harris Corporation & Ophir Frieder, 1998 42

Example #1, Cont.



• There are two keys:

STUDENT_ID#,COURSE_ID#

STUDENT_ID#,FACULTY_ID#





• The relation is in 1NF, 2NF, and 3NF (why?)



• The relation is not in BCNF because of the dependency:

FACULTY_ID# => COURSE_ID#









Copyright, Harris Corporation & Ophir Frieder, 1998 43

Example #1, Cont.



• Note the redundancy in the following legal relation:





STUDENT_ID# COURSE_ID# FACULTY_ID#

023-45-7886 CSE1001 998-32-4124

094-34-0033 CSE4257 087-32-8797

143-95-7453 CSE2130 774-92-3889

034-42-8009 CSE2130 774-92-3889









Copyright, Harris Corporation & Ophir Frieder, 1998 44

Motivation For BCNF

Example #2

• Consider the following relational scheme:

– Attributes:

LICENSE_ID# - A Florida state driver’s license number.

SS# - The social security number of the person holding the

license.

CODE - A traffic violation code.

QTY - The number of times the person has been issued the

violation within the past year.



– Functional Dependencies:

LICENSE_ID# => SS# LICENSE_ID#,CODE => QTY

SS# => LICENSE_ID# SS#,CODE => QTY









Copyright, Harris Corporation & Ophir Frieder, 1998 45

Example #2, Cont.





LICENSE_ID# SS# CODE QTY

B424-341-52-415 023-45-7886 Speeding 3

C893-402-52-435 094-34-0033 Tail-Gating 4

M345-310-61-875 143-95-7453 Speeding 1

M345-310-61-875 143-95-7453 Double-Parking 6









Copyright, Harris Corporation & Ophir Frieder, 1998 46

Example #2, Cont.



• There are two keys:

LICENSE_ID#,CODE

SS#,CODE



• The relation is in 1NF, 2NF, and 3NF (why?)



• The relation is not in BCNF because of the dependencies:

LICENSE_ID# => SS#

SS# => LICENSE_ID#









Copyright, Harris Corporation & Ophir Frieder, 1998 47

Example #2, Cont.



• Note the redundancy in the following legal relation:





LICENSE_ID# SS# CODE QTY

B424-341-52-415 023-45-7886 Speeding 3

C893-402-52-435 094-34-0033 Tail-Gating 4

M345-310-61-875 143-95-7453 Speeding 1

M345-310-61-875 143-95-7453 Double-Parking 6









Copyright, Harris Corporation & Ophir Frieder, 1998 48

Update Anomolies



• Insertion Anomalies - The fact that a license ID# has been assigned to

a particular person cannot be recorded unless they have at least one

violation.



• Deletion Anomalies - Deleting all of the violations for a particular

driver would delete any record of the license ID# for that person.



• Update Anomalies - Changing a driver’s name requires changing all

the records for each type of violation the driver has committed.



As with partial dependencies, the first two assume null values are not

desirable.





Copyright, Harris Corporation & Ophir Frieder, 1998 49

Normal Forms Summary



• A relational scheme is said to be in first normal form (1NF) if and

only if each of it’s domains contains only scalar values.



• A relational scheme is said to be in second normal form (2NF) if and

only if it is in 1NF and contains no partial dependencies.



• A relational scheme is said to be in third normal form (3NF) if and

only if it is in 2NF and contains no transitive dependencies.



• A relational scheme is said to be in Boyce/Codd normal form

(BCNF) if and only if the only nontrivial dependencies for the

relational scheme are those in which a key functionally determines one

or more attributes (“every attribute depends on the key, the whole key,

and nothing but the key”).

Copyright, Harris Corporation & Ophir Frieder, 1998 50

Normal Forms - 4NF & 5NF



Currently Beyond The Scope Of This Course:



• A relational scheme R is said to be in fourth normal form (4NF) if

and only if whenever there is a multivalued dependency X=>>Y,

where Y is not empty or a subset of X, and XY does not include all the

attributes of R, then X is a superkey of R.



• A relational scheme R is said to be in fifth normal form (5NF) - also

called projection-join normal form (PJ/NF) - if and only if every join

dependency in R is implied by the candidate keys of R.









Copyright, Harris Corporation & Ophir Frieder, 1998 51



Related docs
Other docs by peng xuebo
By registering with docstoc.com you agree to our
privacy policy

You are almost ready to download!

You are almost ready to download!