By Albert Lin
Process of efficiently organizing data
in a database.
Eliminate redundant data
Ensure data dependency sensibility
Faster sorting and index creation
A larger number of clustered indexes
Narrow and compact indexes
Fewer indexes per table.
Fewer null values
Brief Description of Normal Forms
1st Normal Form
No repeating groups
2nd Normal Form
Each column depends entirely on the primary key.
3rd Normal Form
Each column depends directly on the primary key.
4th Normal Form
Isolate independent multiple relationships.
5th Normal Form
Isolate semantically related multiple relationships.
Boyce-Codd Normal Form
Non-trivial dependencies between candidate keys.
Optimal Normal Form
Limited to simple facts
Domain-Key Normal Form
No modification anomalies
1st Normal Form
Addresses the structure of an isolated table.
2nd, 3rd and Boyce Codd Normal Form
Addresses 1-1 and 1-many relationships.
4th and 5th Normal Form
Address many-to-many relationships.
These forms are cumulative.
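As a minimal sketch of the 1NF rule above (illustrative table and column names, using Python's built-in sqlite3): a repeating group of phone columns is replaced by one row per student/phone pair.

```python
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()

# Violates 1NF: repeating group of phone columns.
cur.execute("CREATE TABLE student_bad (sid TEXT PRIMARY KEY, phone1 TEXT, phone2 TEXT)")

# 1NF: one row per student/phone pair in a separate table.
cur.execute("CREATE TABLE student (sid TEXT PRIMARY KEY, name TEXT)")
cur.execute("CREATE TABLE student_phone (sid TEXT, phone TEXT, PRIMARY KEY (sid, phone))")

cur.execute("INSERT INTO student VALUES ('0015', 'Ana')")
cur.executemany("INSERT INTO student_phone VALUES (?, ?)",
                [("0015", "555-0100"), ("0015", "555-0101")])

# Any number of phones per student, no empty columns, no repeating group.
phones = [p for (p,) in cur.execute(
    "SELECT phone FROM student_phone WHERE sid = '0015' ORDER BY phone")]
print(phones)  # ['555-0100', '555-0101']
```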
Update anomaly: duplicated data means a change requires modification of multiple rows, which can lead to inconsistencies.
Insertion anomaly: recording some facts is impossible due to entity integrity (no complete primary key yet exists).
Deletion anomaly: removing a row leads to undesired loss of other data stored with it.
Databases that are sufficiently normalized
reduce the chance of these anomalies occurring.
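The update anomaly can be shown concretely (hypothetical table, Python's sqlite3): the professor's name is stored once per course row, so a change applied to only one row leaves the data inconsistent.

```python
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()

# Denormalized: professor name repeated in every course row.
cur.execute("CREATE TABLE schedule (class TEXT, prof_id TEXT, prof_name TEXT)")
cur.executemany("INSERT INTO schedule VALUES (?, ?, ?)",
                [("CS166", "34215", "Smith"),
                 ("CS046", "34215", "Smith")])

# Update anomaly: the name change is applied to only one row...
cur.execute("UPDATE schedule SET prof_name = 'Smyth' "
            "WHERE class = 'CS166' AND prof_id = '34215'")

# ...leaving two different names on record for the same professor.
names = {n for (n,) in cur.execute(
    "SELECT prof_name FROM schedule WHERE prof_id = '34215'")}
print(sorted(names))  # ['Smith', 'Smyth']
```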
Make a table for each list.
Use non-meaningful primary keys
Eliminate Repeating Groups
Eliminate Columns not dependent on the key.
Each table should have an identifier
Should only store data for a single
type of entity.
Avoid null columns
Records are free; new fields are expensive.
Know when data requires duplication
Use referential integrity
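The referential-integrity guideline can be sketched as follows (illustrative schema, Python's sqlite3; note SQLite only enforces foreign keys when the pragma is enabled): a course row may only reference a professor that exists.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
cur = con.cursor()

cur.execute("CREATE TABLE professor (prof_id TEXT PRIMARY KEY, name TEXT)")
cur.execute("""CREATE TABLE course (
    class TEXT PRIMARY KEY,
    prof_id TEXT REFERENCES professor(prof_id))""")

cur.execute("INSERT INTO professor VALUES ('34215', 'Smith')")
cur.execute("INSERT INTO course VALUES ('CS166', '34215')")  # ok: parent exists

# A course referencing a professor that does not exist is rejected.
try:
    cur.execute("INSERT INTO course VALUES ('CS999', '99999')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
print(rejected)  # True
```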
A higher normal form generally results in faster updates and less redundant storage, though retrieval spanning many tables may require more joins.
Questions to ask
What data do you need?
What are you going to do with the data?
How are the data related?
What is the future of the data?
Examples of Bad Databases
SID Class Time Location Prof_ID
0015 CS166 10:30 MH226 34215
1205 CS146 12:00 MH225 25123
0025 CS166 1:30 MH222 34215
0015 CS151 8:00 MH222 24215
2531 CS156 9:00 MH422 25124
1205 CS120 10:30 DH251 12412
2522 CS046 8:00 MH224 34215
1523 CS140 5:00 MH422 25123
0015 CS140 3:00 MH422 34215
Problems with the database
In a large-scale database, this information
would be repeated numerous times,
resulting in redundant data.
Retrieval of data would be difficult
Index creation would be difficult.
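One way to normalize the table above (a sketch; the values follow the example, while the choice of a separate section table keyed by class and time is an assumption): store each professor and section fact once, and recover the original wide rows with a join.

```python
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()

# Each fact stored once instead of repeated per enrollment row.
cur.execute("CREATE TABLE professor (prof_id TEXT PRIMARY KEY)")
cur.execute("""CREATE TABLE section (
    class TEXT, time TEXT, location TEXT, prof_id TEXT,
    PRIMARY KEY (class, time))""")
cur.execute("""CREATE TABLE enrollment (
    sid TEXT, class TEXT, time TEXT,
    PRIMARY KEY (sid, class, time))""")

cur.execute("INSERT INTO professor VALUES ('34215')")
cur.executemany("INSERT INTO section VALUES (?, ?, ?, ?)",
                [("CS166", "10:30", "MH226", "34215"),
                 ("CS166", "1:30", "MH222", "34215")])
cur.executemany("INSERT INTO enrollment VALUES (?, ?, ?)",
                [("0015", "CS166", "10:30"), ("0025", "CS166", "1:30")])

# The original wide row is recovered with a join.
rows = list(cur.execute("""
    SELECT e.sid, s.class, s.time, s.location, s.prof_id
    FROM enrollment e JOIN section s
      ON e.class = s.class AND e.time = s.time
    ORDER BY e.sid"""))
print(rows)
```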
Many databases are “de-normalized” to
address performance issues:
a de-normalized design may require fewer joins and result in
faster queries.
However, before doing “de-normalization”,
a performance issue must actually exist, and de-
normalization must dramatically improve it,
to justify introducing a suboptimal design.
A de-normalized table can be harder to maintain.
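A sketch of that trade-off (illustrative names, Python's sqlite3): a de-normalized copy answers the read without a join, at the cost of keeping a second copy of the professor's name in sync.

```python
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()

# Normalized: reading a course's professor name requires a join.
cur.execute("CREATE TABLE professor (prof_id TEXT PRIMARY KEY, name TEXT)")
cur.execute("CREATE TABLE course (class TEXT PRIMARY KEY, prof_id TEXT)")
cur.execute("INSERT INTO professor VALUES ('34215', 'Smith')")
cur.execute("INSERT INTO course VALUES ('CS166', '34215')")

# De-normalized copy for fast reads: the name is duplicated into the
# course row, so it must be refreshed whenever professor.name changes.
cur.execute("""CREATE TABLE course_flat AS
    SELECT c.class, c.prof_id, p.name AS prof_name
    FROM course c JOIN professor p ON c.prof_id = p.prof_id""")

(name,) = cur.execute(
    "SELECT prof_name FROM course_flat WHERE class = 'CS166'").fetchone()
print(name)  # Smith
```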
Normalization helps organize a database
efficiently,
which can help a company produce a
database system that is quick and reliable.
The higher the normal form, the less
chance anomalies will arise.
In case of performance issues, de-
normalization can be done in order to
increase performance.
Strive for Single Themed Tables.
Litt, Steve. “Normalization”. 1996.
“Rules of Data Normalization”. 2005.
Chung, Luke. “Database Normalization Tips”. FMS. 2001.
Janert, Phillip. “Practical Database”. IEEE. 2004.
Reus, Bernhard. “Databases”. University of Sussex. 2004.
Wyllys, R. E. “Steps in Normalization”. University of Texas at Austin.