Docstoc

Normalization

Document Sample
Normalization Powered By Docstoc
					Normalization

       Are we Normal?
Normalization
   Normalization is the process of
    converting complex data structures into
    simple, stable data structures
   It also is the process of removing from
    a database certain “anomalies”
Anomalies
   Update anomalies—you have to update a
    record in a number of different places
   Insertion anomalies—Example: in order to
    insert a new employee a project must be
    assigned. If there is no project yet a phantom
    one must be created.
   Deletion anomalies—Two types: when you
    delete a record other vital information is lost,
    or must delete in several places with the
    possibility of leaving unattached data islands
Normal Forms
   There are many Normal Forms—or
    stages of normalization possible, but we
    will only focus on the first three.
 First Normal Form
1.   There are no duplicated rows in the
     table.
2.   Each cell is single-valued (i.e., there
     are no repeating groups or arrays).
3.   Entries in a column (attribute, field)
     are of the same kind.
    Example Table 1
    CDS Table 1
CDID CDTITLE               TrackTitle         Artist   Artist
                                                       Country
1      Sergeant Pepper     Sergeant Pepper, Beatles    UK
                           Lucy in the Sky,
                           With a little Help
2      Blood on the Tracks Tangled up in      Dylan    US
                           Blue, Idiot Wind
       Another Example Table

CDID    CDTitle        Track1          Track2        Track3


1       Sergeant       Sergeant        Lucy in the   With a little
        Pepper         Peppers         sky with      help
                       lonely hearts   diamonds
                       club band
2       Blood on the   Tangled up      Idiot Wind    Bucket of
        Tracks         in Blue                       Rain
Normalizing
   The sample tables have repeating groups—ie
    the tracks associated with each CD.
   Each column must contain only a single value
   You also don’t want to find yourself
    numbering columns like track1, etc.
   The next table puts the sample table into first
    normal form
    First Normal Form Sample
CDID CDTITLE             TrackTitle           Artist    Artist
                                                        Country
1    Sergeant Pepper     Sergeant Pepper      Beatles   UK

1    Sergeant Pepper     Lucy in the Sky      Beatles   UK

1    Sergeant Pepper     With a little Help   Beatles   UK

2    Blood on the Tracks Tangled up in        Dylan     US
                         Blue
2    Blood on the Tracks Idiot Wind           Dylan     US
Second Normal Form
•   A table is in 2NF if it is in 1NF and if
    all non-key attributes are dependent
    on the candidate or primary key and
    nothing else.
•   This is called functional dependency
Normalizing. . .
   In our sample table there are really two
    separate things going on
   One is the CD information and one is
    the track information-
   To get all track information creates a lot
    of redundancy in the CD information
   Each should be dependent on their own
    key
        Second Normal Form Sample
    CDID      CDTitle
    1         Sergeant Pepper
    2         Blood on the Tracks


TrackID    TrackTitle           CDID   Artist    Artist Country
1          Sergeant Pepper      1      Beatles   UK

2          Lucy in the Sky      1      Beatles   UK

3          With a little help   1      Beatles   UK

4          Tangled up in Blue   2      Dylan     US

5          Idiot Wind           2      Dylan     US
Third Normal Form

 •   A table is in 3NF if it is in 2NF and
     if it has no transitive dependencies.
 •   This means that the non-primary
     key attributes don’t depend on
     each other.
 •   Look at our second sample table:
    Sample Table
TrackID   TrackTitle           CDID   Artist    Artist Country
1         Sergeant Pepper      1      Beatles   UK

2         Lucy in the Sky      1      Beatles   UK

3         With a little help   1      Beatles   UK

4         Tangled up in Blue   2      Dylan     US

5         Idiot Wind           2      Dylan     US
Normalizing
   There is a transitive dependency here
   Artist Country is dependent on Artist,
    not on TrackID which is the key field of
    the table
   The following tables resolve this:
Better
TrackID TrackTitle               CDID        ArtistID
1         Sergeant Pepper        1           1
2         Lucy in the Sky        1           1
3         With a little Help     1           1
4         Tangled up in Blue     2           2
5         Idiot wind             2           2

      ArtistID         Artist           ArtistCountry
      1                Beatles          UK
      2                Dylan            US
Summary
   Through the process of normalization
    our original table has become three
    tables, related by foreign keys:


    CDs(CDID, CDTitle)
    ARTISTS(ArtistID, Artist, ArtistCountry)
    TRACKS(TrackID, TrackTitle, CDID, ArtistID)
MORE…
1.   Boyce Codd Normal Form
2.   Fourth Normal Form
3.   Fifth Normal Form

				
DOCUMENT INFO
Shared By:
Categories:
Tags:
Stats:
views:4
posted:1/26/2012
language:
pages:18