Documents
Resources
Learning Center
Upload
Plans & pricing Sign in
Sign Out

9Ch07Log

VIEWS: 3 PAGES: 27

									           Unit 7
Logical Database Design (補)
                                                      Contents
                                        7.1 Introduction

                                        7.2 Functional Dependency

                                        7.3 Normal Forms
                                               First (1NF)
                                               Second (2NF)
                                               Third (3NF)




Wei-Pang Yang, Information Management, NDHU                          7-2
7.1 Introduction
   Logical Database Design vs.
    Physical Database Design
   Problem of Normalization
   Normal Forms


                                  7-3
                                Logical Database Design
                     Logical Database Design vs. Physical Database Design
                            • Logical Database Design: How to organize relations/ tables?
                            • Physical Database Design: How to store data in disk storage?

                     Logical Database Design
                            • Normalization
                            • Semantic Modeling/ ER Model/ …
                     Problem of Normalization/ Logical Database Design
                            • Given some body of data to be represented in a database,
                            • how to decide the suitable logical structure they should have?
                                    • what relations should exist?
                                    • what attributes should they have?
Wei-Pang Yang, Information Management, NDHU                                                    7-4
                                    Problem of Normalization
<e.g.>                                       S1, Smith, 20, London, P1, Nut, Red, 12, London, 300
                                             S1, Smith, 20, London, P2, Bolt, Green, 17, Paris, 200
                                                                                   .
                                                              .
                                             S4, Clark, 20, London, P5, Cam, Blue, 12, Paris, 400

                                                                                             Normalization
                           S                                           P                                 SP
                            S# SNAME STATUS CITY                       P#        ...          ...           ...             S#      P#        QTY
                            s1       .           .       London        .                .               .         .         .        .         .
                            .        .           .       .             .                .               .         .         .        .         .



                           S'                                     P                                                   SP'
                            S# SNAME             STATUS           P#       ...         ...     ...                    S#    CITY         P#    QTY
         or                                                                                                           S1 London          P1    300
                            S1      Smith            .
                            S2           .           .            .         .          .            .                 S1 London          P2    200
                                .        .           .            .         .          .            .                 .         .        .         .

                                               Redundancy                              Update Anomalies! (異常)
Wei-Pang Yang, Information Management, NDHU                                                                                                            7-5
                                                 Normal Forms
                        A relation is said to be in a particular normal form if it satisfies
                             a certain set of constraints.
                       <e.g.> 1NF: A relation is in First Normal Form (1NF) iff it
                                   contains only atomic values.

                                   universe of relations (normalized and un-normalized)
                                          1NF relations (normalized relations)

                                              2NF relations
                                                3NF relations
                                                  BCNF relations
                                                   4NF relations
                                                     5NF relations




Wei-Pang Yang, Information Management, NDHU          Fig. 7.1: Normal Forms                     7-6
7.2 Functional Dependency
      Functional Dependency (FD)
      Fully Functional Dependency (FFD)




                                           7-7
                                      Functional Dependency
         Functional Dependency
           • Def: Given a relation R, R.Y is functionally dependent on R.X iff
                    each X-value has associated with it precisely one Y-value (at any
                    time).
                • Note: X, Y may be the composite attributes.               R
         Notation:                                    .       .                X   Y
                                                       .       .
                        R.X              R.Y
                                                       .

                         read as "R.X functionally determines R.Y"




Wei-Pang Yang, Information Management, NDHU                                             7-8
                                Functional Dependency (cont.)
             <e.g.1>
                                   S
                                   S.S#        S.SNAME            S   S#   SNAME STATUS CITY
                                                                      S1   Smith   20   London
                                   S.S#        S.STATUS               S2   Jones  10     Paris
                                   S.S#        S.CITY                 S3   Blake   30    Paris
                                                                      S4   Clark  20    London
                                   S.STATUS    S.CITY                 S5   Adams  30    Athens

                                                                  Note: Assume STATUS is some factor of Supplier
                     FD Diagram:                                        and no any relationship with CITY.



                                          S#              STATUS



                                       SNAME               CITY



Wei-Pang Yang, Information Management, NDHU                                                              7-9
                                Functional Dependency (cont.)
                        <e.g.2> P
                                                         PNAME          P     P#   PNAME   COLOR    WEIGHT    CITY
                                                                              P1   Nut     Red        12     London
                                                                              P2   Bolt    Green      17     Paris
                                                        COLOR                 P3   Screw   Blue       17     Rome
                                     P#                                       P4   Screw   Red        14     London
                                                                              P5   Cam     Blue      12      Paris
                                                        WEIGHT                P6   Cog     Red       19      London

                                                          CITY
                        <e.g.3> SP
                                                                                            SP S#       P# QTY
                                                                                                   S1   P1 300
                                                                                                   S1   P2 200
                                              S#                                                   S1   P3 400
                                                                 QTY                               S1   P4 200
                                                                                                   S1   P5 100
                                              P#                                                   S1   P6 100
                                                                                                   S2   P1 300
                                                                                                   S2   P2 400
                                                                                                   S3   P2 200
                          If X is a candidate key of R, then all attributes Y                     S4   P2 200
                                                                                                   S4   P4 300
                              of R are functionally dependent on X. (i.e. X         Y)             S4   P5 400


Wei-Pang Yang, Information Management, NDHU                                                                      7-10
                               Functional Dependency (cont.)
        <Note>
                       1. FD is a semantic notion.
                          <e.g.> S#            CITY
                                     Means: each supplier is located in precisely one city.
                       2. FD is a special kind of integrity constraint.
                            CREATE INTEGRITY RULE SCFD
                                   CHECK FORALL SX FORALL SY
                                       (IF SX.S# = SY.S# THEN SX.CITY = SY.CITY);

                       3. FDs considered here applied within a single relation.
                          <e.g.> SP.S#             S.S# is not considered!




Wei-Pang Yang, Information Management, NDHU                                                   7-11
7.3 First, Second, and Third Normal
         Forms (1NF, 2NF, 3NF)




                                      7-12
                                              Normal Forms: 1NF
         Def: A relation is in 1NF iff all underlying simple domains contain atomic
             values only.

                                                                               FIRST
                                                                                S#   STATUS   CITY     P#   QTY
                                                                               S1      20     London   P1   300
    S# STATUS               CITY              (P#, QTY)                        S1      20     London   P2   200
                                                                               S1      20     London   P3   400
                                                                               S1      20     London   P4   200
     S1       20          London      {(P1, 300), (P2, 200), ..., (P6, 100)}   S1      20     London   P5   100
     S2       10          Paris       {(P1, 300), (P2, 400)}                   S1      20     London   P6   100
     S3       10          Paris       {(P2, 200)}                              S2      10     Paris    P1   300
     S4       20          London      {(P2, 200), (P4, 300), (P5, 400)}        S2      10     Paris    P2   400
                                                                               S3      10     Paris    P2   200
                                                                               S4      20     London   P2   200
                                                                               S4      20     London   P4   300
          Suppose 1. CITY is the main office of the supplier.                  S4      20     London   P5   400
                  2. STATUS is some factor of CITY
                                                                               Key:(S#,P#),
                                                                               Normalized 1NF


Wei-Pang Yang, Information Management, NDHU                                                                       7-13
                    1NF Problem: Update Anomalies!
         <1> Update
                                                                      FIRST
             If suppler S1 moves from London to Paris, then 6          S#   STATUS   CITY     P#    QTY
             tuples must be updated!
                                                                      S1      20     London   P1    300
         <2> Insertion                                                S1      20     London   P2    200
                                                                      S1      20     London   P3    400
             Cannot insert a supplier information if it doesn't       S1      20     London   P4    200
             supply any part, because that will cause a null key      S1      20     London   P5    100
                                                                      S1      20     London   P6    100
             value.                                                   S2      10     Paris    P1    300
                       FIRST                                          S2      10     Paris    P2    400
                                                                      S3       10    Paris    P2    200
                        S# STATUS CITY         P#  QTY                S4      20     London   P2    200
                                                                      S4      20     London   P4    300
                         .   .      .           .    .                S4      20     London   P5    400
                        S3   20    Paris        P2  300
                         .   .      .           .    .                Key:(S#,P#),
                         .   .      .           .    .                Normalized 1NF
                        S5   30    Athens     NULL NULL
         <3> Deletion
                 Delete the information that "S3 supplies P2", then
                 the fact "S3 is located in Paris" is also deleted.


Wei-Pang Yang, Information Management, NDHU                                                        7-14
             1NF Problem: Update Anomalies! (cont.)
                                                                               FIRST
          <e.g.> Suppose 1. CITY is the main office of the supplier.             S#   STATUS       CITY     P#   QTY
                                   2. STATUS is some factor of CITY              S1     20         London   P1   300
                                                                                 S1     20         London   P2   200
                      Primary key of FIRST: (S#, P#)                             S1     20         London   P3   400
                                                                                 S1     20         London   P4   200
                                                       FD                        S1     20         London   P5   100
              FD diagram of FIRST:                                               S1     20         London   P6   100
                                                      x FD                       S2     10         Paris    P1   300
                                                                STATUS           S2     10         Paris    P2   400
                                 FF           S#                                 S3     10         Paris    P2   200
                 QTY                                     x FD                    S4     20         London   P2   200
                                 D                                               S4     20         London   P4   300
                                              P#                 CITY            S4     20         London   P5   400
                                                        FD
                                                                                Key:(S#,P#),
                                                                                Normalized 1NF
                            FD:
                            1. S#     STATUS
                            2. S#     CITY
                                                                      primary key (S#, P#)    FD
                                                                                                    STATUS
                            3. CITY     STATUS
                            4. (S#, P#)   QTY                         primary key (S#, P#)   FD
                                                                                                    CITY



Wei-Pang Yang, Information Management, NDHU                                                                      7-15
                                               Normal Form: 2NF
         Def: A relation R is in 2NF iff
                  (1) R is in 1NF (i.e. atomic )
                  (2) Non-key attributes are FD on primary key. (e.g. QTY, STATUS, CITY in FIRST)
             <e.g.> FIRST is in 1NF, but not in 2NF                                                   FD

                 (S#, P#)                STATUS, and                                                 x FD
                                FD                                                                            STATUS
                 (S#, P#)                CITY                                    FFD        S#
                                FD                                       QTY                           x FD
                                              Decompose FIRST into:                         P#                    CITY
                                                                                                       FD

               <1> SECOND (S#, STATUS, CITY):                                <2> SP (S#, P#, QTY):
                        primary key: S#                                          Primary key: (S#, p#)
                                              STATUS      FD:
                                                                                       S#
                                                          1. S#   STATUS                                    QTY
                    S#
                                                                                       P#
                                                CITY      2. S#   CITY
                                                          3. CITY   STATUS         FD: 4. (S#, P#)    QTY

Wei-Pang Yang, Information Management, NDHU                                                                          7-16
                                      Normal Form: 2NF (cont.)
                FIRST
                  S#      STATUS        CITY     P#   QTY         SECOND (in 2NF)
                  S1        20          London   P1   300          S# STATUS CITY
                  S1        20          London   P2   200          S1        20        London
                  S1        20          London   P3   400          S2        10        Paris
                  S1        20          London   P4   200          S3        10        Paris
                  S1        20          London   P5   100          S4        20        London
                  S1        20          London   P6   100          S5        30        Athens
                  S2        10          Paris    P1   300
                  S2        10          Paris    P2   400
                  S3        10          Paris    P2   200
                  S4        20          London   P2   200
                  S4        20          London   P4   300          SP (in 2NF)
                  S4        20          London   P5   400
                                                                        S#        P#     QTY)
                                                                        S1        P1     300
                 <1> Update: S1 moves from London to Paris              S1        P2     200
                                                                        S1        P3     400
                                                                        S1        P4     200
                 <2> Insertion: (S5 30 Athens)                          S1        P5     100
                                                                        S2        P1     300
                 <3> Deletion                                           S2        P2     400
                                                                        S3        P2     200
                     Delete "S3 supplies P2 200", then the fact         S4        P4     300
                                                                        S4        P5     400
                     "S3 is located in Paris" is also deleted.

Wei-Pang Yang, Information Management, NDHU                                                     7-17
                                      Normal Form: 2NF (cont.)
         A relation in 1NF can always be reduced to an equivalent collection of 2NF relations.
         The reduction process from 1NF to 2NF is non-loss decomposition.
                    FIRST(S#, STATUS, SCITY, P#, QTY)
                                                                    1st
                 projections                  natural joins

                                                              2nd         SP
        SECOND(S#, STATUS, CITY), SP(S#,P#,QTY)

         The collection of 2NF relations may contain “more” information than the equivalent
             1NF relation.
            <e.g.>       (S5, 30, Athens)




Wei-Pang Yang, Information Management, NDHU                                                    7-18
              Problem: Update Anomalies in SECOND!
                 • Update Anomalies in SECOND
                         <1> UPDATE: if the status of London is changed from 20
                             to 60, then two tuples must be updated
                         <2> DELETE: delete supplier S5, then the fact "the
                            status of Athens is 30" is also deleted!                     SECOND (in 2NF)
                                                                                          S# STATUS CITY
                         <3>INSERT: cannot insert the fact "the status of Rome
                                                                                          S1   20   London
                            is 50"!                                                       S2   10   Paris
                                                                                          S3   10   Paris
                 • Why:                                                                   S4
                                                                                          S5
                                                                                               20
                                                                                               30
                                                                                                    London
                                                                                                    Athens

                          S.S#                S.STATUS
                          S.S#                S. CITY
                          S.CITY              S.STATUS cause a transitive dependency
                                                                                         FD:
                                                                                STATUS   1. S#   STATUS
                                                                 S#                      2. S#   CITY
                                                                                  CITY   3. CITY   STATUS

Wei-Pang Yang, Information Management, NDHU                                                                  7-19
                                              Normal Forms: 3NF
         Def : A relation R is in 3NF iff
                        (1) R is in 2NF
                        (2) Every non-key attribute is non-transitively dependent on the primary key.
                             e.g. STATUS is transitively on S#
                            (i.e., non-key attributes are mutually independent)
                  <e.g.> SP is in 3NF, but SECOND is not!


                                                      SECOND (not 3NF)
                        SP FD diagram                                          SECOND FD diagram
                                                        S# STATUS CITY
                                                        S1   20   London
                                                                                             STATUS
                             S#                         S2   10   Paris
                                               QTY      S3   10   Paris       S#
                             P#                         S4   20   London
                                                        S5   30   Athens                      CITY




Wei-Pang Yang, Information Management, NDHU                                                             7-20
                                     Normal Forms: 3NF (cont.)
          Decompose SECOND into:                                     SECOND
              <1> SC(S#, CITY)                                        S# STATUS CITY
                            primary key : S#                          S1   20   London
                            FD diagram:                               S2   10   Paris
                                                                      S3   10   Paris
                                       S#         CITY                S4   20   London
                                                                      S5   30   Athens
              <2> CS(CITY, STATUS):
                     primary key: CITY
                     FD diagram:                                                STATUS
                                    CITY        STATUS                S#         CITY

                          SC (in 3NF)          CS (in 3NF)
                           S# CITY
                                               CITY          STATUS
                            S1 London
                            S2 Paris            Athens         30
                            S3 Paris            London         20
                            S4 London           Paris          10
                            S5 Athens           Rome           50



Wei-Pang Yang, Information Management, NDHU                                              7-21
                                     Normal Forms: 3NF (cont.)
                        Note:
                                (1) Any 2NF diagram can always be reduced to a collection
                                    of 3NF relations.
                                (2) The reduction process from 2NF to 3NF is non-loss
                                    decomposition.
                                (3) The collection of 3NF relations may contain "more
                                    information" than the equivalent 2NF relation.




Wei-Pang Yang, Information Management, NDHU                                                 7-22
                           Good and Bad Decomposition
         Consider                       transitive FD     STATUS              Suppose 1. CITY is the main office of the supplier.
                                    S#                                                 2. STATUS is some factor of CITY
                                                             CITY

                                                                       STATUS
                        STATUS
                                                           S#            CITY
          S#              CITY

        ①Decomposition A:                                  ② Decomposition B:
                                                                                                     ③ Decomposition C:

 SC:                                                 SC:
             S#               CITY                              S#           CITY                       S#    -> status
 CS:
          CITY               STATUS
                                                     CS:
                                                                S#           STATUS
                                                                                                        city -> status

                    Good !                                           Bad !                             ‘Good’ or ‘Bad’ is
       (Rome, 50) can be inserted.                  (Rome, 50) can not be inserted                     dependent on item’s
                                                    unless there is a supplier located at              semantic meaning!!
                                                    Rome.

Wei-Pang Yang, Information Management, NDHU                                                                                     7-23
                                              Why Normal Form?
                      Avoid update anomalies
                      Consider the SSP(S#, SNAME, P#, QTY)
                             • Common sense will tell us
                                       SS(S#, SNAME) & SP(S#, P#, QTY) is a better design.
                      The concepts of FD, 1NF, 2NF, 3NF and BCNF to formalize common
                          sense.
                      Mechanization is possible!
                        • i.e., we can write a program to do the work of normalization for us!




Wei-Pang Yang, Information Management, NDHU                                                      7-24
                                          Concluding Remarks
         The technique of non-loss decomposition is an aid to logical database design .
         The overall processes of Normalization:
                • step1: eliminate non-full dependencies.
                • step2: eliminate any transitive FDs.
                • step3: eliminate those FDs in which the determinant is not a candidate key.
         General objective:
                • reduce redundancy, and then
                • avoid certain update anomalies.
         Normalization Guidelines are only guidelines.
                • Sometime there are good reasons for not normalizing all the way.


Wei-Pang Yang, Information Management, NDHU                                                7-25
                                   Concluding Remarks (cont.)
        <e.g.> NADDR (NAME, STREET, CITY, STATE, ZIP)
                                                       STREET

                                                       CITY              not in 5NF
                                 NAME
                                                                         (in which NF?)
                                                       STATE


                                                        ZIP

                                                                decompose

                        NSZ (NAME, STREET, ZIP)                 ZCS (ZIP, CITY, STATE)

                            NAME              STREET               ZIP           CITY

                                               ZIP                              STATE


Wei-Pang Yang, Information Management, NDHU                                               7-26
                                   Concluding Remarks (cont.)
         However, (1) STREET, CITY, STATE are almost required together.
                                 (2) so ZIP do not change very often, such a decomposition seems unlikely
                                     to be worthwhile.
         Not all redundancies can be eliminate by projection.




Wei-Pang Yang, Information Management, NDHU                                                             7-27

								
To top