rdbms (PDF)

Document Sample
rdbms (PDF) Powered By Docstoc
					     Subject:          Relational Database Management Systems


              Functional Dependencies
Prepared by Mr. Shakil Ahmed
Department of Computer, SSUET
(Applicable to all sections)

                Steps of Database Design
  • Requirements Analysis
       – user needs; what must database do?
  • Conceptual Design
       – high level description (often done w/ER model)
  • Logical Design
       – translate ER into DBMS data model
  • Schema Refinement
             i t           li ti
       – consistency, normalization
  • Physical Design
       – indexes, disk layout
  • Security Design
       – who accesses what
       Limitations of E-R Designs

• Provides a set of guidelines, does not
  result in a unique database schema
• Does not provide a way of evaluating
  alternative schemas
• Normalization theory provides a
                         y p
  mechanism for analyzing and refining
  the schema produced by an E-R

Why Mapping from ER to Relation is not perfect?

• In the relational database model the
  relations must hold the following
  p p
  – Relations should have no duplicate
  – Tuples have no ordering associated with
    them and
  – each element in the relation is atomic.
• Relations that satisfy these basic
  requirements may still have some
  undesirable properties, for example,
  data    redundancy     and    update
• ERD is a one solution, but relations
  obtained from the ERD still requires
  some kind of check that relations does
  not have these types of problems
  when database is in operation.
• Goal of Relational Database Design
   – store information without unnecessary
   – retrieve information easily
• Bad design may lead to:
   – redundant       storage     (repetition   of
   – Update, insertion and deletion anomalies
   – inability to represent certain information

 Let us consider the following relation student.


85001   Smith   1, Main   CP302   Database   Gupta        102

85001   Smith   1, Main   CP303   Comm       Wilson       102

85001   Smith   1, Main   CP304   SW Engg    Williams     1024

85001   Jones   12, 7th   CP302   Database   Gupta        102
                      Undesirable Features
•     Repetition of information
•     Update Anomalies
•     Insertion Anomalies
•     Deletion Anomalies (Loss of Useful Information)

• The above relation may be easily decomposed into three
  relations to remove most of the above undesirable
      S (sno, sname, address)
        (cno, cname instructor
      C (cno cname, instructor, office)
      SC (sno, cno)
• This type of decomposition requires Normalization which
  is based on the concepts of Functional Dependencies
  which is used to define concepts of the “goodness” and
  the “badness” of the individual relational schemas. This
  analysis is done through FD.

                              Repeating Groups

    A repeating group is an attribute (or set of attributes) that can have
    more than one value for a primary key value

    Example: We have the following relation that contains staff and department details
    and a list of telephone contact numbers for each member of staff.
      staffNo job      dept   dname        city        contact number
      SL10 Salesman    10     Sales        Stratford   018111777, 018111888, 079311122
      SA51 Manager     20     Accounts     Barking     017111777
      DS40 Clerk       20     Accounts     Barking
      OS45 Clerk       30     Operations   Barking     079311555

     Repeating Groups are not allowed in a relational design, since all attributes
     have to be ‘atomic’ - i.e., there can only be one value per cell in a table!
          Functional Dependency
• Helps us normalizing the relations.
• A Functional dependency is a constraint between
  two sets of attributes from the database
• Consider a relation R that has two attributes A
  and B. The attribute B of the relation is
  functionally dependent on the attribute A if and
  only if for each value of A no more than one value
  of B is associated.

• A→B
   – This should be read as ‘A determines B’ or
                         y    p
     ‘B is functionally dependent on A’. A is
     called the determinant and B is called the
     object of the determinant, or we might say
     that A functionally determines B.
   – Values of B component depend on or
     determined by values of A component
• The abbreviation of the Functional
  dependency is F.D of f. d
• A FD should imply on all legal extensions
  of the Relational Schema R
                          Example I

SSN               ENAME             PNUMBER           HOURS
123456789         A, B              1                 32.5
                                    2                 7.5
666886789         C,D               3                 40
887266886         E,F               1                 20
                                    2                 20
737373737         H,I               2                 90
                                    4                 40

•The Following Functional Dependencies hold as by the semantics of the
     •SSN    ENAME
     •{SSN, PNUMBER}       HOURS
         And many other are possible
•Requires an algorithm or procedure to deduce all, since a relation can
have many attributes

• Hence an FD can not be inferred
  automatically from a given relation
  extension r but must be defined
  explicitly by someone who knows the
  semantics of the attributes R.
• Example:
   –The next slide clarifies this idea
                 Example II
            (Semantics of Attributes)

Teacher                Course              Text
Smith                  Data Structures     Bartram
Smith                  Data Management     Al-Nour
Hall                   Compilers           Hoffman
Brown                  Data Structures     Augenthaler
• Text  Course
• What about Course     Text
• What about Teacher     Course

• Compound Determinants:
• Full Functional Dependency:
   – X-->Y is a full functional dependency if the
     removal of any attribute A from X removes the
                       not X-{A} --> Y

  order#   line# qty      price
  A001     001    10       200
  A002     001    20       400      Full Functional Dependencies
  A002     002    20      800       (Order#, line#) → qty
  A004     001    15      300       (Order#, line#) → price
• Partial Functional Dependency
   – X-->Y is a partial dependency if some
     attribute A may be removed without
     removing the dependency
                  X-{A} --> Y

                             Full Functional Dependencies
 student# unit# room grade
                             (student#, unit#) → grade
 9900100 A01 TH224 2
 9900010 A01 TH224 14
 9901011 A02 JS075    3      Partial Functional Dependencies
 9900001 A01 TH224 16        unit# → room

        Super-key & Candidate Key

• K is a super-key for relation schema R
   – if and only if K → R
• K is a candidate key for R if and only if
  – K → R, and
  – for no X ⊂ K, X → R
     Diagrammatic Notation for FD

 Teacher          Course           Text

           Trivial Dependencies
• A functional dependency is trivial if and
  only if the RHS is a subset ( not necessary
  proper subset) of the LHS
  – E.g.
     • customer-name, loan-number → customer-name
     • customer-name → customer-name
  – In general, X → Y is trivial if Y ⊆ X
           Bs            As
• trivial: B’s subset of A’s
• Nontrivial: at least one of the B’s not
  among A’s
• Completely nontrivial: none of the B’s part
  of A’s
    Closure of Functional Dependencies (F+)
• Why?
  – The set of all functional dependencies logically implied by
    F is the closure of F (denoted by F+)
  – Example
                     (ENAME, BDATE, ADDRESS,
    NUMBER},               DNUMBER {DNAME,
  – Then we can infer the following additional functional
    dependencies from F
     SSN SSN
• To determine all of these we need a systematic way
  to infer new dependencies from a given set of
• Therefore Armstrong gives some rules to infer these
  called Armstrong Inference Rules or Armstrong

             Armstrong Inference Rules
• Reflexivity
   – if β ⊆ α, then α → β
   – This rules states that a set of the attributes always
     determines itself or any of its subsets
• Augmentation
   – if α → β, then γ α → γ β
   – This rule states that adding the same set of attributes
     to both the left and right hand sides of a dependency
     results in an another valid dependency
• Transitivity:
   – If α → β, and β → γ, then α → γ
     This l      t t th t F     ti  ld     d   i
   – Thi rules states that Functional dependencies are
• These rules are
   – sound (generate only functional dependencies that
     actually hold) and
   – complete (generate all functional dependencies that
                        Derived Rules
• The most important additional axioms are
   – Decomposition or Projective Rule
       • If α → β γ holds, then α → β holds and α → γ holds
       • This rule states that we can remove any attributes
         from the right hand side of the dependency;
         applying that this rule repeatedly can decompose
         the FD into the set of dependencies.
   – Union or Additive Rule
       • If α → β holds and α → γ holds, then α → β γ holds
       • This rule is the opposite of the above that we can
         combine the set of dependencies into a single FD
   – Pseudo-transitive Rule
       • If α → β holds and γ β → δ holds, then α γ → δ

• Let R = (A, B, C, G, H, I)
• Let
    – F = { A → B,
         A→C  C,
       CG → H,
       CG → I,
         B → H}
• some members of F+
    – A→H
       • by transitivity from A → B and B → H
    – AG → I
       • by augmenting A → C with G to get AG → CG and then transitivity
         with CG → I
    – CG → HI
       • from CG → H and CG → I : “union rule” can be inferred from
         definition of functional dependencies, or Augmentation of CG → I to
         infer CG → CGI, augmentation of G → H to infer CGI → HI, and
         then transitivity
     Algorithm for Attribute Closure

• An effective way to find out the closure of FDs
• The Algorithm is
   CLOSURE [k, S] := K;
   do “forever” ;
     For each FD X    Y in S
        If X is a subset of CLOSURE [K, S]
        then CLOSURE[K, S] = CLOSURE[K, S] UNION Y;
     if CLOSURE[K, S] did not change on this iteration
     /* Computation Complete */ leave loop ;


• Given the set of FDs compute the
  closure {A B}+ of the set of attributes
  {A, B} under this set of FDs.
  – FDs are
           A B,E CF, B E, CD EF
                    Example II
• R = (A, B, C, G, H, I)
• F = {A → B
       CG → H
       CG → I
       B → H}
• (AG)+
   1. result = AG
   2. result = ABCG (A → C and A → B)
   3. result = ABCGH      (CG → H and CG ⊆ AGBC)
   4. result = ABCGHI     (CG → I and CG ⊆ AGBCH)

          Uses of Attribute Closure
• Testing for Superkey
  – To test if X is a superkey, we compute X+, and check
    if X+ contains all attributes of R
• Testing functional dependencies
  – To check if a functional dependency X → Y holds (or,
    in other words, is in F+), just check if Y ⊆ X+.
  – That is, we compute X+ by using attribute closure, and
    then check if it contains Y.
  – Is a simple and cheap test, and very useful
• Computing closure of F
  – For each Z ⊆ R, we find the closure Z+, and for each
    S ⊆ Z+, we output a functional dependency Z → S.
   • Is AG a candidate key?
      1.Is AG a super key?
         1 Does AG → R? == Is (AG)+ ⊇ R
      2.Is any subset of AG a superkey?
         1.Does A → R? == Is (A)+ ⊇ R
         2.Does G → R? == Is (G)+ ⊇ R
• Given a relational schema R(A, B, C) and
  F {AB → C C → B} What are the keys of R?
  F={AB     C,       B}. Wh t     th k     f
  A+ = {A} B+ ={B} C+ = {C, B}
  {AB}+ = {A, B, C}     {AC}+ = {A,B,C}
  {BC}+ = {B,C} {A,B,C}+ = {A,B,C}
  Thus the keys for R are (AB) or (AC)

  • Given                      X          XF+
    F: AB → C
                               A       {A, D, E}
    A→D                        AB      {A, B, C, D, E}
    D→E                                     (Hence AB is a key)
    AC → B                     B       {B}
                               D        {D, E}

 Is AB → E a FD? Yes
 Is D→ C a FD?        No
 Result: XF+ allows us to determine FDs entailed by F of the

 form X → Y
• R = {A, B, C, D, E}
                                      • Is AD a key for R?
• Given                                 AD+ = AD
    – F = { B →CD, D → E, B → A,        AD+ = ABD and B is
            E → C, AD →B }              a key, so Yes!
• Is B → E in F+ ?
                                                    did t
                                      • I AD a candidate
       B+ = B
                                        key    for R?
       B+ = BCD
       B+ = BCDA                         A+ = A
       B+ = BCDAE … Yes! and B           A not a key, so Yes!
       is a key for R too!
•   Is D a key for R?                 • Is ADE a candidate
       D+ = D                           key for R?
       D+ = DE                              No! AD is a key,
       D+ = DEC                         so ADE is a
                                        superkey, but not a
                                        cand. key

                      Cover & Equivalence
• A set of functional dependencies E is said to be covered by
  a set of functional dependencies F, if every FD in E is also
  in F+. That is, every dependency in E can be inferred from
• Example
  F1 = { A → B A → C }
  F2 = { A → B, B → C }
  F1 is covered by F2.
• Two sets of functional dependencies E and F are said to be
  equivalent if E+ = F+. Hence, equivalence means that every
  FD in E can be referred from F, and every FD in F can be
  inferred from E.
    – Examples
      F1 and F2 are not equivalent.
      F1 = { A → B, A → C }
      F2 = { A → B, B → C }
     F3 and F4 are equivalent.
      F3 = { A → C, C → B, B → C}
      F4 = { A → B, B → C, C → B }
 Irreducible Sets (Minimal set or Canonical cover )
            of Functional Dependencies
• A set of FDs is set to be irreducible or minimal if
  and only if it satisfies the following three
     – The right-hand side (the dependent) of every FD in S
       involves just one attribute (i. e it is a singleton set)
     – The left-hand side ( the determinant) of every FD in S
       is irreducible in turn- meaning that no attribute can be
       discarded from the determinant without changing the
         l         (i.    ith t         ti       i t
       closure S+ (i e without converting S into some set not t t
       equivalent to S). We will say that such an FD is left-
     – No FD in S can be discarded from S without changing
       the closure S+ (i.e., without converting S into some set
       not equivalent to S)


• { A → B, A → C } is minimal
• {B → A C} is not minimal violating the first condition
                   minimal,                    condition.
• {A B → C, D → F} is minimal
• {A B → C, A → B} is not minimal, violating the third
• { A → B, B → C, A → C} is not minimal, violating the
  second condition.

    {A → B, B → C} is a minimal cover of { A → B, B → C, A → C}
    {A → C, A → B} is a minimal cover of {A B → C, A → B}.
• Given the relation R with attributes A, B, C, D and the
  FDs are
   – A     BC,  C,  B,
           BC B C A B AB            C AC D
• Compute irreducible set of dependencies
• Steps
   – The first step is rewrite the FDs such that each has a singleton
     right-hand side
         • A B, A C, B C, A B, AB C, AC D
         • Since FD A B occurs twice, so one occurrence can be eliminated.
   – Next attribute C can be eliminated from the L.H.S of the FD
     AC D because we have A C, so A AC by Augmentation,
     and we are given AC D, so A D by transitivity; thus the C on
     the left-hand side of AC D is redundant.
   – Next we observe that the FD AB C can be eliminated, because
     again we have A C, so AB CB by augmentation, so AB C by

   – Finally the FD A C is implied by the FDs A B and
     B C, so it can also be eliminated.
   – Therefore the final FDs are
         • A B
         • B C
         • A D
   – Which is the irreducible set
• Minimal cover isn't necessarily unique the algorithm to
  compute a minimal cover is non-deterministic (that is to
  say, it involves arbitrary choices)

• Find a minimal cover for F, where
  F is
  AB → C ACD → B CG → BD
    C → A D → EG         CE → AG
     BC → D     BE → C


Consider F = { AB → C, A → D, BD → C,
                       D → BG, AE → F}
Step 1:
F = { AB → C, A → D, BD → C, D → B, D → G, AE → F}
Step 2
The only FDs to be considered are
        AB → C, BD → C, AE → F.
For AB → C, we want to find whether A → C or B → C hold or not.
Because A → D, D → B ╞ A → B. Because A → B ╞ A → AB.
Because A → AB, AB → C ╞ A → C.
F - {AB → C} ∪ {A → C} ≡ F
           The Synthesis of Relations
• Given a set of attributes with certain
  functional dependencies, what relations
  should we form?
• Example: A and B are two attributes
  – If A     B and B     A
     • A and B have a one-to-one attribute relationship
  – If A     B but B not
             B, b t    t     A
     • A and B have a many-to-one attribute relationship
  – If A not    B and B not       A
     • A and B have a many-to-many attribute

                     Case Study
   A simplified COMPANY relational database schema.
Database state for the relational database
      Schema of Previous Figure

Two relation schemas suffering from update
We need to normalize this?

Shared By: