Docstoc

09

Document Sample
09 Powered By Docstoc
					Chapter 9



    Database
    Management
    Systems


             1
  Objectives for Chapter 9
 Problems in the flat-file approach to data management
   Why these gave rise to the database concept
 Relationships among the elements of the database environment
 Characteristics of the relational database model
 Three stages in database design including
   Conceptual design
   Logical design
   Physical design
 Anomalies caused by unnormalized databases and the need for data
  normalization
 Features of distributed databases
   Considerations in deciding on a particular database
     configuration


                                                                 2
 Flat-File Versus Database
        Environments
 Computer processing involves two components: data
  and instructions (programs).
 Conceptually, there are two methods for designing the
  interface between program instructions and data:
   File-oriented processing: A specific data file was
     created for each application
   Data-oriented processing: Creates a single data
     repository to support numerous applications
 Disadvantages of file-oriented processing include
  redundant data and programs and varying formats for
  storing the redundant data.
 The format for similar fields may vary because the
  programmer used inconsistent field formats.
                                                          3
               Flat-File Environment
User 1                                 Data
Transactions
                  Program 1            A,B,C
User 2
Transactions
                  Program 2
                                       X,B,Y
User 3
Transactions
                 Program 3
                                       L,B,M
      Data Redundancy &
       Flat-File Problems

Data Storage - creates excessive storage
 costs of paper documents and/or magnetic
 form
Data Updating - any changes or additions
 must be performed multiple times
Currency of Information - potential problem
 of failing to update all affected files
Task-Data Dependency - user’s inability to
 obtain additional information as his or her
 needs change
                                          5
               Database Approach
User 1
                                   Database
Transactions
                 Program 1
                                     A,
User 2
                             D       B,
Transactions                 B       C,
                 Program 2   M       X,
                             S       Y,
User 3                               L,
Transactions                         M
                 Program 3
  Advantages of the
  Database Approach
Data sharing/centralize database resolves flat-file problems:
 No data redundancy - Data is stored only once, eliminating
  data redundancy and reducing storage costs.
 Single update - Because data is in only one place, it
  requires only a single update procedure, reducing the time
  and cost of keeping the database current.
 Current values - A change to the database made by any
  user yields current data values for all other users.
 Task-data independence - As users’ information needs
  expand beyond their immediate domain, the new needs can
  be more easily satisfied than under the flat-file approach.
                                                                7
Disadvantages of the
 Database Approach
 Can be costly to implement
   additional hardware, software, storage, and
    network resources are required
 Can only run in certain operating
  environments
   may make it unsuitable for some system
    configurations
 Because it is so different from
  the file-oriented approach, the database
  approach requires training users
   may be inertia or resistance
                                                  8
      Elements of the Database Approach
                                  Database
             System Development   Administrator
             Process


                   Applications
                     User              DBMS
    Transactions
                     Programs         Data
                                      Definition     Host
U                                     Language       Operating
S   Transactions     User                            System
                                      Data
E                    Programs         Manipulation
R                                     Language
S   Transactions      User
                                      Query
                      Programs        Language       Physical
                                                     Database


              User Queries
           DBMS Features
User Programs - make the presence of the DBMS
 transparent to the user
Direct Query - allows authorized users to access
 data without programming
Application Development - user created
 applications
Backup and Recovery - copies database
Database Usage Reporting - captures statistics on
 database usage (who, when, etc.)
Database Access - authorizes access to sections of
 the database                                    10
    Internal Controls and
            DBMS

The purpose of the DBMS is to provide
 controlled access to the database.
The DBMS is a special software system
 programmed to know which data
 elements each user is authorized to
 access and deny unauthorized
 requests of data.


                                         11
    Data Definition Language
              (DDL)
DDL is a programming language used to define
 the database to the DBMS.
The DDL identifies the names and the relationship of
 all data elements, records, and files that constitute
 the database.
Viewing Levels:
  internal view - physical arrangement of records
    (1)
  conceptual view - representation of database
    (1)
  user view - the portion of the database each
    user views (many)                                 12
  ANSI Model

 External
User Views

Conceptual
  Model


 Internal
  Model



 Physical
 Database
      Data Manipulation
       Language (DML)

DML is the proprietary programming
 language that a particular DBMS uses to
 retrieve, process, and store data.
Entire user programs may be written in
 the DML, or selected DML commands can
 be inserted into universal programs, such
 as COBOL and FORTRAN.


                                             14
       Query Language

The query capability permits end users
 and professional programmers to access
 data in the database without the need for
 conventional programs.
IBM’s Structured Query Language (SQL)
 is a fourth-generation language that has
 emerged as the standard query language.


                                         15
Three Steps in Designing
      a Database
Prepare the conceptual model
  Identify the entities
  Identify the relationships between the entities
  Prepare the ER diagram
Specify the logical design
  Select the logical database model (relational)
  Transform the conceptual data model using a
   logical database model
Implement the physical design
  Physical structures
  Access methods                                    16
Phase 1


   Prepare the Conceptual
            Model
                   Draw an ERD to
                   capture the
                   process.



                              17
ER-Diagram Symbols
  ER-Diagram Symbols



Entity             Relationship   Attribute




     Primary Key
Example of a Relationship Linking
         Two Entities




           1                          M
CUSTOMER            places                  ORDER




                             Order Number     Item #

 Name      Number
               An Entity

...is an
   individual object
   concept
   event
...may be a specific
   tangible object
   intangible object
Entity Class is a collection of entities with
 similar attributes.
                                                 21
             Attributes

A property of an entity that we choose to record (of
interest to an organization).
CUSTOMER (entity)           PRODUCT (entity)
customer #                  product #
name                        description
address                     finish
telephone no.               price
balance                     qty. on hand

                                                22
               Cardinalities
Entity            Relationship       Entity

 Sales-    1                     1     Car
                    Assigned
 person


           1                     M
Customer            Places            Order




           M                     M
Vendor              Supply           Inventory
     ER-Diagram using REA Model

    R                                E                                   A
            M                  M               M                   1
                  Line items        Sales                                 Sales
Inventory                                          Party to
                                               M                         person
                                         M
                                                               1
                                    Pays for       Made to
                                                                       Customer
                                                               1
                                         M
                                               M    Received
            1                  M     Cash             from
                Increases
  Cash                             Collections M               1        Cashier
                                                   Received                34
                                                      by
Phase 2

     Specify the Logical
      Design

                    Create relational
                    tables.




                                 25
  Logical Data Structures

A particular method used to organize records
 in a database is called the database’s
 structure.
The objective is to develop this structure
 efficiently so that data can be accessed
 quickly and easily.
Four types of structures are:
  hierarchical (AKA the tree structure)
  network
  relational
  object-oriented                          26
    The Relational Model

The relational model portrays data in the
 form of two dimensional tables:
  relation - the database table
  attributes (data elements) - form
   columns
  tuples (records) - form rows
  data - the intersection of rows and columns


                                                 27
RESTRICT - filtering out rows,   PROJECT - filtering out columns,
such as the purple               such as the purple




JOIN


  X1    Y1          Y1     Z1         X1     Y1    Z1

  X2    Y2          Y2     Z2         X2     Y2    Z2

  X3    Y1          Y3     Z3         X3     Y1    Z1
Properly Designed Relational Tables

No repeating values - All occurrences at
 the intersection of a row and column are a
 single value.
The attribute values in any column must all
 be of the same class.
Each column in a given table must be
 uniquely named.
Each row in the table must be unique in at
 least one attribute, which is the primary key.
                                            29
       Relational Model Data
        Linkages (>1 table)

 No explicit pointers are present. The data are viewed as a
  collection of independent tables.
 Relations are formed by an attribute that is common to
  both tables in the relation.
 Assignment of foreign keys:
   if 1 to 1 association, either of the table’s primary keys may
     be the foreign key.
   if 1 to many association, the primary key on one of the
     sides is embedded as the foreign key on the other side.
   if many to many association, may embed foreign keys or
     create a separate linking table.

                                                            30
    Three Types of Anomalies

 Insertion Anomaly: A new item
  cannot be added to the table until
  at least one entity uses a
  particular attribute item.
 Deletion Anomaly: If an
  attribute item used by only one
  entity is deleted, all information
  about that attribute item is lost.
 Update Anomaly: A modification
  on an attribute must be made in
  each of the rows in which the
  attribute appears.
 Anomalies can be corrected by
  creating relational tables.

                                       31
 Advantages of Relational
         Tables
Removes all three anomalies
Various items of interest (customers,
 inventory, sales) are stored in separate
 tables.
Space is used efficiently.
Very flexible. Users can form ad hoc
 relationships.

                                            32
The Normalization Process
A process which systematically splits
 unnormalized complex tables into
 smaller tables that meet two conditions:
  all nonkey (secondary) attributes in the table
   are dependent on the primary key
  all nonkey attributes are independent of the
   other nonkey attributes
When unnormalized tables are split and
 reduced to third normal form, they must
 then be linked together by foreign keys.
                                                    33
Steps in Normalization
Table with
repeating groups     Remove
                    repeating
                     groups
First normal
form 1NF
                     Remove
                      partial
                   dependencies
Second normal
form 2NF
                      Remove
                     transitive
                   dependencies
Third normal
form 3NF
                       Remove
                      remaining
Higher normal         anomalies
forms
    Accountants and Data
       Normalization
The update anomaly can generate conflicting and
  obsolete database values.
The insertion anomaly can result in unrecorded
  transactions and incomplete audit trails.
The deletion anomaly can cause the loss of
  accounting records and the destruction of audit
  trails.
 Accountants should have an understanding of the
 data normalization process and be able to
 determine whether a database is properly
 normalized.
                                             35
Phase 3


  Implement the Physical
         Design

                   Decide about
                   software and
                   hardware.



                                  36
Physical Database Design

Transition from theoretical to physical
 aspects of database
  IS  IT
Decisions about software and hardware
  Implementation
Populate the database with data
Produce physical user views (multiple)

                                           37
        Data Structures

allow records to be located, stored, and
 retrieved and allow movement through
 the database. Two components:
  The organization of a file is the physical
   arrangement of records.
  The access method is the technique used
   to locate records and to navigate through the
   database.

                                               38
Distributed Data
Processing


                   Central            Centralized
                   Site                Database




      Site A                 Site B      Site C
       Distributed Data
          Processing
DP is organized around several
 information processing units (IPUs)
 distributed throughout the organization
 and placed under the control of the
 end users.
DDP does NOT mean
 Decentralization!
  IPUs are connected to
   one another and
   coordinated.
                                           40
  Potential Advantages of
            DDP

Cost reductions in hardware and data
 entry tasks
Improved cost control responsibility
Improved user satisfaction since
 control is closer to the user level
Backup of data can be improved
 through the use of multiple data storage
 sites
                                            41
Potential Disadvantages of
            DDP

Loss of control
Mismanagement of organization-wide
 resources
Hardware and software incompatibility
Redundant tasks and data
Consolidating incompatible tasks
Difficulty attracting qualified personnel
Lack of standards                           42
Centralized Databases in DDP
         Environment

The data is retained in a central
 location.
Remote IPUs send requests for data.
Central site services the needs of the
 remote IPUs.
The actual processing of the data is
 performed at the remote IPU.

                                          43
         Data Currency

Occurs in DDP with a centralized database
During transaction processing, the data
 will temporarily be inconsistent as a
 record is being read and updated.
Database lockout procedures are
 necessary to keep IPUs from reading
 inconsistent data and from writing over a
 transaction being written by another IPU.

                                         44
      Distributed Databases:
            Partitioning
Splits the central database into segments that
 are distributed to their primary users
Advantages:
  users’ control is increased by having data
   stored at local sites
  transaction processing response time is
   improved
  the volume of transmitted data between IPUs is
   reduced
  reduces the potential data loss from a
   disaster
                                              45
The Deadlock Phenomenon

Especially a problem with partitioned
 databases
Occurs when multiple sites lock each
 other out of data that they are currently
 using
  One site needs data locked by another site.
Special software is needed to analyze and
 resolve conflicts.
  Transactions may be terminated and have to be
   restarted.
                                                 46
The Deadlock Phenomenon
Locked A, waiting for C          Locked E, waiting for A

A,B
                                                     E, F




              C,D

                 Locked C, waiting for E
                                                            47
   Distributed Databases:
         Replication

The duplication of the entire
 database for multiple IPUs
This method is effective for situations with
 a high degree of data sharing, but no
 primary user, and supports read-only
 queries.
The data traffic between sites is reduced
 considerably.

                                           48
Concurrency Problems and
     Control Issues
Database concurrency is the presence
 of complete and accurate data at
 all IPU sites. With replicated
 databases, maintaining current data at
 all locations is a difficult task.
Time stamping may be used to
 serialize transactions and to prevent
 and resolve any potential conflicts
 created by updating data at various
 IPUs.
                                          49

				
DOCUMENT INFO
Shared By:
Categories:
Tags:
Stats:
views:3
posted:11/27/2011
language:English
pages:49