Docstoc

Week07

Document Sample
Week07 Powered By Docstoc
					     Lecture No. 7
An Introduction to Database




                     CSE1720 Semester 1 2005 Database Week 7 / 1
                    April Events
Wednesday 21st April :Semester 1 ‘HECS liability
statement’ forwarded to HECS liable students




                                  CSE1720 Semester 1 2005 Database Week 7 / 2
              Objectives of this Lecture
  This lecture will be focussed mostly on database as a
  means of storing and retrieving data.

  The term ‘database’ is always associated with the term
  ‘database management system’ (DBMS) which indicates that
  not only is data stored but it is controlled and managed by
  software.

• You will (probably) be using Microsoft Access for your
  assignment. This DBMS is aimed at the lower end of data
  volumes and processing.
• You will also use Oracle in your laboratory work. This is a
  medium to large client-server DBMS

                                      CSE1720 Semester 1 2005 Database Week 7 / 3
CSE1720 Semester 1 2005 Database Week 7 / 4
            Before I start the ‘hard sell’
United Airlines is the world’s largest air carrier

It manages many complex operations

The company has nearly 60 separate applications from back
office services to customer-facing Web applications which
run on 9 different database platforms

The Company had IT staff cutbacks in 2001, recognised the
need for a serious plan for simplification, and has now
started on a multiyear process of migrating and
consolidating all the systems on two database platforms

                                      CSE1720 Semester 1 2005 Database Week 7 / 5
                 Some Aspects of Data
A major benefit of Computing is Storage and Retrieval of Data
We need to have answers to these questions -
   WHAT are we going to produce as ‘output’ ?

   WHAT data is to be stored ?

    WHAT is the level of detail (Name, Address, Height …?)

    HOW long is to be retained - and is it ‘volatile’?

    WHAT form is it to be stored - on line, off line ?

    WHO is going to access it ?


                                       CSE1720 Semester 1 2005 Database Week 7 / 6
            Some Aspects of Data
HOW often will it be accessed ?

WHAT security of access is to be applied ?

WHAT are the starting volumes - what is the growth or
   decay factor ?

WHAT response time is expected ?

HOW accurate is the data content ?

HOW current is it ? (e.g. on line sales retailing)
                                   CSE1720 Semester 1 2005 Database Week 7 / 7
                   Information / Data

A General Definition:

DATA - raw (unprocessed or partly processed) facts which
         represent the state of entities (things) which have
         occurred

INFORMATION - data which has been processed into a form
                 USEFUL TO THE USER


What is Information to one user may be Data to another user.



                                      CSE1720 Semester 1 2005 Database Week 7 / 8
                         Audit Trail
General Definition:
    ‘The presence of data processing media and procedures
     which allow any and / or all transaction(s) to be traced
     through ALL STAGES of processing’

This infers that the following devices / techniques are in place:
  1. A logging device which ‘traps’ all transactions
  2. Some way of tagging each transaction so that it can be
      identified
  3. Some way of retrieving the required transaction(s)
  4. Some way of archiving - what is the required period ?
  5. Control procedures and processes to ensure integrity

                                       CSE1720 Semester 1 2005 Database Week 7 / 9
                       Data Base

A Database is a shared collection of Inter-Related data
designed to meet the needs of multiple types of users and
applications.
This implies that multiple user VIEWS can be defined

Data stored is independent of the programs which use it

Data is structured to provide a basis for future applications

DATABASE = Stored Collection of Related Data
         May be physically distributed


                                     CSE1720 Semester 1 2005 Database Week 7 / 10
          Database Management System
A DBMS is SOFTWARE which provides access to the database in
  an integrated and controlled manner

A DBMS must contain :
     1. Data Definition and Structure capabilities
     2. Data Manipulation capabilities




                                       CSE1720 Semester 1 2005 Database Week 7 / 11
        Data Definition and Manipulation
Data Definition Language (DDL)
     used to describe data at the database level
     Schema level - complete database description
     Sub-Schema level - user views (restricted)

Data Manipulation Language (DML)
     Provides for    Create          Insert
                     Update          Retrieve (extract)
                     Delete          Drop
                     Modify          Calculation (limited)
                     Report
                                               capabilities
                                   CSE1720 Semester 1 2005 Database Week 7 / 12
     Three Level Architecture

1. External Schema    - User Views

2. Conceptual Schema - Total database
                             description

3. Internal Schema     - Physical database




                           CSE1720 Semester 1 2005 Database Week 7 / 13
           The Many Faces of Database
Databases can be:

     1. Transaction Intensive    -ATM’s Checkouts
     2. Decision Support         - Browsing for trends
     3. Mixed-Load              - Combination of both
     4. Small databases         -  Few thousand records
     5. Very Large Database     - Many millions or trillions
          (VLDB)                  of records (Banks)
     6. Non Traditional         - Weather bureau, flight plans
                                  Computer Aided Design data



                                       CSE1720 Semester 1 2005 Database Week 7 / 14
           The Many Faces of Database
• They can be:

   Data Warehouses
   Data Marts and Data Martlets

• How is a database size measured ?
  There are a number of ‘measurements’
    Raw data size
    Total database size
    Total usable disk space size (which includes media
    protection such as mirroring)


                                    CSE1720 Semester 1 2005 Database Week 7 / 15
         The Many Faces of Database



Hardware        Database   Raw Data          Total Disk

HP9000         Oracle      100GB             643GB
Digital 8400   Oracle      100GB             361GB

IBM SP2        DB2/6000    100GB             377GB

NCR5100        Teradata    100GB            880GB

NCR5100        Teradata    1,000GB           3,280GB


                              CSE1720 Semester 1 2005 Database Week 7 / 16
                 DBMS Capabilities

    Querying Capabilities

    Data Displays (Presentation)

    Data entry

    Data Validation

    Data Deletion

    Committing Procedures (of changes)

AND Data Integrity, Security, Consistency and Concurrency
    Capabilities

                                   CSE1720 Semester 1 2005 Database Week 7 / 17
           Important Database Features
• Data Integrity
• Data Independence
• Referential Integrity - Relational Database Model
• Concurrency Control - Multiple Users
• Consistency
             - multi users
             - distributed database
             - replicated database
             - partitioned database
             - mobile database
• Recovery from failure (Transaction and Media)
• Security
                                     CSE1720 Semester 1 2005 Database Week 7 / 18
            File Processing (non database)

                                    Purchasing
      Billing Program
                                    Program



Customers     Accounts     Buyers        Inventory         Vendors
              Receivable



  Accounts Payable         Sales Order                   Payroll
                           Processing




 Vendors      Invoice      Customers     Inventory         Employee

                                          CSE1720 Semester 1 2005 Database Week 7 / 19
   File Processing With Database Technology
  Orders Department                 Accounting Department


Program     Program       Program      Program          Program
   A           B            C             A               B



          Order Filling                          Invoicing
           system                                 system



Customers    Inventory    Back           Inventory              Customer
              Master      Orders         Pricing                Master


                                          CSE1720 Semester 1 2005 Database Week 7 / 20
File Processing With Database Technology


        Payroll Department

     Program         Program
        A               B



               Payroll
               System



               Employee
               Master


                             CSE1720 Semester 1 2005 Database Week 7 / 21
              InterRelated File Outline



Salesperson   Sales              Customer             Accounts
              Stats                                   Receivable




  Buyers        Inventory        Vendor                 Accounts
                                                        Payable




                      Purchase        General
                      Order           Ledger

                                     CSE1720 Semester 1 2005 Database Week 7 / 22
           Conceptual Data Model

                 customer


order                                   invoice



                 product                           work order



raw material                vendor


                            CSE1720 Semester 1 2005 Database Week 7 / 23
                     User Views


customer           customer
                   customer                product                  vendor




  order    order              invoice                  raw material




                                        CSE1720 Semester 1 2005 Database Week 7 / 24
               Database Models - Hierarchical

   owner/parent
                                                                       owner



   child /parent
                                                                        member




child                      child/parent




                                          CSE1720 Semester 1 2005 Database Week 7 / 25
                  Data Base Models - Network

‘set’ of data
                                                  owner




                                                                            member
                                                                             owner




                                                      member

    set of data            Note: Only linked sets can be accessed
                                         CSE1720 Semester 1 2005 Database Week 7 / 26
          Data Base Models - Relational

 table          table          table                table                table
   A              B              C                     D                    E




Any table(s) can be joined to any other table(s), provided
there is a means of effecting the join
Primary key / Foreign key concept.
Minimal Data redundancy
No fixed linkages
                                       CSE1720 Semester 1 2005 Database Week 7 / 27
                       Definition of a
                    Relational Database
 • A relational database is a collection of relations or
   two-dimensional tables.

                               Database




                 Table Name: EMP        Table Name: DEPT
EMPNO   ENAME        JOB      DEPTNO   DEPTNO DNAME           LOC
 7839   KING      EXECUTIVE     10      10   ACCOUNTING       CLAYTON
7698     BLAKE    MANAGER      30       20   RESEARCH         CAULFIELD
7782    CLARK     MANAGER      10       30   SALES            PENINSULA
7566    JONES     MANAGER      20       40   OPERATIONS       GIPPSLAND



                                             CSE1720 Semester 1 2005 Database Week 7 / 28
                       Data Models




 Model of
 system
in client’s   Entity model of
   mind       client’s model

                                 Table model
                                of entity model                  Server




                                                            Tables on disk


                                     CSE1720 Semester 1 2005 Database Week 7 / 29
   Communicating with a RDBMS
          Using SQL

SQL statement
is entered          Statement is sent to
 SQL> SELECT loc         database
      FROM dept;


                                              Database



Data is displayed
LOC
-------------
CLAYTON
CAULFIELD
PENINSULA
GIPPSLAND




                              CSE1720 Semester 1 2005 Database Week 7 / 30
Relational Database Management
             System


                                 Server




                User tables                    Data
                                            dictionary


                     CSE1720 Semester 1 2005 Database Week 7 / 31
         A Primary Key - What’s that ?

• McFadden, Hoffer and Prescott define a Primary Key as :

  An attribute (or combination of attributes) which uniquely
  identifies each row in a relation. (table)

• Richard T. Watson has this to say:

  The primary key definition block specifies a set of column
  values comprising the primary key. Once a Primary Key is
  defined, the system enforces its uniqueness by checking
  that the Primary Key of any new row does not already exist
  in the table.
                                       CSE1720 Semester 1 2005 Database Week 7 / 32
              And - A Foreign Key ??

• McFadden, Hoffer and Prescott’s definition:
An attribute (or attributes) in a relation (table) of a database
  which serves as the Primary Key of another relation (table)
  in the same database.

• Richard T. Watson says:
An attribute (or attributes) that is a Primary Key in the same
  table, or another table. It is the method of recording relations
  in a relational database.

And, both the Primary and Foreign Key(s) should be drawn
  from the same Domain.
                                       CSE1720 Semester 1 2005 Database Week 7 / 33
                     2 Relations

EMPNUM       NAME      Date of Birth       DEPTNUM
   3        JONES      16-05-1956            605
   7        SMITH      23-09-1965            432
  11        ADAMS      11-08-1972            201
  15        NGUYEN     23-10-1964            314
  18        PHAN       16-11-1976            201

Relation (Table) Name : EMP
Relation Schema: EMP(empnum,name,date of birth,deptnum)

    DEPTNUM       DEPTNAME
        201       Production
        314       Finance
        432       Information Systems
        605       Administration
Relation (Table) Name : DEPT
Relation Schema: DEPT(deptnum, deptname)
                                       CSE1720 Semester 1 2005 Database Week 7 / 34
                   Relational DataBase
Data is represented in ROW and ATTRIBUTE form (matrix)
                     (tuple)  (column, field)

Collections of related data ---> TABLES (relations)

1 or more tables          ----> DATA BASE

ATTRIBUTES are generally static
ROWS are DYNAMIC and Time-Varying

The number of Attributes = DEGREE of a table
The number of Rows       = CARDINALITY of a table
                                      CSE1720 Semester 1 2005 Database Week 7 / 35
             Some RDB Considerations
• Data is held in tables
• No order of data in tables - row or attribute
• Concept of Foreign Key - Primary Key relationship
• Data Typing - including nulls
• Query Access - insert, update, delete, retrieval
• Indexing on candidate (and Primary) keys
• Integrity Constraints
            Attribute value ranges
            Referential Integrity
            Entity Integrity
            User Defined Integrity
• Set retention constraints
                                     CSE1720 Semester 1 2005 Database Week 7 / 36
             Some RDB Considerations
• Domain constraints
• User defined ‘Rules’ e.g. no examination result can be
  negative or > 100
• Recovery procedures. Used to restore a database after a
  failure
• No explicit linkages between tables
• Linking or embedding database operations in a procedural
  language (Cobol, C ..)
• Databases may be distributed across similar or different
  DBMS’s
• Security features


                                    CSE1720 Semester 1 2005 Database Week 7 / 37
                DBMS Components


                         users



                        Database
                        Management
                        System



            Program         Application                            Utility
Retrieval   Language                           Database
Update                      Programs                               Programs
            Interface




                                          CSE1720 Semester 1 2005 Database Week 7 / 38
         A Practical Development Scenario
In 198n , a Company decided to develop and introduce a
   Payroll system using database technology. It looked this
   this :



                      Payroll System



                        Payroll
                        Data




                                       CSE1720 Semester 1 2005 Database Week 7 / 39
             A Practical Development Scenario

The Company grew in numbers and expertise and decided
to introduce a Personnel System. The ‘new’ design was this



     Payroll System
     Payroll System                Personnel System




        Payroll                       Personnel
        Data                            Data




                                    CSE1720 Semester 1 2005 Database Week 7 / 40
         A Practical Development Scenario
In the next few years, these components were added
   Payroll System
   Payroll System                Personnel System




      Payroll                       Personnel
      Data                            Data




      Job History               Labor Distribution




      Employee                          Labor
     Tracing Data                    Analysis Data

                                 CSE1720 Semester 1 2005 Database Week 7 / 41
          Data Description Language

Used to describe data at the Database level
   Structure Attributes
 Schema : Complete description of the database using DDL
 SubSchema : Describes data in the database as it is
   ‘known’ to individual programs(processes) or users

The segment of logical data record(s) required
  commonly known as a VIEW




                                  CSE1720 Semester 1 2005 Database Week 7 / 42
            Data Manipulation Language
Language (commands and syntax) used to cause transfers of
  data from the Database and the Operating Environment and
  vice versa
Variety of Languages - Access, DB2, dBASEV, SQLServer,
                       VisualDataBase, DataBoss, Ingres,
                        Oracle, Informix, MySQL ....
Windows versions provide Icons and Menu options which are
  translated by the DBMS software to Database manipulation
  commands

   Typical commands: get, put, replace, seek, update,delete,
                     insert, drop, find, modify

                                     CSE1720 Semester 1 2005 Database Week 7 / 43
               Application Systems



users(ATM’s)
                      C programs
                                            D
users                                       B
                                            M
                                            S
users
                    Cobol


users
                     Access


                              CSE1720 Semester 1 2005 Database Week 7 / 44
              Advantages of DataBase
• Reduced Data Redundancy

• Data Integrity

• Data Independence

• Data Security

• Data Consistency

• Easier use of Data via DBMS Tools (Query languages,
  4GL's)

                                   CSE1720 Semester 1 2005 Database Week 7 / 45
             Disadvantages of Database
• Complexity

• Expense

• Vulnerability

• Size of - disk storage, processor memory

• Training Costs

• Compatibility

• Technology Lock In


                                     CSE1720 Semester 1 2005 Database Week 7 / 46
CSE1720 Semester 1 2005 Database Week 7 / 47
And Something Different




               CSE1720 Semester 1 2005 Database Week 7 / 48
                        Data Types
• Used to augment an attribute description and to provide a
  means of Integrity

 ’Normal’ data types are :
     Character (or text)
     Numeric - Integer, Decimal, Money, Float
           (in Access Long and Short Integer),
               Decimal with options of a number of ‘decimal’
                 places
     Date - Standard date format - various forms
     Logical - Yes/No True/False Exists/Does Not Exist


                                      CSE1720 Semester 1 2005 Database Week 7 / 49
                        Attribute Size
In many cases this is set by the Data Definition facilities

  e.g.. Date, Short Integer, Long Integer, Logical, Number

Others are set by the Designer:
     Number of Characters, Size of a ‘Decimal Number’

Access allows for    a Default value
                     Duplicates/ No Duplicates allowed
                     Primary Key nomination
                     Indexing
                     ‘Required’ Status of an Attribute
                                        CSE1720 Semester 1 2005 Database Week 7 / 50
                         Integrity Examples
Primary Key - Must have a value - not null
              - Must not be duplicated (unique)
Attribute values must exist: (or not)

student record:
student identity number - must exist, not duplicated, must be a
                           ‘valid’ number
student name             - must exist, may be duplicated
student course code     - must exist, must be a valid course
subject code            - if enrolled, must exist, code must be
                           a valid code
subject result         - dependent on time. May be null. If it
                          exists must be a valid grade and
                          mark          CSE1720 Semester 1 2005 Database Week 7 / 51
                   Integrity Examples
Questions:

1. Do the same ‘constraints’ exist in Excel (remember
   assignment 2 ?), Word ?

2. Should they ?

3. What are alternatives for embedding the same or similar
   controls

4. Are spreadsheets less ‘reliable’ than databases ?


                                      CSE1720 Semester 1 2005 Database Week 7 / 52
                           Queries
A Query acts on the base table or tables of a database and
  returns a subset of this data.

A Query normally returns a ‘restricted set’ of attributes (and
  their current values’ - this is the ‘User View’ of the database

A Query normally has some criteria
   e.g. salaries > $50,000
        outstandings > 30 days
        date (of some event =, > or < some designated date)

Criteria can be linked : event A OR event B,
                         event A AND event B
                                       CSE1720 Semester 1 2005 Database Week 7 / 53
                          Reports
A more formal output of data from base tables and in many
  cases produces high volume.

The design is formatted ;
Page numbering, Headings, Footers,
Page breaks, Page or Item totals and sub-totals

Reports can be ‘criteria based’ and include calculations
  (derived data)
Report content can be imported from another source, or
  exported to another target.
Reports can contain Exception Full Graphic Hypertext
  information and may be hard copy, electronic copy,
                                     CSE1720 Semester 1 2005 Database Week 7 / 54
                    A Puzzle




How can these dots be joined by 4 straight lines without
lifting a pencil (or pen) from the surface ?
                                CSE1720 Semester 1 2005 Database Week 7 / 55
    A Puzzle


        1

                  2
    3
                             No, that’s 5 lines
4


        5




               CSE1720 Semester 1 2005 Database Week 7 / 56
            A Puzzle


        1


5   4          2                  That’s worse - it’s
                                  6 lines
    6


              3




                       CSE1720 Semester 1 2005 Database Week 7 / 57
    A Puzzle


1



                            How about this ?




               CSE1720 Semester 1 2005 Database Week 7 / 58

				
DOCUMENT INFO
Shared By:
Categories:
Tags:
Stats:
views:0
posted:3/21/2013
language:Latin
pages:58