PPT Slide Overnight Position

Document Sample
PPT Slide  Overnight Position Powered By Docstoc
					ICS321 / IBM205 Management
    Information Systems
         Dr. Ken Cosh
   Databases: Foundations of
     Business Intelligence
• Last week we looked at;
  – Hardware
    • Input, Output devices, Processors, Memory
  – Types of computer
    • Mainframe, Midrange, PC‟s
  – The emergence of Client Server relationships
    • Thin Clients, Fat Clients, P2P
  – Networks
    • Centralised, Decentralised, Distributed
         This Weeks Topics
• The „Data‟ Resource
  – Organising Data
  – Databases
 The Importance of Good Data
• Garbage In, Garbage Out;
  – Bad data will result in bad information (hence bad
• An effective information system depends not
  only on having good data, but on storing,
  organising and accessing the data.
• Challenges to this include;
  – Organisational Obstacles
  – Data Integration and Quality
      Organisational Obstacles
• Implementing new data models requires re-
  examining the role of data within an
   – Who has access to what data, and when?
   – Changing the allocation (or sharing) of data can
     impact on current power relationships, and so is often
     met by political resistance.
• Traditionally data is stored in file format, with
  each department having a selection of files.
  Modern approaches introduce databases and
  DBMS that can be shared across multiple
  Data Integration and Quality
• When moving data from a file format to a
  database (or sharing data from one source
  across multiple systems), converting data
  formats can be costly, and risks losing
  important information.
        Traditional File Format
• Bit (binary digit) – smallest unit a computer handles.
• Byte – Group of bits which can represent a character
• Field – Group of bytes which can represent a word
• Record – Group of related fields (such as name, course,
  grade, date)
• File – Group of records of the same type (such as a
  group of students records)
• Database – Group of related files (such as grouping the
  student‟s personal history, financial records)
               More terms
• Entity – a person, place, thing or event for
  which data is stored about.
• Attribute – Any characteristic or quality
  describing an entity, for example „age‟.
• Key Field – Every record in a file, should
  contain at least one key field, which
  uniquely identifies instances of that record,
  so that they can be retrieved, updated or
  sorted, for example „order number‟.
           Key Terms
Database     Course File History File Finance File

              NAME             COUR     DATE    GRADE
  File        John Stewart     ICS321   2005    C+
              Sarah Smith      ICS321   2005    B
              Ken Cosh         ICS321   2005    A+

              NAME             COUR DATE        GRADE
Record        John Stewart     ICS321 2005      C+

 Field        John Stewart (NAME Field)

  Byte        01001010 (Letter J in ASCII)

  Bit         0
     So What‟s the Problem?
• Remember Systems within systems
  (subsystems), interfacing systems and
  adaptive systems?
  – Each system tends to grow and adapt
  – Functional units develop systems isolated
    from other units.
  – Each functional unit develops many
    databases; personnel has personnel, payroll,
    medical insurance, pensions, mailing file….
• Data Redundancy and Confusion
  – Duplicate Data in multiple data files.
  – The same data can have different names,
    different meanings, different related data in
    different places.
  – The same name might be used for different
    data in different places.
  – Database confusion makes implementing a
    SCM, CRM or Enterprise wide system difficult.
                   Problems 2
• Program-Data Dependence
   – There is a tight relationship between the data in files
     and the programs using them.
   – Any changes to the data, results in necessary
     changes to the programs that use the data.
   – Maintaining data becomes costly.
• Lack of Flexibility
   – Scheduled reports can easily be generated from the
   – Ad Hoc reports however are costly to generate.
     While the information is somewhere in the system
     getting it out is tricky.
                   Problems 3
• Poor Security
  – Or poor control.
  – There is now a lot of data in a lot of databases
    throughout the organisation. It is difficult to control or
    manage the data – who is accessing what data?
• Lack of Data Sharing & Availability
  – With poor control over data, its difficult to share data
    between functions.
  – Accounts might benefit from some data that
    manufacturing has, etc.
• “Group of related files (such as grouping
  the student’s personal history, financial
• Ideally a database is a group of related
  data which can be used to serve many
  different applications, by centralising data
  and minimising redundancy.
      Data Management
   Address                Personnel
     ID                  Applications
 Hours Worked
   Pay Rate                Payroll
     Tax                 Applications
  Gross Pay

 Life Insurance
Golf Membership           Benefits
   Healthcare            Applications
• The DBMS sits between the actual data and the
  applications which use the data.
• This saves the user from needing to understand
  the actual physical way the data is stored,
  instead presenting a logical view of it.
• The user doesn‟t need to know the data
  definition language, but instead could use a data
  manipulation language such as SQL.
• In reality often the manipulation language is
  hidden within an application.
                                 Creating & Changing the
                                 logical structure of a
             Data Definition

                                 Querying & making
           Data Manipulation     changes to the information

                                  Menus, data entry
               Application        screens, reports and
               Generation         application software

                                  Who can see what
           Data Administration    information; methods
                                  for backup and
         Hierarchical Database
ROOT                                 Employee

CHILD          Compensation     Job Assignment             Benefits

               Salary History

SECOND                     Pension
                                          Life Insurance          Health
CHILD                      History
          Hierarchical Data
• Suppose from the previous data structure,
  we wanted to access the salary history for
  all people with the job title “Assistant”,
  accessing that data would not be easy.
• While certain scheduled reports can be
  generated, ad hoc reports are not as
       Relational Databases
• Data is organised into tables, which could
  be visualised as a spreadsheet. In each
  table data is organised into rows / records
  (or tuples).
• Any piece of data from any table can be
  linked to any piece of data in another
  table, so long as they have a common
  data element (field).
        Designing Databases
• Designing Relational Databases normally
  begins from building an Entity Relationship
  Diagram (E-R Diagram).
  – An Entity is a person, place, thing, or event for
    which data is collected and maintained.
  – Deciding what data should be stored about
    each entity
  – Defining the relationships between entities.
     • A customer can have many order numbers.
     • A doctor treats many patients.
               E-R Diagrams

           1                  M
Customer            Has           Order No.

           1                  M
 Doctor            Treats          Patients

           1                  1
Manager            Leads          Department
• Process of creating small stable data structures
  from complex groups of data.
• Relationships between entities can be „one to
  one‟, „one to many‟ or „many to many‟.
• An E-R diagram can become very complex with
  many data elements to be stored for many
  entities with complex relationships.
• Normalisation attempts to break entities down
  into smaller entities, and tries to remove
  complicated „many to many‟ relationships.
         Designing Databases
• Another key factor in designing databases is
  designing how they will be distributed
  – Will there be one central database server with clients
    accessing the data?
     • Single point of Risk
     • Powerful and expensive server required.
  – Will there be distributed data base servers, in multiple
    physical locations?
  – If distributed how and when will the data be updated?
    (Batch process overnight?)
  – What happens if 2 replicated databases have
    conflicting updates?
• Structured Query Language
  – The commonest data manipulation language for
    relational databases.
  – Used to query (get information from) databases, and
    also to put information into databases.
• SELECT Part_Number, Supplier_Code FROM
  Part WHERE Unit_Price < 25.00;
  – This would return the all the part numbers and
    supplier codes from the database called „part‟ which
    cost less than 25.00.
• Suppose we want to know;
  – “By actual vs budgeted, how many size 8
    shoes in black did we sell last month in the
    southeast and southwest regions, compared
    to the same month over the past 5 years.”
• A complex request…
  – If we can build the query, it could effect the
    performance of the database system!
     • Especially if it is a live operating system.
         Trends in Databases
• Multidimensional Visualisations
  – Often managers want to see relationships
    between large amounts of data.
     • Sales of 4 different projects over a period of
       months versus projected sales.
  – Can implement a multidimensional
  – Or a multidimensional visualisation tool.
  – When further dimensions need to be added,
    virtual reality or embedded data can be used.
         Trends in Databases
• Data warehouse
  – A database of current and historical data which can
    be accessed by anyone within the organisation.
  – Backups are made of all the data from all applications
    and stored in the warehouse.
  – Data in the warehouse can not be changed though, it
    is simply a record of company history, which can be
    used to find trends etc.
• A data mart
  – A subsection of a data warehouse specifically
    targeted towards a certain group of people.
• Data mining
  – The process of extracting patterns and or rules from a
    data warehouse. Often used by marketing
    departments to profile customers.
          Object-Oriented DB
• Hierarchical and Relational databases assume
  that data is in character or numerical form.
• How about databases that store data which can‟t
  easily be represented in files and tables (such
  as graphics, sounds, java applets or any other
• O-O databases are designed to deal with these
  diverse data types, however they tend to be a lot
  slower than relational databases.

Shared By:
Description: PPT Slide Overnight Position