The Data Warehouse

Document Sample
The Data Warehouse Powered By Docstoc
					        Chapter 11
The Data Warehouse

Database Systems:
Design, Implementation, and
Management, Seventh Edition, Rob
and Coronel                        1
In this chapter, you will learn:
 How operational data and decision support
 data differ
 What a data warehouse is, how data for it
 are prepared, and how it is implemented
 What star schemas are and how they are
 What data mining is and what role it plays
 in decision support
                                         2   2
In this chapter, you will learn
 What online analytical processing (OLAP)
 How SQL extensions are used to support
 OLAP-type data manipulations

                                        3   3
The Need for Data Analysis
 Managers must be able to track daily
 transactions to evaluate how the business
 is performing
 By tapping into operational database,
 management can develop strategies to
 meet organizational goals
 Data analysis can provide information
 about short-term tactical evaluations and
 strategies                                4   4
The Need for Data Analysis

                             5   5
The Need for Data Analysis

                             6   6
Decision Support Systems
  Decision support is methodology (or series of
  methodologies) designed to extract
  information from data and to use such
  information as a basis for decision making
  Decision support system (DSS)
    Arrangement of computerized tools used to assist
    managerial decision making within business
    Usually requires extensive data “massaging” to
    produce information
    Used at all levels within organization
    Often tailored to focus on specific business areas
    Provides ad hoc query tools to retrieve data and to
    display data in different formats                      7
Decision Support Systems
 Composed of following four main components:
   Data store component
     Basically a DSS database
   Data extraction and data filtering component
     Used to extract and validate data taken from operational
     database and external data sources
   End-user query tool
     Used to create queries that access database
   End-user presentation tool
     Used to organize and present data

                                                                8   8
Decision Support Systems

                           9   9
Operational Data vs.
Decision Support Data
 Operational Data
   Mostly stored in relational database
   Optimized to support transactions representing daily
 DSS Data
   Give tactical and strategic business meaning to
   operational data
   Differs from operational data in following three main
      Dimensionality                                  10 10
Operational Data vs.
Decision Support Data

                        11 11
Operational Data vs. Decision
Support Data (continued)

                                12 12
DSS Database Requirements
 A specialized DBMS tailored to provide
 fast answers to complex queries.
 Four main requirements:
   Database schema
   Data extraction and loading
   End-user analytical interface
   Database size

                                          13 13
DSS Database Requirements
 Database schema
  Must support complex data representations
  Must contain aggregated and summarized
  Queries must be able to extract
  multidimensional time slices

                                          14 14
DSS Database Requirements

                            15 15
DSS Database Requirements

                            16 16
DSS Database Requirements
 Data extraction
   Should allow batch and scheduled data
   Should support different data sources
     Flat files
     Hierarchical, network, and relational databases
     Multiple vendors
 Data filtering
   Must allow checking for inconsistent data
                                                       17 17
DSS Database Requirements
 End-user analytical interface
   One of most critical DSS DBMS components
   Permits user to navigate through data to
   simplify and accelerate decision-making

                                         18 18
DSS Database Requirements
 Database size
   In 2005, Wal-Mart had 260 terabytes of data
   in its data warehouses
   DBMS must support very large databases

                                             19 19
The Data Warehouse
 Integrated, subject-oriented, time-variant,
 nonvolatile collection of data that provides
 support for decision making
 Usually a read-only database optimized for
 data analysis and query processing
 Requires time, money, and considerable
 managerial effort to create

                                          20 20
The Data Warehouse

                     21 21
The Data Warehouse

                     22 22
The Data Warehouse
 Data mart
   Small, single-subject data warehouse subset
   Each is more manageable data set than data
   Provides decision support to small group of
   Typically lower cost and lower implementation
   time than data warehouse

                                             23 23
DSS Architectural Styles
 Several are available
 Provide advanced decision support
 Some capable of providing access to
 multidimensional data analysis

                                       24 24
DSS Architectural Styles

                           25 25
Twelve Rules that Define
a Data Warehouse
 Data warehouse and operational environments
 are separated
 Data warehouse data are integrated
 Data warehouse contains historical data over
 long time horizon
 Data warehouse data are snapshot data
 captured at given point in time
 Data warehouse data are subject oriented

                                            26 26
Twelve Rules that Define
a Data Warehouse (continued)
 Data warehouse data are mainly read-only with periodic
 batch updates from operational data
   No online updates allowed
 Data warehouse development life cycle differs from
 classical systems development
 Data warehouse contains data with several levels of
 detail: current detail data, old detail data, lightly
 summarized data, and highly summarized data
 Data warehouse environment is characterized by read-
 only transactions to very large data sets

                                                    27 27
Twelve Rules that Define
a Data Warehouse (continued)
 Data warehouse environment has system that
 traces data sources, transformations, and
 Data warehouse’s metadata are critical
 component of this environment
 Data warehouse contains chargeback
 mechanism for resource usage that enforces
 optimal use of data by end users

                                          28 28
Online Analytical Processing
 Advanced data analysis environment that
 supports decision making, business
 modeling, and operations research
 OLAP systems share four main
   Use multidimensional data analysis techniques
   Provide advanced database support
   Provide easy-to-use end-user interfaces
   Support client/server architecture
                                               29 29
Multidimensional Data Analysis
 Data are processed and viewed as part of
 a multidimensional structure
 Particularly attractive to business decision

                                           30 30
Multidimensional Data Analysis
Techniques (continued)
 Augmented by following functions:
   Advanced data presentation functions
   Advanced data aggregation, consolidation
   and classification functions
   Advanced computational functions
   Advanced data modeling functions

                                              31 31

Shared By: