Data Warehouses

A Strategy for Success

(or...Whose Data Is It, Anyway?)

   The Problem

“We have had tons of data for years. I have more data (as
  manager of the Americas sales force) than my counterparts in
  any company have, but very little information has been
  available for me to use for decision making.”

   John Williams, Corporate Vice President, Americas
   Storage Technology Corporation

     The Outlook

   90% of all information processing organizations will be
    pursuing a data warehouse strategy in the next three years.

     A Data Warehouse is...

   A repository of company data collected for on-line access and
    maintained separately from the company’s transaction-processing
    operational systems.
   A repository of strategic business data, usually in a relational
    database management system, to which easy access is
    available through point-and-click tools.
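As a rough illustration of keeping the warehouse separate from the operational system, the sketch below copies rows from an operational table into a separate warehouse table and runs a summary query against the copy. It uses Python's built-in sqlite3 module with in-memory databases. The table names, columns, and sample rows are hypothetical and are not taken from any system mentioned in these slides.

```python
import sqlite3

# Hypothetical operational (transaction-processing) database.
operational = sqlite3.connect(":memory:")
operational.execute("CREATE TABLE orders (order_id INTEGER, region TEXT, amount REAL)")
operational.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "Americas", 120.0), (2, "Europe", 80.0), (3, "Americas", 200.0)],
)

# Separate warehouse database: analysts query this copy, not the live system.
warehouse = sqlite3.connect(":memory:")
warehouse.execute("CREATE TABLE sales_history (order_id INTEGER, region TEXT, amount REAL)")

# Periodic load: copy the operational rows into the warehouse.
rows = operational.execute("SELECT order_id, region, amount FROM orders").fetchall()
warehouse.executemany("INSERT INTO sales_history VALUES (?, ?, ?)", rows)
warehouse.commit()

# A decision-support query that runs without touching the operational system.
totals = dict(
    warehouse.execute(
        "SELECT region, SUM(amount) FROM sales_history GROUP BY region"
    ).fetchall()
)
print(totals)
```

A production warehouse would normally sit on a full RDBMS and load on a schedule; the two in-memory connections here only stand in for that separation.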

     Data Warehousing Wrestles with

   Moving various forms of data from legacy and on-line
    teleprocessing systems.

     Data Warehousing Requires

   Preparing, conditioning and staging data so business users
    can perform analyses that previously were either impossible,
    or too expensive and time-consuming.

     Data Warehouses Use Parallel-Scaleable Systems for
   Scalability - incrementally add processors and disk drives to
    meet demand
   High availability - on-line backup and recovery, component
    redundancy and failover
   Parallel-scaleable RDBMSs - large databases with underlying
    parallelization of queries and query traffic
   Systems and network management - performance monitoring,
    disk array and storage management tools, robust DBMS tools
    and network configuration utilities

     Six Keys to Successful Data Warehousing
   Data modeling
   Warehouse management
   Scaleable RDBMS
   Scaleable, open architecture
   Data access tools
   Consulting services

     Data Marts

   Individual pools of data
   Marts should adhere to common data definitions, structures
    and access routines so that end users see the same face
    wherever they look.
   The process of getting operational data in shape for
    warehousing in different marts should be centralized and done
    with a consistent set of extraction and transformation tools.

     Warehouse Management

   Perform mapping, extracting and transforming of data; code
    generation; creation and management of meta data
    –   Provide information about information!
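The mapping, transforming, and meta data steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a description of any particular warehouse management product: the field names in FIELD_MAP, the sample records, and the transform and load helpers are all hypothetical.

```python
from datetime import date

# Hypothetical mapping from source field names to warehouse column names.
FIELD_MAP = {"CUST_NO": "customer_id", "SALE_AMT": "sale_amount", "SALE_DT": "sale_date"}

def transform(record):
    """Rename fields per FIELD_MAP and convert values to consistent types."""
    row = {FIELD_MAP[key]: value for key, value in record.items() if key in FIELD_MAP}
    row["customer_id"] = int(row["customer_id"])
    row["sale_amount"] = float(row["sale_amount"])
    row["sale_date"] = date.fromisoformat(row["sale_date"])
    return row

def load(source_name, records):
    """Transform every record and return the rows plus meta data describing the load."""
    rows = [transform(r) for r in records]
    metadata = {
        "source": source_name,
        "columns": sorted(FIELD_MAP.values()),
        "row_count": len(rows),
    }
    return rows, metadata

# Hypothetical extract from a legacy system, with every value stored as text.
legacy_extract = [
    {"CUST_NO": "0042", "SALE_AMT": "19.99", "SALE_DT": "1998-03-01"},
    {"CUST_NO": "0007", "SALE_AMT": "5.00", "SALE_DT": "1998-03-02"},
]
rows, metadata = load("legacy_sales", legacy_extract)
print(metadata)
```

The metadata dictionary is the "information about information": it records where the rows came from, which columns they carry, and how many were loaded.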

     “Data ‘Wearhouse’ Gains”

   Victoria’s Secret
    – Considered 25 apparel items from a 1,000-item inventory.
    – Allocates merchandise to 678 stores based upon a mathematical
      “store average.”
    – The average store sells an equal number of black and ivory bras.
    – Miami-area buyers buy the ivory color by a margin of 10-to-1.
    – A specific size in the New York shops outsold other sizes by
      20-to-1.
    – A discount pricing policy applied across the board was not as
      profitable, because demand patterns showed that full pricing
      was acceptable, yielding sales boosts of $5M.
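To make the contrast between a “store average” and store-level demand concrete, here is a minimal Python sketch. The store names, sales figures, and both helper functions are hypothetical; they illustrate the idea only and do not reproduce the retailer's actual allocation method.

```python
def allocate_by_average(total_units, stores):
    """Give every store the same share, as a 'store average' policy would."""
    share = total_units / len(stores)
    return {store: share for store in stores}

def allocate_by_demand(total_units, past_sales):
    """Split units in proportion to each store's past sales of the item."""
    total_sold = sum(past_sales.values())
    return {store: total_units * sold / total_sold for store, sold in past_sales.items()}

# Hypothetical past sales of one color of one item at three stores.
past_sales = {"Miami": 100, "New York": 40, "Chicago": 60}

average_plan = allocate_by_average(600, list(past_sales))
demand_plan = allocate_by_demand(600, past_sales)
print(average_plan)
print(demand_plan)
```

With these made-up numbers the average plan sends 200 units to every store, while the demand plan sends 300 to Miami, 120 to New York, and 180 to Chicago.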

     NBA’s Sixth Man

   New York Knicks
    – Scored in 54% of their possessions against the Charlotte Hornets
      last year.
    – Adjusted lineup with Larry Johnson inside.
    – Without that lineup they scored in 43.5% of their possessions
      against the Hornets.
    – IBM’s Advanced Scout data mining package.
    – Software sifts through data downloaded from the NBA electronic
      bulletin board.
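The kind of comparison quoted above (share of possessions that end in a score, broken down by lineup) can be computed as in the following Python sketch. It is not how Advanced Scout works internally; the possession log, the lineup labels, and the scoring_rate_by_lineup function are hypothetical.

```python
from collections import defaultdict

def scoring_rate_by_lineup(possessions):
    """Return the share of possessions that ended in a score, per lineup."""
    scored = defaultdict(int)
    total = defaultdict(int)
    for lineup, did_score in possessions:
        total[lineup] += 1
        scored[lineup] += int(did_score)
    return {lineup: scored[lineup] / total[lineup] for lineup in total}

# Hypothetical possession log: (lineup label, whether the possession produced points).
log = [
    ("adjusted", True), ("adjusted", False), ("adjusted", True), ("adjusted", True),
    ("other", True), ("other", False), ("other", False), ("other", False),
]
rates = scoring_rate_by_lineup(log)
print(rates)
```

With this made-up log the adjusted lineup scores on 75% of its possessions and the other lineup on 25%.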

     NBA’s Sixth Man

   Orlando Magic
    – Alter the study of game films.
    – Tom Sterner, assistant coach, analyzes the Advanced Scout data
      mining software and associated statistics.
    – Saves about 2 hours per review.

     Data Mining Tools

   Mine Your Own Business (MYOB).
   DataMind Software, Inc.
   TurnKey Data Mart 1.2 (Broadbase Information Systems)
   Queries that are created by pointing and clicking.
   Displays results in a Word or Excel file.
   Allows organizations to confirm or refute theories about the
    data collected.
   Detects unsuspected trends.
   Has a “Why?” button that explains its conclusions.

     Warehousing “Bigfoots”

   Most data in one database: Sears, Roebuck and Co., 4.63 TB
   Most rows of data in one database: Wal-Mart Stores, Inc., 50
   Biggest workload: JC Penney Co., 784 concurrent
    query/maintenance operations
   Most data in a group of databases: The Dialog Corp., 6.3 TB
   Large data warehouse architecture: Xerox Corp., multitier
    architecture with 45 data warehouses and marts, including one
    data mart that supports profit-and-loss decisions by business
