A Strategy for Success
(or...Whose Data Is It, Anyway?)
“We have had tons of data for years. I have more data (as
manager of the Americas sales force) than my counterparts in
any company have, but very little information has been
available for me to use for decision making.”
John Williams, Corporate Vice President, Americas
Storage Technology Corporation
90% of all information processing organizations will be
pursuing a data warehouse strategy in the next three years.
A Data Warehouse is...
A repository of company data collected for on-line access and
maintained separate from the company’s transaction-
processing operational systems.
A repository of strategic business data, usually in a relational
database management system, to which easy access is
available through point-and-click tools.
Data Warehousing Wrestles with
Moving various forms of data from legacy and on-line
Data Warehousing Requires
Preparing, conditioning and staging data so business users
can perform analyses that previously were either impossible,
or too expensive and time-consuming.
Data Warehouses Use Parallel-
Scaleable Systems for
Scalability - incrementally add processors and disk drives to
High availability - on-line backup and recovery, component
redundancy and failover
Parallel-scaleable RDBMSs - large databases with underlying
parallelization of queries and query traffic
Systems and network management - performance monitoring,
disk array and storage management tools, robust DBMS tools
and network configuration utilities
Six Keys to Successful Data
Scaleable, open architecture
Data access tools
Individual pools of data
Marts should adhere to common data definitions, structures
and access routines so that end users see the same face
wherever they look.
The process of getting operational data in shape for
warehousing in different marts should be centralized and done
with a consistent set of extraction and transformation tools.
Perform mapping, extracting and transforming of data; code
generation; creation and management of meta data
– Provide information about information!
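As a rough illustration of the points above, the following minimal sketch maps two operational feeds onto one common set of field definitions and keeps a small meta data catalog alongside the data. The feed layouts, field names and sample rows are all hypothetical and are not taken from any tool named in this deck.

```python
from dataclasses import dataclass
from datetime import date

# Hypothetical operational rows from two source systems with different conventions.
ORDERS_LEGACY = [{"CUST": " acme ", "AMT": "1,200.50", "DT": "19970315"}]
ORDERS_ONLINE = [{"customer": "Acme", "amount": 300.0, "date": "1997-03-16"}]


@dataclass
class FieldMeta:
    """One meta data entry: information about a warehouse field."""
    name: str
    dtype: str
    description: str


# Common data definitions shared by every mart (assumed for illustration).
WAREHOUSE_SCHEMA = [
    FieldMeta("customer", "str", "Customer name, trimmed and upper-cased"),
    FieldMeta("amount", "float", "Order amount in dollars"),
    FieldMeta("order_date", "date", "Calendar date of the order"),
]


def transform_legacy(row):
    """Map a legacy-format row onto the common definitions."""
    d = row["DT"]  # YYYYMMDD string
    return {
        "customer": row["CUST"].strip().upper(),
        "amount": float(row["AMT"].replace(",", "")),
        "order_date": date(int(d[:4]), int(d[4:6]), int(d[6:8])),
    }


def transform_online(row):
    """Map an on-line-format row onto the common definitions."""
    return {
        "customer": row["customer"].strip().upper(),
        "amount": float(row["amount"]),
        "order_date": date.fromisoformat(row["date"]),
    }


def load_warehouse():
    """Extract from both feeds and transform with one consistent set of routines."""
    rows = [transform_legacy(r) for r in ORDERS_LEGACY]
    rows += [transform_online(r) for r in ORDERS_ONLINE]
    return rows


warehouse_rows = load_warehouse()
```

Because both feeds pass through the same definitions, a user querying any mart sees "ACME" spelled and typed the same way regardless of which system the order came from.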
“Data ‘Wearhouse’ Gains”
– Considered 25 apparel items from 1000-item inventory
– Allocates merchandise to 678 stores based upon a mathematical
– Average store sells an equal number of black and ivory bras.
– Miami-area buyers buy ivory color by a margin of 10-to-1.
– Demand for a specific size in the New York shops outsold other
sizes by 20-to-1.
– A pricing discount policy applied across the board was not as
profitable, because demand patterns showed that full pricing
was acceptable, yielding sales boosts of $5M.
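The color findings above suggest allocating stock by observed local demand rather than by a chain-wide average. The sketch below is one simple way to do that, a proportional split with largest-remainder rounding. It is not the retailer's actual allocation method, which this deck does not describe, and the 110-unit shipment is a hypothetical figure; only the 1-to-1 and 10-to-1 ratios come from the bullets above.

```python
def allocate(units, demand_ratio):
    """Split `units` across items in proportion to the weights in `demand_ratio`.

    Whole units are assigned first; any units left over after rounding down
    go to the items with the largest fractional remainders.
    """
    total = sum(demand_ratio.values())
    shares = {item: units * weight / total for item, weight in demand_ratio.items()}
    alloc = {item: int(share) for item, share in shares.items()}
    leftover = units - sum(alloc.values())
    by_remainder = sorted(shares, key=lambda item: shares[item] - alloc[item], reverse=True)
    for item in by_remainder[:leftover]:
        alloc[item] += 1
    return alloc


# Hypothetical shipment of 110 units per store.
average_store = allocate(110, {"black": 1, "ivory": 1})   # equal demand
miami_store = allocate(110, {"black": 1, "ivory": 10})    # ivory favored 10-to-1

print(average_store)
print(miami_store)
```

Under the chain-wide average, a Miami store would receive 55 black units while local demand supports only about 10, which is the kind of mismatch the warehouse data exposed.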
NBA’s Sixth Man
New York Knicks
– scored in 54% of their possessions against the Charlotte Hornets
– Adjusted lineup with Larry Johnson inside.
– Without that lineup they scored in 43.5% of their possessions
against the Hornets.
– IBM’s Advanced Scout data mining package.
– Software sifts through data downloaded from the NBA electronic
NBA’s Sixth Man
– Alter the study of game films.
– Tom Sterner, assistant coach, analyzes the Advanced Scout data
mining software and associated statistics.
– Saves about 2 hours per review.
Data Mining Tools
Mine Your Own Business (MYOB).
DataMind Software, Inc.
TurnKey Data Mart 1.2 (Broadbase Information Systems)
Queries that are created by pointing and clicking.
Displays results in a Word or Excel file.
Allows organizations to confirm or refute theories about the
Detects unsuspected trends.
Has a “Why?” button that explains its conclusions.
Most data in one database: Sears, Roebuck and Co., 4.63 T
Most rows of data in one database: Wal-Mart Stores, Inc., 50
Biggest workload: JC Penney Co., 784 concurrent
Most data in a group of databases: The Dialog Corp., 6.3 T
Large data warehouse architecture: Xerox Corp., multitier
architecture with 45 data warehouses and marts, including one
data mart that supports profit-and-loss decisions by business