Docstoc

Transforming OLTP Data to OLAP Data Warehouses

Document Sample
Transforming OLTP Data to OLAP Data Warehouses Powered By Docstoc
					Transforming OLTP Data to OLAP
Data Warehouses
The transformation of OLTP data so that it gives acceptable performance in an OLAP
system requires these processes:

Merge Data

You must be able to merge all the data related to specific items (products, customers,
employees) from multiple OLTP systems into a single OLAP system. The merge
process must resolve differences in encoding between the different OLTP systems.
For example, one system may assign an ID to each employee, and the other systems
have no employee IDs. The merge process must be able to match common employee
data from both systems, perhaps by comparing employee names and addresses. The
merge process must also be able to convert data stored using different data types in
each OLTP system to a single data type used in the OLAP system. You must also
select which columns in the OLTP system are not relevant to an OLAP system, and
exclude these columns from the merge process.

The systems providing input data for an OLAP system are not strictly limited to
traditional, centrally located OLTP systems. Valuable information may be stored in
various legacy locations, even in some cases including relatively small sources such
as Microsoft® Excel spreadsheets stored on a file share.

Scrub Data

Merging the OLTP data into a data warehouse gives you an opportunity to scrub data.
You may find that various OLTP systems spell items differently, or the merge process
may uncover previously unknown spelling errors. You may find other inconsistencies,
such as having different addresses for the same store, employee, or customer. These
inconsistencies have to be addressed before the data can be loaded into the data
warehouse for use by the OLAP system.

Aggregate Data

OLTP data records all transaction details. OLAP queries typically need summary data,
or data aggregated in some fashion. For example, a query to retrieve the monthly sales
totals for each product over the last year runs much faster if the database only has
summary rows showing the daily or hourly sales for each prod uct, than if the query
must scan every transaction detail record for the last year.

The degree to which you aggregate the data in a data warehouse depends on a number
of design factors, such as the speed requirements of your OLAP queries and the level
of granularity required for your analysis. For example, if you aggregate sales details
into daily summaries instead of hourly summaries, your OLAP queries would run
faster, but you could only do this if you had no need to analyze sales on an hourly
basis.
Organize Data in Cubes

Relational OLTP data is organized in a way that makes some analysis processing
difficult and time-consuming. When OLTP data is moved into a data warehouse, it
must be transformed into an organization that better supports decision support
analysis. The process of building a data warehouse involves reorganizing OLTP data
stored in relational tables into OLAP data stored in multidimensional cubes.

Transformati on Stages and Data Warehousing Components

The process of making data available through OLAP applications typically goes
through three phases:

   1. Extract the data from OLTP or legacy data sources into a staging area.

   2. Transform the data into a form usable in an OLAP system. This involves
      actions such as data scrubbing and aggregation.

   3. Load the data into a data warehouse or data mart.

The process of extracting the data from the OLTP and legacy data sources and
transforming it into the warehouse servers is called the ETL process, and is typically
run on a periodic basis, such as once a week or once a month.

Once the data is loaded into a data warehouse, an important part of an OLAP system
is to provide facilities for decision makers to access and analyze the data in the data
warehouses and data marts.

The illustration shows the general categories of components that OLAP systems use
to provide these services.
Data Sources

The OLTP databases and other legacy sources of data that contain the data that must
be transformed into the OLAP data in data warehouses and data marts.

Inte rmediate Data Stores

The combined data storage areas and processes that stage, cleanse, and transform the
OLTP data into useful OLAP data.

Warehouse Servers

Warehouse servers are the computers running the relational databases that contain the
data for data warehouses and data marts, and the servers that manage the OLAP data.

Business Intelligence

The sets of tools and applications that query the OLAP data and provide reports and
information to the enterprise decision makers.

Meta Data

Models the organization of data and applications in the different OLAP components.
Meta data describes objects such as tables in OLTP databases, cubes in data
warehouses and data marts, and also records which applications reference the various
pieces of data.



HOME

				
DOCUMENT INFO
Shared By:
Categories:
Stats:
views:48
posted:5/27/2011
language:English
pages:3