ERP Data Warehouse
Architectures, Tools & Technologies

by
Wipro Technologies
January 2002
ERP Data Warehouse


Table of Contents
1 Executive Summary
2 Introduction
3 Technical Challenges Associated with ERP Data warehousing
4 Desired features of the ERP Data Warehouse
5 Architectural Choices
6 Tools & Technology Available
  6.1    Packaged Solution from ERP vendors
    6.1.1    SAP Business Information Warehouse
  6.2    Extraction Tools
    6.2.1    ActaWorks from Acta
    6.2.2    DataStage from Ascential
    6.2.3    PowerCenter from Informatica
7 Conclusion
8 Appendix A




Wipro Confidential                                                                                           Page 2 of 37


1 Executive Summary
ERP applications came into existence with the promise of an integrated application
environment that would address the uncontrolled growth of stovepipe IS applications and
serve the needs of the full enterprise.

After implementing expensive ERP packages, organizations as well as product vendors
realized that although these solutions streamlined operational processes and IS
applications, it was extremely difficult to serve the information needs of management. As
a result, organizations had to implement data warehouses for their decision support and
business intelligence needs.

Organizations have three options for implementing the data warehouse.

ERP-centric Data Warehouse: The data warehouse is implemented using the ERP vendor’s
data warehousing package, such as SAP Business Information Warehouse or
PeopleSoft Enterprise Warehouse.

Because these packages are proprietary, this option is recommended only when
more than 80% of the data in the data warehouse comes from the same vendor’s OLTP
systems. Otherwise, data integration and customization costs may exceed the
benefits of the well-integrated application environment.

Two Independent Data Warehouses: One data warehouse is built with non-ERP
source data and the other is built within the ERP environment with ERP source data.

This option does not provide a true enterprise or cross-functional view and can result in
multiple versions of the truth. It also carries the burden of maintaining two
environments, with overheads in cost, manpower and diverse skill sets, and
it creates confusion among business users.

Custom-Built Data Warehouse: This is built outside the ERP environment using
best-of-breed tools and technologies.

This is a highly flexible and scalable solution that enables a single version of the
truth and can grow incrementally as organizational information needs grow. However, it
takes somewhat longer to implement and requires more development effort. This option is
recommended for cross-functional, high-performance, high-volume, multi-dimensional
analytical environments with a large user base.

Detailed advantages and disadvantages of each of these options are provided in
section 5.






ETL tools for extracting data from SAP R/3 and loading it into SAP BW:

ActaWorks from Acta:
ActaWorks is tightly integrated with SAP R/3 and works seamlessly with SAP R/3 as well
as BIW. It can also extract data from non-SAP R/3 data sources. It is becoming
popular among BIW installations where SAP R/3 is the primary source. It has
features to extract incremental changes from SAP R/3.

DataStage from Ascential:
Ascential’s DataStage is also one of the leading ETL tools. SAP is a reseller of
DataStage and the DataStage Load PACK for SAP BW. These tools are integrated into the
mySAP business intelligence framework.

PowerCenter from Informatica:
Informatica PowerCenter is a strong ETL tool. It has separate plug-ins (PowerConnect)
for SAP R/3, Siebel, PeopleSoft and others, so it can extract data from SAP R/3,
other ERP systems and legacy systems. It could be a better choice when the majority of
the data comes from non-SAP legacy sources.

All three products are SAP certified. ActaWorks was the first product developed to
integrate well with SAP R/3, and it is popular among SAP R/3 users. SAP later became a
reseller of DataStage and integrated it into its mySAP BI platform.

A detailed comparison of these three ETL tools is provided in Appendix A.






2 Introduction
Operational systems have been streamlined by deploying packaged enterprise resource
planning (ERP) applications. These packages replace legacy and homegrown systems
that are not well integrated. Traditionally, ERP packages have automated back-office
operations, such as finance, human resources, and manufacturing. Now there are
packages for front-office operations, such as sales, marketing, and customer service.

However, ERP systems cannot address decision-support requirements for several
reasons:
  • ERP applications are designed to process large volumes of simple requests
      - Larger queries take a long time to process and need more resources
  • ERP databases contain thousands of small tables that eliminate data redundancies
      - It is easy to find and update a single data item, but querying is difficult
  • ERP databases are very difficult to access, query, and navigate
      - Some ERP systems store data in proprietary formats, making it difficult to access
      - Finding the right entity within thousands of tables is a formidable barrier
  • An ERP system does not satisfy all the operational requirements of an enterprise.
    Similarly, not all the modules of an ERP package meet the requirements of an
    enterprise, resulting in the implementation of part of an ERP package, or of multiple
    ERP packages, that may co-exist with other legacy applications
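
The effect of heavy normalization on querying can be illustrated with a minimal sketch.
The four tables below are hypothetical stand-ins (they are not actual ERP table names),
and SQLite is used only to keep the example self-contained. Even this toy schema needs
three joins to answer a simple management question such as revenue by region.

```python
import sqlite3

# Hypothetical, heavily simplified stand-in for a normalized ERP schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE region       (region_id TEXT PRIMARY KEY, region_name TEXT);
    CREATE TABLE customer     (customer_id TEXT PRIMARY KEY, region_id TEXT);
    CREATE TABLE order_header (order_id TEXT PRIMARY KEY, customer_id TEXT);
    CREATE TABLE order_item   (order_id TEXT, item_no INTEGER, amount REAL);

    INSERT INTO region VALUES ('R1', 'EMEA'), ('R2', 'APAC');
    INSERT INTO customer VALUES ('C1', 'R1'), ('C2', 'R2');
    INSERT INTO order_header VALUES ('O1', 'C1'), ('O2', 'C2');
    INSERT INTO order_item VALUES ('O1', 1, 100.0), ('O1', 2, 200.0), ('O2', 1, 50.0);
""")

# One analytical question, three joins across four operational tables.
revenue_by_region = conn.execute("""
    SELECT r.region_name, SUM(i.amount)
    FROM order_item i
    JOIN order_header h ON h.order_id = i.order_id
    JOIN customer c     ON c.customer_id = h.customer_id
    JOIN region r       ON r.region_id = c.region_id
    GROUP BY r.region_name
    ORDER BY r.region_name
""").fetchall()

print(revenue_by_region)
```

A data warehouse would typically hold the same information in a denormalized form, so
that a question of this kind needs far fewer joins.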

Therefore, there is a need to implement a data warehouse sourcing the data from the
ERP, CRM and legacy systems to serve the information needs of business users.

This paper outlines the technical issues involved, the desired features, and the
architectural options available for implementing the data warehouse in ERP and non-ERP
environments.

3 Technical Challenges Associated with ERP Data warehousing
The following technical issues are involved in extracting data from ERP sources:

  • The proprietary nature of ERP systems’ programming environments and APIs
  • The complex architectures of ERP systems, which embed business logic and
    processes
  • The data schemas of ERP systems, which are complex and typically contain
    thousands of tables (SAP has about 9,000), often named with abbreviations
  • The use of non-standard storage formats
  • Change data capture
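
Change data capture, the last item in the list above, can be sketched in its simplest
form as a comparison of two snapshots of the same source table. The sketch assumes that
no timestamps or change logs are available; the function name and record layout are
illustrative assumptions and are not part of any ERP or ETL product.

```python
from typing import Dict, List, Tuple

Row = Dict[str, object]
Change = Tuple[str, str, Row]


def diff_snapshots(previous: Dict[str, Row], current: Dict[str, Row]) -> List[Change]:
    """Compare two snapshots keyed by primary key and list the changes between them."""
    changes: List[Change] = []
    for key, row in current.items():
        if key not in previous:
            changes.append(("insert", key, row))
        elif row != previous[key]:
            changes.append(("update", key, row))
    for key, row in previous.items():
        if key not in current:
            changes.append(("delete", key, row))
    return changes


# Hypothetical customer master snapshots taken on two consecutive nights.
yesterday = {"1000": {"name": "Acme"}, "1001": {"name": "Globex"}}
today = {"1000": {"name": "Acme Corp"}, "1002": {"name": "Initech"}}
changes = diff_snapshots(yesterday, today)
```

Snapshot comparison reads the full table on every run, which is why the large
transaction volumes of ERP systems make other capture mechanisms attractive.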






4 Desired features of the ERP Data Warehouse
  • ERP data warehousing requires an ETL infrastructure that enables the
    extraction and integration of data from multiple diverse platforms such as legacy
    systems, CRM, sales force automation and external marketing data providers.
  • Capturing changed data from ERP and legacy applications is a challenge because of
    the large volume of transactions, the complex architecture and the short time
    window available for extracting data from ERP applications.
  • Organizations require information and analysis in real time to support important
    decisions. To achieve this, the ERP data warehouse is required to extract and
    transform data from ERP applications in near real time.
  • Meta data management and the reconciliation of inconsistent Meta data are the biggest
    problems facing organizations with regard to their data warehousing applications.
  • The ERP data warehouse should support both technical analysts and less
    technical general business users.
  • ERP data warehouses are expected to store the global data of an organization. This
    requires separating reference data that changes over time from transactional
    data that is constant. A dimensional model with slowly changing dimensions
    can address this well.
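
The slowly changing dimension concept mentioned in the last point can be sketched as
follows. This is a minimal illustration of one common variant (often called Type 2), in
which a change to reference data closes the current dimension row and adds a new version
instead of overwriting it. The class, field and function names are assumptions made for
the example.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, List, Optional


@dataclass
class DimensionRow:
    surrogate_key: int
    business_key: str
    attributes: Dict[str, str]
    valid_from: date
    valid_to: Optional[date] = None  # None marks the current version


def apply_change(dimension: List[DimensionRow], business_key: str,
                 attributes: Dict[str, str], as_of: date) -> DimensionRow:
    """Add a new version of a dimension member when its attributes change."""
    current = next((row for row in dimension
                    if row.business_key == business_key and row.valid_to is None), None)
    if current is not None and current.attributes == attributes:
        return current  # nothing changed, keep the current version
    if current is not None:
        current.valid_to = as_of  # close the old version, history is preserved
    new_row = DimensionRow(len(dimension) + 1, business_key, dict(attributes), as_of)
    dimension.append(new_row)
    return new_row


# Hypothetical customer dimension: the customer moves from one region to another.
customer_dim: List[DimensionRow] = []
apply_change(customer_dim, "C100", {"region": "EMEA"}, date(2001, 1, 1))
apply_change(customer_dim, "C100", {"region": "APAC"}, date(2002, 1, 1))
```

Transactions loaded before the change keep pointing at the first surrogate key, so
historical analysis still reports them under the region that applied at the time.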

5 Architectural Choices
The approaches for implementing data warehouses, with their advantages and disadvantages,
are as follows.

ERP-Centric Data Warehouse: The data warehouse is built within the ERP environment
(the DSS provided by the ERP vendor), and non-ERP source data is also pulled into the
DSS provided by the same ERP vendor.

This option is recommended when the majority of the data warehouse data (more than 80%)
is sourced from ERP systems and business content for the required functional areas is
available in the DSS provided by the ERP vendor. Otherwise, the integration and
customization effort can outweigh the benefits of tight integration.
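
The guideline above can be expressed as a small check. The 80% threshold and the business
content condition come from the text; measuring the ERP share by data volume per source,
and the function and source names, are assumptions made for this sketch.

```python
from typing import Dict, Set

ERP_SHARE_THRESHOLD = 0.80  # guideline quoted in the text above


def erp_share(volume_by_source: Dict[str, float], erp_sources: Set[str]) -> float:
    """Fraction of warehouse data volume that originates from ERP sources."""
    total = sum(volume_by_source.values())
    if total == 0:
        raise ValueError("no source volumes supplied")
    erp_total = sum(volume for source, volume in volume_by_source.items()
                    if source in erp_sources)
    return erp_total / total


def erp_centric_recommended(volume_by_source: Dict[str, float], erp_sources: Set[str],
                            business_content_available: bool) -> bool:
    """Apply both conditions of the guideline: ERP share and business content."""
    share = erp_share(volume_by_source, erp_sources)
    return business_content_available and share > ERP_SHARE_THRESHOLD


# Hypothetical volumes (for example, gigabytes per source system).
volumes = {"SAP R/3": 900.0, "legacy billing": 50.0, "CRM": 50.0}
recommended = erp_centric_recommended(volumes, {"SAP R/3"}, business_content_available=True)
```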

Two Independent Data Warehouses: One data warehouse is built with ERP data and
the other is built from non-ERP data sources. This is a natural path of growth, as it is
the technically easier and politically right solution.

Custom-Built Data Warehouse outside the ERP environment: The data warehouse is
built using best-of-breed tools outside the ERP environment. This option requires
data extraction from ERP sources, which could prove costly. But with the advent of ETL
tools, such as those from Acta, Ascential and Informatica, that can extract data from
the ERP application layer, the issue is mitigated to some extent.







The following lists elaborate on the advantages and disadvantages of each of the above
options:

Option: ERP-centric Data Warehouse
  Advantages:
    • Tight integration of operational and decision support systems
    • Easier to implement a closed feedback loop DW
    • Industry best practices are made available in the form of business processes
      and standard reports
  Disadvantages:
    • Not flexible
    • Considerable customization effort, and 3rd-party ETL tools are required to
      integrate non-ERP source data
    • Integration of non-ERP data (organizational or external) into the ERP environment
      is complex due to proprietary interfaces and limited business content
    • ERP vendors are traditionally strong in OLTP, but not in DSS applications
    • Not proven for high-performance, high-volume multi-dimensional analysis with a
      large user base
    • Not all the functionality may be supported by any given ERP vendor
    • Growth to a real-time data warehouse may not be possible

Option: Two Independent Data Warehouses
  Advantages:
    • Easier to implement technically
    • Politically natural solution
    • Earlier investments in existing DW initiatives are protected
  Disadvantages:
    • No enterprise or cross-functional view
    • Higher maintenance and sustenance costs
    • Prone to inconsistencies across the two data warehouses, leading to two versions
      of the truth
    • Ambiguity among the user community

Option: Custom-built Data Warehouse outside the ERP environment
  Advantages:
    • Flexible
    • A true enterprise-wide single version of the truth can be attained
    • Easier to integrate external data
    • Scalability is not an issue
    • Open architecture is amenable to real-time data warehouse refresh and closed
      loop feedback
  Disadvantages:
    • Data extraction from ERP OLTP systems is complex
    • 3rd-party vendor tools need to keep up to date with the changing ERP environment
    • Longer time to implement







6 Tools & Technology Available
6.1   Packaged Solution from ERP vendors

6.1.1 SAP Business Information Warehouse

Since SAP announced its Business Information Warehouse in 1998, the product has gone
through many transformations. Until version 2.1C, SAP BW was used primarily for
operational reporting that was not possible within SAP R/3, and it had several
limitations in areas such as drill-across, ODS structure and scalability. Version 2.1C
(mySAP BI) seems to have addressed these issues, and it now offers a sound BI platform
for SAP R/3 users.

SAP has tied up with Ascential to integrate Ascential’s ETL tool, DataStage, as part of
the BI platform. With this, it has overcome its weakness in transporting non-ERP data
into its business warehouse.

On the UI end, it still does not have a competitive OLAP tool, though its partners’ OLAP
tools, such as Business Objects and Cognos, can be used instead. The Business Explorer
UI that comes with the business warehouse is Excel-like and does not offer robust OLAP
functionality.

Business content is also still limited and does not match competitors’ offerings in
the packaged applications space, such as those from Epiphany, Broadbase/EPM,
DecisionPoint Application, Hyperion, Gentia, NCR, SAS, and Alphablox.

6.2   Extraction Tools

6.2.1 ActaWorks from Acta

Acta was the first vendor to bring to market a product specifically tailored to support
data warehousing with ERP systems. Today Acta offers the most comprehensive data
warehousing and data integration products for use with ERP systems.

ActaWorks for SAP is designed to support tight integration with SAP ERP applications.
In addition to providing an intuitive GUI for mapping data from SAP and non-SAP
sources to a data warehouse or data mart, ActaWorks extracts data via the SAP R/3
application layer, allowing access to all SAP data and business logic. ActaWorks also
features a component that supports real-time updates and change-data capture for data
warehouses. Acta also offers pre-packaged data marts, or Rapid Marts, for use with
ActaWorks to speed warehouse development.

ActaWorks for SAP consists of five key components: ActaWorks Designer, a Meta data
repository, ActaWorks Server, ActaWorks Integrator for SAP and ActaWorks
Administrator.

ActaWorks Designer is a graphical tool for defining the data mappings, transformations
and control logic necessary for managing a complex multi-step process for populating a
data warehouse. Designer allows users to define data mappings and transformation
rules using a GUI modeled on SQL.





The data mappings and transformation rules specified with Designer are stored in the
ActaWorks Meta data repository. The repository also stores information describing the
schema for SAP and non-SAP data sources and the target data warehouse schema. To
facilitate the process of identifying the right information to extract, ActaLink provides
English language descriptions of both tables and columns.

The hub of the transformation process is ActaWorks Server, which performs complex
data transformations and integrates data from non-SAP sources with SAP data. The
server is designed to provide high throughput and uses in-memory transformations and
parallel pipelining.
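
The idea of pipelined, in-memory transformation can be illustrated with a minimal sketch.
It does not reflect how ActaWorks Server is implemented; the stage names and record
layout are assumptions. Python generators are used so that each record flows through all
stages in memory without intermediate storage. The stages here run in a single thread,
whereas a parallel pipeline would run them concurrently.

```python
from typing import Dict, Iterable, Iterator, List

Record = Dict[str, str]


def extract(rows: Iterable[Record]) -> Iterator[Record]:
    """Stage 1: hand source rows to the pipeline one at a time."""
    for row in rows:
        yield row


def trim_fields(records: Iterable[Record]) -> Iterator[Record]:
    """Stage 2: strip surrounding whitespace from every field."""
    for record in records:
        yield {name: value.strip() for name, value in record.items()}


def add_region(records: Iterable[Record],
               region_by_customer: Dict[str, str]) -> Iterator[Record]:
    """Stage 3: integrate data from a second (non-ERP) source by lookup."""
    for record in records:
        region = region_by_customer.get(record["customer"], "UNKNOWN")
        yield {**record, "region": region}


def run_pipeline(rows: Iterable[Record],
                 region_by_customer: Dict[str, str]) -> List[Record]:
    """Chain the stages; no stage materializes the full data set."""
    return list(add_region(trim_fields(extract(rows)), region_by_customer))


# Hypothetical ERP rows and a lookup table from another source.
result = run_pipeline([{"customer": " C1 ", "amount": "10"},
                       {"customer": "C9", "amount": "5"}],
                      {"C1": "EMEA"})
```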

To extract data from SAP, the ActaWorks Integrator for SAP automatically generates
optimized ABAP/4 code. This removes the need to write and maintain custom ABAP/4
code. The integrator:
  • Populates the Meta data repository with the SAP logical view of the data
  • Translates ANSI SQL constructs specified in the Designer into constructs that
    ABAP/4 supports (Open SQL)
  • Automatically generates the ABAP/4 code that extracts the data
  • Uses the SAP administrative infrastructure by extracting data via SAP’s application
    server layer, thereby providing access to all SAP data, including data stored in
    pool and cluster tables, and to other SAP business logic
  • Automatically extracts hierarchies from SAP

ActaWorks Administrator provides facilities for warehouse administrators to schedule
and monitor jobs.

Capture of changed transactions in the source (SAP) can be implemented using
IDocs (Intermediate Documents). IDocs capture data while a transaction is
being processed. This is a very effective means of capturing data from SAP when the
underlying tables do not contain date and time stamps. ActaWorks generates ABAP to
read staged IDoc data from the header and detail.
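
Reading staged header and detail data can be pictured with a small sketch. The document
layout below is a simplified, hypothetical stand-in for a staged IDoc (it is not the SAP
IDoc format), and the function name is illustrative; the point is only that each detail
line is combined with its header to form one change record for the warehouse.

```python
from typing import Dict, List

ChangeRecord = Dict[str, object]


def flatten_staged_document(document: Dict) -> List[ChangeRecord]:
    """Combine the header with each detail line into one row per line item."""
    header = document["header"]
    rows: List[ChangeRecord] = []
    for detail in document["details"]:
        row: ChangeRecord = {
            "doc_number": header["doc_number"],
            "captured_at": header["captured_at"],
        }
        row.update(detail)
        rows.append(row)
    return rows


# Hypothetical staged document captured while an order was being processed.
staged = {
    "header": {"doc_number": "4500001", "captured_at": "2002-01-15T10:30:00"},
    "details": [
        {"item": 10, "material": "M-01", "quantity": 5},
        {"item": 20, "material": "M-02", "quantity": 1},
    ],
}
change_records = flatten_staged_document(staged)
```

Because the capture time travels with the document, the source tables themselves do not
need date and time stamps.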

ActaWorks supports real-time data transformation, including receiving messages from
ERP systems or XML-based e-commerce applications. “Real-time” means that
ActaWorks reacts to messages as they are sent, performing predefined operations to
respond appropriately. Real-time updates from SAP require the
Acta RealTime Component to be installed. For real-time data extraction, ActaWorks
Real-Time uses SAP R/3 Application Link Enabling (ALE) technology and Intermediate
Documents (IDocs) to capture and process transactions. IDocs can be enriched with other
R/3 or non-R/3 data as you specify in the real-time data flow design.
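
The notion of reacting to messages as they are sent, and enriching them with other data,
can be sketched as a small dispatcher. The class, the message layout and the credit limit
lookup are all hypothetical; this is not the ActaWorks or ALE interface, only an
illustration of the pattern.

```python
from typing import Callable, Dict, List, Optional

Message = Dict[str, object]
Handler = Callable[[Message], Message]


class RealTimeFlow:
    """Reacts to each incoming message with a predefined operation."""

    def __init__(self) -> None:
        self._handlers: Dict[str, Handler] = {}
        self.delivered: List[Message] = []

    def register(self, message_type: str, handler: Handler) -> None:
        self._handlers[message_type] = handler

    def receive(self, message: Message) -> bool:
        """Process one message; return False if no operation is defined for it."""
        handler = self._handlers.get(str(message.get("type")))
        if handler is None:
            return False
        self.delivered.append(handler(message))
        return True


def make_order_handler(credit_limits: Dict[str, float]) -> Handler:
    """Build an operation that enriches order messages with data from another source."""
    def handle(message: Message) -> Message:
        enriched = dict(message)
        limit: Optional[float] = credit_limits.get(str(message["customer"]))
        enriched["credit_limit"] = limit
        return enriched
    return handle


flow = RealTimeFlow()
flow.register("ORDER", make_order_handler({"C1": 5000.0}))
accepted = flow.receive({"type": "ORDER", "customer": "C1", "amount": 120.0})
```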


6.2.2 DataStage from Ascential

Using DataStage XE, warehouse developers can take data from diverse sources and
complex data forms such as legacy data, B2B and web environments, as well as
enterprise applications such as SAP and Siebel. They can transform this data and load it
into a warehouse, data mart or business intelligence application for analysis. By
managing the Meta data, DataStage XE integrates Meta data with the most
commercially popular data modeling and data access tools. Finally, the quality
assurance component enables warehouse administrators to audit, monitor, and manage
the quality of the data as the warehouse expands and evolves.

Specifically, DataStage XE is an integrated set of software components consisting of:
  • Quality Manager, for the data quality assurance that is critical to accurate
    business analysis
  • MetaStage, for Meta data integration in order to maintain consistent analytic
    interpretations as well as track changes to the data warehouse
  • DataStage, for data collection and integration from diverse sources for complete
    "snapshots", and data movement and transformation for system and end-user
    productivity
  • DataStage XE/390, for extracting legacy data while using the power of the
    mainframe infrastructure

As part of DataStage XE, Quality Manager gives development teams and business
users the ability to audit, monitor, and certify data quality at key points throughout the
data integration lifecycle. Further, they can identify a wide range of data quality
problems and business rule violations that can inhibit data migration efforts, and they
can generate data quality metrics for projecting financial returns.
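
A rule-based data quality audit of this kind can be sketched in a few lines. This is not
the Quality Manager interface; the rule names, record layout and the pass rate metric are
assumptions chosen to show how business rule violations can be counted and turned into a
simple metric.

```python
from typing import Callable, Dict, List

Record = Dict[str, object]
Rule = Callable[[Record], bool]


def audit(records: List[Record], rules: Dict[str, Rule]) -> Dict[str, object]:
    """Count violations per rule and compute the share of fully clean records."""
    violations = {name: 0 for name in rules}
    clean = 0
    for record in records:
        failed = [name for name, rule in rules.items() if not rule(record)]
        for name in failed:
            violations[name] += 1
        if not failed:
            clean += 1
    pass_rate = clean / len(records) if records else 1.0
    return {"violations": violations, "pass_rate": pass_rate}


# Hypothetical business rules and source records.
rules: Dict[str, Rule] = {
    "customer_present": lambda record: bool(record["customer"]),
    "amount_non_negative": lambda record: float(record["amount"]) >= 0,
}
records: List[Record] = [
    {"customer": "C1", "amount": 10.0},
    {"customer": "", "amount": 5.0},
    {"customer": "C3", "amount": -1.0},
    {"customer": "C4", "amount": 2.0},
]
report = audit(records, rules)
```

Running such an audit before and after each load gives a repeatable measure of whether
data quality is holding up as the warehouse grows.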

By improving the quality of the data going into DataStage transformations, organizations
also improve warehouse performance and the data quality of the resultant target data.
The end result is validated data and information for making smart business decisions
and a reliable, repeatable and accurate process for making sure information maintains
its superior quality over time.

A critical component of DataStage XE is MetaStage, Ascential’s solution for meta data
management across data warehouse environments. Most data warehouses and marts
are created using a wide variety of tools that cannot exchange Meta data. As a result,
business users are unable to understand and leverage enterprise data because the
contextual information, or Meta data, required is unavailable or unintelligible. Based on
patented technology, MetaStage offers broad support for sharing Meta data between
third-party data environments. MetaStage uses MetaBrokers to ensure the complete
exchange of all related meta data, regardless of source type.

DataStage is a client/server development tool for building and supporting data migration
applications. Ascential Software offers options such as XML Pack, Enterprise Application
Packs, and the MQ Series Plug-in. On the server side, DataStage has a transformation
engine that enables complex processing while providing ease of use, management
control and maximum performance. The DataStage client is a graphical tool with the
following major components: Manager, Designer, Director, and Administrator. The
DataStage Manager supports the import/export of meta data, as well as the central
control of shared transformation objects. The Designer is the tool that visually represents
the data transformation process with an intuitive easy-to-use graphical engine. The
Director, as its name implies, supports the scheduling and execution of completed
transformations, and the Administrator provides for housekeeping and security functions.
Data warehousing professionals use the DataStage client to interact with the DataStage
Server, the workhorse that processes the transformations and moves data at run-time.






Enterprise application (EA) systems provide critical data sources for business analysis.
DataStage XE provides full integration with leading enterprise applications including
SAP, Siebel, and PeopleSoft.

The DataStage Extract PACKs for SAP R/3, Siebel and PeopleSoft, and the DataStage
Load PACK for SAP BW, enable warehouse developers to integrate this data with the
organization's other data sources. The DataStage Extract PACK provides:

   1. Extensive transformation capabilities to manipulate SAP R/3 data and load it into
      a new or existing data warehouse or data mart.
   2. Generation of ABAP/4, SAP’s programming language. Automating ABAP code
      generation shields the developer from the complexity of manually writing ABAP
      code and, more importantly, reduces development and maintenance costs.
   3. Access to all SAP R/3 data, including transparent, pool, view and cluster tables,
      through a unique feature, the DataStage Meta data object browser. Given the more
      than 15,000 SAP tables and their known complexity, the Meta data object browser
      enables easy navigation through the info hierarchies before joining multiple R/3
      tables, simplifying the process.
   4. Two methods of operation to optimize performance and resources: generated ABAP
      code can be uploaded to the R/3 system via remote function call or, for
      warehouse developers who do not have direct access to the R/3 system, the R/3
      script can be moved manually via FTP and imported by an R/3 administrator. Job
      scheduling can be controlled either from the DataStage Director or natively from
      the SAP scheduling services.
   5. Complex transformations performed easily with drag-and-drop operations using
      the DataStage Designer’s graphical mapping tool.
   6. Use of SAP’s RFC library and IDocs, two of the primary data interchange
      mechanisms for access to SAP R/3, thus conforming to SAP interfacing
      standards.
   7. The ability to capture incremental changes and produce event-triggered updates
      with SAP’s IDoc (Intermediate Document) functionality. DataStage’s IDoc extract
      interface retrieves IDoc Meta data and automatically translates the segment
      fields into DataStage, achieving real-time SAP data integration.

6.2.3 PowerCenter from Informatica

PowerCenter from Informatica is one of the popular and powerful tools in the ETL space.
It offers seamless integration with a wide range of data sources, including ERP,
mainframe and relational systems as well as e-commerce and legacy applications.
Informatica’s PowerConnect for PeopleSoft and PowerConnect for SAP can directly extract
and integrate data from SAP R/3 and PeopleSoft applications, as well as other formats.
PowerConnect modules are component-based offerings that complement and extend the
functionality of Informatica’s core data warehouse development platform,
PowerCenter.

PowerConnect for SAP provides Informatica PowerMart/PowerCenter users with native,
high-speed data extraction from SAP R/3 systems, enabling full access to all SAP R/3
tables and SAP R/3 info hierarchies. PowerConnect for SAP extracts data from SAP
using ABAP/4, SAP’s proprietary 4GL. Using PowerConnect, users can access all SAP
R/3 tables, including transparent, pool and cluster tables. This allows full access to all
data residing in SAP R/3’s application layer. Once extracted, SAP data is delivered to
the PowerCenter server, which transforms the data for delivery to the target data
warehouse, data marts, or other analytic applications.

PowerConnect for SAP lets you customize the R/3 extraction routines for load
processing. You can choose to stage the data in an intermediary file or stream it directly
into the PowerCenter Server. In addition, when accessing data in R/3, PowerConnect
performs only the actual extraction processes on the R/3 system. Transformation and load
processing occur within PowerCenter, helping to minimize the load on the R/3
environment.
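
The choice between staging extracted data in an intermediary file and streaming it
directly can be sketched as follows. This does not use the PowerConnect interface; the
function names, the CSV staging format and the sample rows are assumptions for
illustration.

```python
import csv
import os
import tempfile
from typing import Callable, Dict, Iterable, List

Row = Dict[str, str]


def stream_rows(rows: Iterable[Row], load: Callable[[Row], None]) -> None:
    """Streaming: hand each extracted row straight to the load step."""
    for row in rows:
        load(row)


def stage_rows(rows: Iterable[Row], path: str, fieldnames: List[str]) -> None:
    """Staging, step 1: write the extract to an intermediary file."""
    with open(path, "w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)


def load_staged_rows(path: str, load: Callable[[Row], None]) -> None:
    """Staging, step 2: read the intermediary file later and load it."""
    with open(path, newline="") as handle:
        for row in csv.DictReader(handle):
            load(dict(row))


# Hypothetical extract; both routes deliver the same rows to the load step.
extracted = [{"order_id": "O1", "amount": "100"}, {"order_id": "O2", "amount": "50"}]

streamed: List[Row] = []
stream_rows(extracted, streamed.append)

staged: List[Row] = []
with tempfile.TemporaryDirectory() as directory:
    staging_file = os.path.join(directory, "extract.csv")
    stage_rows(extracted, staging_file, ["order_id", "amount"])
    load_staged_rows(staging_file, staged.append)
```

Staging decouples extraction from loading, so the source system is released as soon as
the file is written, while streaming avoids the extra write and read at the cost of
keeping both ends busy at the same time.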

7 Conclusion
Companies have been struggling for some time now to build data warehouses and data
marts that will allow their users to perform better and easier analysis of SAP data. Due to
the complexity of the SAP R/3 system and a lack of good data warehousing products
specifically designed to handle SAP data, companies were forced to write their own
custom extraction programs in ABAP/4.
This, however, is changing, and a good number of vendors, recognizing the opportunity,
have introduced ETL products that can assist in extracting and integrating SAP and
non-SAP data and moving it into the warehouse.

SAP is seriously pursuing its efforts to provide a scalable BI platform by
upgrading its Business Information Warehouse. It is enhancing the business
content in each new version, but still lacks the capabilities provided by
competing packaged solutions. It has also integrated DataStage (an ETL tool) to
bring non-SAP data into the BW platform.

META Group predicts that by 2005 SAP BW can become a dominant player in the
packaged data warehouse market, catering to the enterprise-level information needs
of SAP R/3 users. It may not achieve the same success among non-SAP R/3
users.








8 Appendix A

Comparison of Informatica PowerCenter, ActaWorks and Ascential DataStage XE

Version
  Informatica PowerCenter: 5.0
  ActaWorks: 5.0
  Ascential DataStage XE: 5.1

Category: Architecture

Criterion: Architecture
  Informatica PowerCenter: Hub and spoke architecture
  ActaWorks: Open client/server platform that facilitates the sharing of Meta data
  Ascential DataStage XE: Client/server architecture

Criterion: Scalable and Extensible Technology
  Informatica PowerCenter: Highly scalable and extensible technology. Scales up as the
    data and load grow. Scales up with respect to the hardware and software.
  ActaWorks: Scalable, flexible technology.
  Ascential DataStage XE: Highly scalable. Scales up with respect to the hardware and
    software.

Criterion: Client Platform
  Informatica PowerCenter: Windows 2000/NT/98
  ActaWorks: Windows 98/NT/2000, OS/2
  Ascential DataStage XE: Windows 95/NT/2000

Criterion: Server Platforms
  Informatica PowerCenter: Sun Solaris, AIX, HP-UX, Windows NT/2000
  ActaWorks: Windows NT/2000, HP-UX, Solaris, AIX
  Ascential DataStage XE: Windows NT (Intel and Alpha platforms), UNIX (AIX, HP-UX, Sun
    Solaris, Compaq Tru64). DataStage XE/390 works on the OS/390 platform.

Criterion: Which DBMS are supported for extraction and loading
  Informatica PowerCenter:
    For extraction: DB2, DB2/400, flat files, IMS, Informix, MS SQL Server, MS Access,
      Oracle, Sybase, UDB, VSAM, ODBC, others
    Targets: Informix, DB2/400, MS SQL Server, MS Access, Oracle, PeopleSoft Enterprise
      Performance Management (EPM), SAP Business Information Warehouse (BW), Sybase,
      UDB, flat files, others
  ActaWorks: Oracle, Informix, Microsoft SQL Server, Sybase, DB2 UDB, ODBC-compliant
    databases, and flat files
  Ascential DataStage XE: QSAM (sequential flat files); ISAM; VSAM (KSDS, RRDS, ESDS),
    with support for GROUPS, multi-level arrays, REDEFINES, and all PICTURE clauses;
    DB2; Adabas; Oracle OCI (for releases 7 and 8); Sybase Open Client; Informix CLI;
    OLE/DB for Microsoft SQL Server 7; ODBC

                Support for                                                              DataStage XE provides
                ERP Sources                                                              full integration with
                                                                                         leading enterprise
                                                                                         applications including
                                                                                         SAP, Siebel, and
                                                                                         PeopleSoft. The
                                                                                         DataStage Extract PACKs
                                                                                         for SAP R/3, Siebel and
                                                                                         PeopleSoft, and the
                                                                                         DataStage Load PACK for
                                                                                         SAP BW enable
                                                                                         warehouse developers to
                                                                                         integrate this data with
                                                                                         the organization's other
                                                                                         data sources







                Code          Supports development of        All the objects in the      Permits the reuse of
                Reusability   Mapplets, which act as a       object library are          existing code through
                capability    library shared between         reusable. An object can     APIs, thereby eliminating
                within the    Mappings; transformations can  be a data flow, workflow,   redundancy and retesting
                product       also be shared across          job, etc.                   of established business
                              Mappings.                                                  rules.




                Parallelism   Supports parallelism: one can   Supports parallelism if it  Automatically distributes
                              run multiple mapping sessions   is running on a             independent job flows
                              on the same server.             multiprocessor computer. It across multiple CPU
                                                              takes full advantage of     processes. This feature
                                                              the hardware                ensures the best use of
                                                              architecture.               available resources and
                                                                                          speeds up overall
                                                                                          processing time for the
                                                                                          application.







                Code          PowerCenter does not generate     Does generate Code, but    Only the DataStage
                Generator     code; all mappings are            the Data Flow or Job       XE/390 version
                              developed through the GUI         Flow defined can be        automatically generates
                              interface.                        converted to code to       and optimizes native
                                                                check with Acta Support.   COBOL code and JCL
                                                                                           scripts that run on the
                                                                                           OS/390 mainframe.




                Data           PowerCenter is based on a     Transformation is       Transformation is engine
                Transformation hub-and-spoke architecture    engine based and relies based: column-to-column
                Method (Engine and has a built-in            on the server.          mappings.
                Based?)        transformation engine.







                Building &         Aggregation can be built using Aggregation through      Enhances performance
                Managing           the built-in transformation    ready-to-use             and reduces I/O with its
                Aggregates         provided.                      transformation function. built-in sorting and
                                                                                           aggregation capabilities.
                                                                                           The Sort and
                                                                                           Aggregation stages of
                                                                                           DataStage work directly
                                                                                           on rows as they pass
                                                                                           through the engine
                                                                                           rather than depending on
                                                                                           SQL and intermediate
                                                                                           tables.
                Support for        Supports most of the industry Supports most of the      It supports most of the
                various data       standard data types. This also industry standard data industry standard data
                types              depends on the kind of source types                     types. It supports XML
                                   system being used.                                      also.




                Data Quality                                                               Through Quality Manager
                Check                                                                      it is possible to audit,
                functionality or                                                           monitor, and certify data
                feature                                                                    quality at key points
                                                                                           throughout the data
                                                                                           integration lifecycle.







                Debugging and Does not have a separate        Error correction can be   Helps developers verify
                logging       debugging tool. The workaround  done for each job,        their code with a built-in
                features      is to set the "verbose"         workflow, data flow, and  debugger, thereby
                              property on each                even object.              increasing application
                              transformation. Informatica                               reliability as well as
                              then creates log files on the                             reducing the amount of
                              server, which can be used for                             time developers spend
                              further analysis.                                         fixing errors and bugs.
                                                                                        Supports debugging on a
                                                                                        row-by-row basis using
                                                                                        breakpoints. DataStage
                                                                                        immediately detects and
                                                                                        corrects errors in logic or
                                                                                        unexpected legacy data
                                                                                        values using this. Highly
                                                                                        useful for complex
                                                                                        transformations, date
                                                                                        conversions, etc.
                Exception     Writes error records or         Supports exception        Supports exception
                Handling      rejected records to a log file. handling; no extra effort handling.
                                                              required.







                How Tool       Through log files stored in the   Through Log files         Developers can closely
                Provides       server                                                      observe the running jobs
                information                                                                in the Monitor Window to
                about                                                                      provide run-time
                exception                                                                  feedback on user-
                                                                                           selected intervals.The
                                                                                           powerful process viewer
                                                                                           estimates rows-per-
                                                                                           second and allows
                                                                                           developers to pinpoint
                                                                                           possible bottle-necks
                                                                                           and/or points of failure.
                                                                                           Using the Director, the
                                                                                           developer can browse
                                                                                           detailed log records as
                                                                                           each step of a job
                                                                                           completes. These date
                                                                                           and time stamped log
                                                                                           records include notes
                                                                                           reported by the
                                                                                           DataStage Server as well
                                                                                           as messages returned by
                                                                                           the operating
                                                                                           environment or source
                                                                                           and target database
                                                                                           systems. DataStage
                                                                                           highlights log records
                                                                                           with colored icons (green
                                                                                           for informational, yellow
                                                                                           for warnings, red for
                                                                                           fatal) for easy
                                                                                           identification.
                Restarting an Supports restarting of the         Restart is possible. Can Restart is possible. Can
                aborted ETL   mappings.                          restart from the point of restart from the point of
                process                                          failure.                  failure.







                Memory         128 MB/ 256 MB                     64 MB /128 MB           64 MB
                (Minimum/
                Recommended)
                requirement at
                client machine
                Memory         Depends on the kind of             64 MB /128 MB           Minimum 256 MB
                (Minimum/      application running, 128 MB /
                Recommended) 256 MB
                requirement at
                Server machine
                Repository     PowerCenter comes with good        Repository backup can   Supports distributed
                Backup and     features for backup and            be taken by using       repository: remote
                Recovery       recovery of the repository. This   Repository Manager.     sites can subscribe to a
                               can be done through Repository                             set of meta data objects
                               Manager.                                                   within the warehouse
                                                                                          application. These sites
                                                                                          are notified via email
                                                                                          when meta data changes
                                                                                          occur within their
                                                                                          subscription. DataStage
                                                                                          XE offers version control
                                                                                          for components such as
                                                                                          table definitions,
                                                                                          transformation rules, and
                                                                                          source/target column
                                                                                          mappings within a 2-part
                                                                                          numbering scheme.







   Meta data    Metadata       Meta data is captured and       Automatically captures   Stores all the meta data
   support      Capture        stored in the PowerCenter       the meta data and stores in the repository.
                               repository.                     it in the repository.    Captures the meta data
                                                                                        automatically using a
                                                                                        component called
                                                                                        'MetaStage'. It also offers
                                                                                        broad support for sharing
                                                                                        meta data between third-
                                                                                        party data environments
                                                                                        using MetaBrokers. It
                                                                                        maintains a complete
                                                                                        catalog of the
                                                                                        organization’s metadata,
                                                                                        including physical,
                                                                                        technical, business and
                                                                                        process meta data.
                Business View Business meta data needs to be   Not available. Only      DataStage XE provides
                meta data     documented while building the    technical meta data is   warehouse developers
                              mappings. This data is stored    stored.                  with a central hub that
                              in the meta data repository,                              manages meta data at
                              which can be queried using SQL                            the tool-integration level.
                              commands.                                                 Remote sites can
                                                                                        subscribe to a set of
                                                                                        meta data objects within
                                                                                        the warehouse
                                                                                        application. These sites
                                                                                        are notified via email
                                                                                        when meta data changes
                                                                                        occur within their
                                                                                        subscription.
                Meta data     Since meta data is stored in the Provides meta data       User level security
                security      repository of the product it is  security through         provided by DataStage
                              very well protected.             repository manager,      Administrator
                                                               needs userid and
                                                               password to login.







                Web              Does not have any web      Web administration is        Yes. Supports web
                Integration      integration.               provided through Access      integration using the
                support                                     Server, which makes it       plug-in API.
                                                            possible to control the
                                                            whole loading process
                                                            from a remote machine.
                Versioning    Supports versioning with the  Supports versioning          DataStage XE offers
                Support       help of the repository and    through the central          version control, which
                              allows one to define the      repository.                  saves the history of all
                              baseline.                                                  the ETL development. It
                                                                                         preserves application
                                                                                         components such as
                                                                                         table definitions,
                                                                                         transformation rules, and
                                                                                         source/target column
                                                                                         mappings within a 2-part
                                                                                         numbering scheme.
                                                                                         Developers can review
                                                                                         older rules and
                                                                                         optionally restore entire
                                                                                         releases that can then be
                                                                                         moved to distributed
                                                                                         locations.
                Metadata      Sharable through the Metadata Does not exchange            Has its version of the
                repository's  Exchange (MX2) API.           metadata with other          Common Meta Model.
                compliance to                               applications.                The meta data can be
                one of the                                                               shared using the
                industry meta                                                            MetaBroker.
                data standards







                Meta data     PowerCenter comes with a          Central repository          No tool currently
                views using   meta data reporting tool, which   provides a meta data        available. The entire
                query tools   helps users access the            viewing facility, and       history of the data can
                              meta data stored in the           repository tables can be    be derived and viewed
                              repository. One can also view     queried using SQL           using Data Lineage.
                              meta data using query tools       statements.
                              such as SQL.


 Ease of setup  Easy           The installation process        Easy to install; only two    An industry-standard
                installation   depends on the platform being   components need to be        installation script
                procedure      installed on. It can            installed.                   provided for each
                               occasionally run into trouble                                DataStage "Package"
                               for various reasons, but in                                  helps in easier
                               most cases it is very easy to                                installation and
                               install.                                                     automated configuration.
                Ability to     It is possible to generate a    Possible to generate the     Possible to create a
                generate Data  target data mart schema similar data mart schema.            data mart schema similar
                mart schema    to the source database.                                      to the source.
                similar to
                source
                database
                Support for    Supports the star schema data   E-Caches provides ready-     Does not support this
                designing data model for target data mart      to-use data mart suites      directly. However, by
                mart           design.                         with all the ETL             combining the data
                                                               facilities defined.          integration capabilities of
                                                                                            DataStage/DataStage
                                                                                            390 with DB2 Warehouse
                                                                                            Manager's data
                                                                                            warehouse generation
                                                                                            and management
                                                                                            capabilities, it is possible
                                                                                            to design a data
                                                                                            mart/warehouse.







                Importing data It is possible to import data     Does not support.     The MetaBroker for a
                models from    models from different modeling                          particular tool represents
                modeling tools tools by using a plug-in called                         the meta data just as it is
                               MX.                                                     expressed in the tool’s
                                                                                       schema. It accomplishes
                                                                                       the exchange of meta
                                                                                       data between tools by
                                                                                       automatically
                                                                                       decomposing the meta
                                                                                       data concepts of one tool
                                                                                       into their atomic
                                                                                       elements via the
                                                                                       MetaHub and
                                                                                       recomposing those
                                                                                       elements to represent
                                                                                       the meta data concepts
                                                                                       from the perspective of
                                                                                       the receiving tool. In this
                                                                                       way all meta data and
                                                                                       their relationships in the
                                                                                       integrated suite are
                                                                                       captured and retained for
                                                                                       use by any of the tools.
                                                                                       In summary,
                                                                                       MetaBrokers facilitate
                                                                                       meta data exchange
                                                                                       between DataStage and
                                                                                       popular data modeling
                                                                                       and business intelligence
                                                                                       tools.







Transformations Filter       Supports Filter transformation   Supports various types of Supports Filter
                                                              transformations:          transformation
                                                              Filtering, Merging, Key
                                                              Generation, Table
                                                              Comparison, etc.




                Format       Supports format conversion    Format conversion is       Supports format
                conversion   and data type conversion.     possible.                  conversion such as date
                                                                                      & time display, numeric
                                                                                      representation, national
                                                                                      currency rules, collating
                                                                                      sequences, etc.
                Lookup       Supports Lookup               Lookup functionality is    Supports lookup
                             transformation very well.     possible in three modes:   procedures and hashed
                                                           pre-cached,                lookup tables to increase
                                                           cache-on-demand, and       performance.
                                                           no-cache.




                Scope for user   User-defined variables can be     Variables can be defined    User-defined variables
                defined fields   defined, but there is no concept  with global or local scope, can be defined.
                                 of scope.                         and parameter values can
                                                                   be passed between projects.




                Joins          Supports most of the join types. Supports all types of    Supports most of the join
                                                                joins.                   types using join
                                                                                         transformation



                Support for      Supports external procedures;     Possible to call COM        Built into DataStage are
                external         stored procedures can be called   objects, DLL functions      several features
                procedures       through mappings.                 etc.                        exclusively designed to
                                                                                               support the packaging
                                                                                               and deployment of
                                                                                               completed data migration
                                                                                               applications.




Management      Scheduling       Good scheduling support: jobs and Good scheduler within the   Good graphical scheduling
                feature          sessions can be scheduled using   tool, with workflow         and monitoring provided by
                                 Server Manager, with a limited    mechanism and calendar.     the DataStage component
                                 workflow mechanism.                                           called DataStage Director.
                                                                                               It can also generate cron
                                                                                               scripts to schedule from
                                                                                               Unix. With the DataStage
                                                                                               Job Control API and
                                                                                               Command Language
                                                                                               interface, any remote C
                                                                                               program or command shell
                                                                                               can be used to initiate
                                                                                               jobs, query their results
                                                                                               or program a more complex
                                                                                               job execution sequence.
                Defining                                           Yes, it is possible in a    Jobs can be scheduled
                calendar and                                       very sophisticated          using the DataStage
                using it for                                       manner.                     Director.
                ad-hoc
                scheduling




                Performance                                       Provides more control to No special performance
                monitoring of                                     user through more         monitor tool but
                ETL process                                       attributes, for better    developers can closely
                                                                  monitoring                observe the running jobs
                                                                                            in the Monitor Window to
                                                                                            provide run-time
                                                                                            feedback on user-
                                                                                            selected intervals. The
                                                                                            powerful process viewer
                                                                                            estimates rows-per-
                                                                                            second and allows
                                                                                            developers to pinpoint
                                                                                            possible bottlenecks
                                                                                            and/or points of failure.
                Performance                                        A strong point of Acta, as  Can provide very high
                Options                                            it offers more parameters   performance. Performance
                                                                   for performance             can be enhanced using
                                                                   improvement.                in-memory hash tables and
                                                                                               by reducing I/O operations
                                                                                               with the built-in sorting
                                                                                               and aggregation
                                                                                               capabilities. DataStage
                                                                                               can bypass ODBC and "talk"
                                                                                               natively to the source
                                                                                               and target structures
                                                                                               using direct calls, thereby
                                                                                               increasing performance.
                Specifying the   It is possible to load a large    Possible to specify the     Does not support atomic
                atomicity of the set of records to the target      atomicity of the updates.   updates.
                updates          database.
                Security –       Has good security features,       Provides good security      Provides security
                Encryption       managed through Repository        through the repository      features using Data
                                 Manager. No encryption facility.  manager. Does not provide   Administrator.
                                                                   an encryption facility.




                 Security and Not Available                           No option to provide      Not Available
                 Access Control                                       LDAP interface
                 using LDAP


Adaptability    Impact analysis  It is possible to determine the   Provides impact analysis    Good impact analysis
                capability       impact of a change that needs to  capability.                 capabilities provided by
                                 be made.                                                      the MetaStage Hub
                                                                                               across the integrated
                                                                                               environment. It shows
                                                                                               all relationships
                                                                                               associated with an
                                                                                               object.
                SCD (slowly      Requires programmatic design to   Can be handled using        Requires programmatic
                changing         update the SCD.                   filter and lookup           design to update the
                dimension)                                         transforms.                 SCD.




                Version/         Supports versioning and           Provides a good interface   Provides version control
                configuration    configuration management.         to control versions.        through a distributed
                management                                                                     repository. (The repository
                                                                                               can exist on either the
                                                                                               source or the target.)


Support for     Ability to       Supports flat files, Oracle, SQL  Only Oracle 8.x, Informix,  Supports heterogeneous
growth          handle various   Server, DB2 and other             SQL Server and DB2. Also    sources such as Oracle,
                source types,    ODBC-compliant RDBMS.             provides SAP R/3            Informix, SQL Server, DB2,
                from flat files                                    connectivity without any    flat files and XML, and ERP
                to major RDBMS                                     plug-ins.                   sources such as Oracle Apps,
                                                                                               SAP R/3, PeopleSoft etc.




                Incremental      This needs to be handled          Yes                         Supports incremental
                upload           manually in mappings.                                         load. Changed Data
                                                                                               Capture captures
                                                                                               changes to the
                                                                                               operational data and
                                                                                               produces Delta Store
                                                                                               files. DataStage XE uses
                                                                                               these files to update the
                                                                                               data warehouse. From a
                                                                                               workflow perspective, the
                                                                                               warehouse developer
                                                                                               defines a Delta Data
                                                                                               Store file as an input
                                                                                               table within one of the
                                                                                               DataStage XE products
                                                                                               on a Windows 95/NT
                                                                                               platform.
                Support for      An external procedure can be      Yes                         DataStage supports a
                external loader  called in the mapping using an                                wide variety of such bulk
                                 external transformation.                                      load utilities, either by
                                                                                               directly calling a vendor's
                                                                                               bulk load API or by
                                                                                               generating the control and
                                                                                               matching data file for
                                                                                               batch input processing.
                                                                                               DataStage developers simply
                                                                                               connect a Bulk Load
                                                                                               Stage icon to their jobs
                                                                                               and then fill in the
                                                                                               performance settings
                                                                                               that are appropriate for
                                                                                               their particular
                                                                                               environment.




                Intermediate     Generates a temp file only when   Does not generate an        Does not require
                file generation  sorting or loading.               intermediate file during    intermediate files or
                during loading                                     loading.                    secondary storage
                                                                                               locations to perform
                                                                                               aggregation or
                                                                                               intermediate sorting
                                                                                               during the loading process.
                Event based      Does not support a "true"         Yes, it is possible.        Supports event-based
                loading          workflow mechanism. This can be                               loading.
                                 done using external schedulers
                                 or workflow tools such as
                                 AppWorks, NT scheduling or
                                 mainframe OPC scheduling tools.
                Support for      Supports Oracle, Informix, SQL    Only Oracle 8.x, Informix,  Sybase Adaptive Server,
                wide range of    Server, DB2 etc.                  SQL Server and DB2.         Sybase Adaptive Server IQ,
                databases for                                                                  Microsoft SQL Server 7 via
                storing (target)                                                               OLE/DB, Microsoft SQL
                information                                                                    Server 6.5 via BCP,
                                                                                               Informix Redbrick,
                                                                                               Teradata, UDB. Bulk
                                                                                               loaders: Oracle,
                                                                                               Informix ADO/XPO High
                                                                                               Performance. Ascential
                                                                                               databases: UniVerse,
                                                                                               UniData. Also XML, e-mail
                                                                                               systems and web logs,
                                                                                               ERP data and MQSeries
                                                                                               messages.
                Support for     Supports multi user             Supports multi user     Supports multi user client
                multi-user      development environment.        development             server development
                development                                     environment             environment
                environment




Advanced Data   Re-usability     Supports re-use of code by        Provides various reusable   Code reusability is
Transformation                   making transformations            objects such as jobs,       supported. Ascential's
                                 reusable.                         workflows, dataflows etc.   Quality Manager
                                                                                               provides a framework for
                                                                                               developing a
                                                                                               self-contained and reusable
                                                                                               Project which consists of
                                                                                               business rules, analysis
                                                                                               results, measurements,
                                                                                               history and reports about
                                                                                               a particular source or
                                                                                               target environment.
                Support for      Supports built-in transformations Supports built-in           Pre-built functions and
                built-in         such as aggregator, filter etc.   functions.                  routines are available.
                functions




                Handling         Does not handle duplicate rows;   Possible to handle          Does not handle
                duplicate        these must be handled             duplicate records.          duplicate rows; these must
                records          programmatically.                                             be handled
                                                                                               programmatically.




                Lookup cache Supports caching of lookup       Possible to define lookup Supports Lookup cache
                              tables.                         cache through lookup
                                                              transformations



Consistency and Global Meta      Global meta data can be handled   Supports global meta        MetaBrokers enable the
re-use          data             using the PowerCenter and         data.                       sharing of meta data
                                 PowerMart model.                                              among all of the tools in
                                                                                               the warehouse
                                                                                               environment. With
                                                                                               MetaBrokers, tools can
                                                                                               share meta data without
                                                                                               having to change their
                                                                                               internal meta schema to
                                                                                               conform to a common
                                                                                               model.




Compatibility   Compatibility    PowerCenter currently supports    Supports the EAI tool       Only IBM MQ Series is
with third      of ETL tools     the following EAI vendors as      TIBCO as an input.          supported.
party tools     with EAI tools   source/target for the data: IBM
                                 MQ Series, TIBCO, Vitria and
                                 webMethods.




Licensing &     Server           Licensing for the Basic Version   Provides evaluation and     Information not available
Pricing         Licensing        includes the following:           permanent licenses, which
                                 · No ability to add on            support a multi-user
                                   PowerMarts                      environment and SAP R/3
                                 · No Global Repository            connectivity.
                                 · No centralized monitoring
                                 · 1 Server Engine*
                                 · 2 Relational Database Source
                                   Types
                                 · 2 Target Instances
                                 · Unlimited Flat File Sourcing
                                 · Unlimited Developers
                                 · Single CPU
                                 Unix version cost: US$ 140 K
                                 Windows NT/2000 version cost:
                                 US$ 95 K
                Client           There is no separate licensing    No separate license is      Information not available
                Licensing        for the client. It comes along    required for the client.
                                 with the server.
                ODC Licensing    No transfers are allowed from                                 Information not available
                                 the client-owned software to
                                 Wipro. A separate license has to
                                 be procured. A lab license,
                                 at half the cost of the
                                 production license, may
                                 suffice.




Vendor          2 consecutive    Informatica was recently named    Acta continues to see
Information     years of         the 11th fastest-growing          strong growth in data
                profitability    technology company in Silicon     integration, with
                                 Valley by Deloitte & Touche.      second-quarter revenue
                                 The ranking resulted from the     up 110%.
                                 company's 10,491 percent
                                 revenue growth between 1995
                                 and 1999.
                Significant      PowerCenter works with most                                   SAP is a reseller of
                third party      software, database and hardware                               Ascential's DataStage
                partner support  vendors. Built mostly on open                                 and DataStage Load
                                 systems. Products such as                                     PACK for SAP BW, with
                                 PowerConnect for DB2 have been                                the sole target being
                                 brought out by Informatica and                                SAP BW.
                                 are supported.
                Global           Has a global presence and                                     Ascential Software
                presence and     support on most continents.                                   Corporation is the
                support                                                                        leading provider of
                                                                                               Information Asset
                                                                                               Management solutions to
                                                                                               the Global 2000.
                Number of        Around 1300 as of Oct 2001.       More than 200 customers     More than 1800 as of
                customers                                          as of Oct 2001.             Aug 2001.




                Company          All information regarding the                                 Revenue for Ascential
                financial info   health of the company is                                      Software's DataStage®,
                readily          reported on its website.                                      Media360™ and related
                available                                                                      product and service
                                                                                               offerings was $27.0
                                                                                               million in the third
                                                                                               quarter, an increase of
                                                                                               14% from $23.6 million in
                                                                                               the third quarter of 2000.
                                                                                               Revenue for these
                                                                                               offerings for the nine
                                                                                               months ended
                                                                                               September 30, 2001 was
                                                                                               $93.9 million, an
                                                                                               increase of 47% over the
                                                                                               $63.8 million in the first
                                                                                               nine months of 2000.
                Company focus    Informatica came to the BI market Acta is well positioned to  Adds significant meta
                on ETL segment   with the ETL product and has      drive the "data             data management
                for the future   established itself as a major     integration market" and is  services to the entire
                                 player in the market. This        emerging as a major         data warehouse, including
                                 product will continue to be the   player.                     ETL. Intends to offer the
                                 flagship product despite the                                  capability for
                                 change in its positioning in the                              heterogeneous cross-tool
                                 BI market.                                                    analysis and query
                                                                                               capabilities. Exploitation
                                                                                               of XML integration to
                                                                                               enhance e-business
                                                                                               communication. Delivers
                                                                                               key MetaBroker
                                                                                               development capabilities
                                                                                               for its customers and
                                                                                               partners.
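
The Lookup row of the comparison above names three lookup caching options: pre-cached, cache-on-demand and no-cache. As a rough, tool-independent illustration of the trade-off between them (not any vendor's actual implementation), the Python sketch below counts how often a reference source is read under each strategy. The Lookup class, the mode names, the count_reads helper and the in-memory REFERENCE table are all hypothetical and exist only for this example.

```python
from typing import Callable, Dict, Iterable, Optional, Tuple

MODES = ("pre_cached", "cache_on_demand", "no_cache")


class Lookup:
    """Hypothetical lookup helper; not part of any ETL product's API."""

    def __init__(
        self,
        fetch_one: Callable[[int], Optional[str]],
        fetch_all: Callable[[], Iterable[Tuple[int, str]]],
        mode: str,
    ) -> None:
        if mode not in MODES:
            raise ValueError(f"unknown mode: {mode}")
        self.mode = mode
        self._fetch_one = fetch_one
        self._cache: Dict[int, Optional[str]] = {}
        self.source_reads = 0  # number of round trips to the reference source
        if mode == "pre_cached":
            # Load the whole reference table once, before any row is processed.
            self._cache = dict(fetch_all())
            self.source_reads += 1

    def get(self, key: int) -> Optional[str]:
        if self.mode == "pre_cached":
            return self._cache.get(key)
        if self.mode == "cache_on_demand":
            # Read from the source only the first time a key is seen.
            if key not in self._cache:
                self._cache[key] = self._fetch_one(key)
                self.source_reads += 1
            return self._cache[key]
        # no_cache: every lookup goes to the source.
        self.source_reads += 1
        return self._fetch_one(key)


# Stand-in for a reference table in a source database (assumed data).
REFERENCE = {1: "EUR", 2: "USD", 3: "INR"}


def count_reads(mode: str, keys: Iterable[int]) -> int:
    """Run the lookups for `keys` and return how often the source was read."""
    lookup = Lookup(REFERENCE.get, REFERENCE.items, mode)
    for key in keys:
        lookup.get(key)
    return lookup.source_reads


if __name__ == "__main__":
    incoming_keys = [1, 2, 1, 1, 2]
    for mode in MODES:
        print(mode, count_reads(mode, incoming_keys))
```

For five incoming rows over two distinct keys, the sketch performs one source read with pre_cached, two with cache_on_demand and five with no_cache. As a general rule of thumb, pre-caching tends to suit small reference tables, cache-on-demand suits large tables of which only a few keys are touched, and no-cache suits reference data that may change during the load.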



