Deploying an Operational
Business Intelligence Architecture
The WebFOCUS 7 Data Integration Framework
A White Paper
Table of Contents
1
Introduction
2 2 3 4 4 6
Finding the Sources of Business Intelligence
Recognizing Access and Integration Issues Establishing a BI Integration Framework The Framework’s Data Integration Layer The Data Warehousing Layer An Integrated Portfolio of Integration Technologies
7
WebFOCUS – a Data Integration Framework Optimized for Business Intelligence
A Scalable Platform for Ongoing Flexibility A Metadata Repository That Simplifies Schema Development Powerful Native Query Processing SQL Translation Allows Access Optimization Federation Keeps Data Location Transparent to Users Data Usage Analysis Enables Performance Tuning Preemptive Query Governing Assures Compliant Operations Integrated Data Migration and Synchronization Enables Data Sharing Non-DBMS, Business Format Transformation
8 9 10 11 11 11 13 14 15 17
Service Adapters Encapsulate Processes as Data Sources
19
Choose the Right Framework for Operational Business Intelligence
Introduction
Operational business intelligence investments are only as good as the data they deliver, and unfortunately, many enterprises believe that business intelligence (BI) begins with data in a database or warehouse and ends with a management report. However, scalable, effective BI systems optimize data access and processes at the source – the vital information systems that run the business. When properly constructed, operational business intelligence systems provide critical input and feedback to everyone in the enterprise and beyond – not just to managers and business analysts. A successful operational system is built on a business intelligence data integration framework that enables an organization to create a repeatable, consistent approach for acquiring, transforming, integrating, and delivering data to applications and end users who need it. Information Builders’ WebFOCUS provides a powerful integration framework combined with advanced features to place highly effective business intelligence within reach of most organizations. By maximizing data at all of its sources, WebFOCUS helps organizations improve the effectiveness of their existing BI tools as well as enhance the quality of the information they use for driving business operations.
1
Information Builders
Finding the Sources of Business Intelligence
In most people’s minds, business intelligence begins with a data warehouse or datamart and ends with an online analytical processing (OLAP) tool or report. However, the data itself doesn’t originate in a repository and most of an organization’s data never ends up in one. Data actually originates in operational transaction systems, packaged and legacy applications, individual’s spreadsheets, and from sources external to the organization. These origins are invisible to the traditional BI user or implementer, who have no sense of the time, cost, and complexity involved in gathering and preparing disparate data to make it available and useful. There is a better way to understand business intelligence. When examined as an end-to-end process, traditionally BI has relied upon an elaborate infrastructure comprised of these transaction systems, applications, multiple sources, electronic data exchanges, and many disparate, often labor-intensive processes. These processes must access, gather, reconcile, and prepare data to deliver an accurate view of the business to BI users. For example, data for the sales analysis application of a large automobile manufacturer originates in Excel spreadsheets, forwarded from all over the globe. Each spreadsheet’s contents and structure differs depending on the country of origin. All of these diverse spreadsheets must be reformatted, reconciled, and loaded, by hand, into a data store so that the data is available for monthly sales analysis processes.
Recognizing Access and Integration Issues
Today’s business intelligence needs often demand that source data be available in real time, as well as for historical analysis. They also demand that the resulting information be consistent and accurate. Therefore, accessing data for use in BI actually begins where source data is created or captured – in operational transaction systems, packaged applications, and legacy applications, as well as electronic business processes through which businesses exchange data. Data access is critically important because the more robust the data-access capabilities of a BI system, the more easily, cost-effectively, and consistently information may be produced and delivered to all of the information consumers who need it. Yet as Gartner points out, data integration issues consume a significant majority of the effort expended in a BI project. “Designing a repeatable process by which data is acquired from
2
Deploying an Operational Business Intelligence Architecture
operational systems, transformed, integrated, and delivered to the data warehouse is technically challenging.”1 All of these data origins store, represent, and exchange data differently, typically in mediums and formats that are incompatible with easy access, representation, or manipulation by business intelligence consumers.
Establishing a BI Integration Framework
The ability of BI tools to exploit individual source interfaces is critical to accessing quality data from its original sources. Sometimes applications need variable-length virtual storage access method (VSAM) records or hierarchical data in IMS databases or extensible markup language (XML) files. Sometimes integration requires access to systems through messaging interfaces like WebSphere MQ. Sometimes data isn’t directly accessible at all, and an application must access stored procedures, API calls, or Web services instead. Rather than seeking a single silver bullet to resolve a wide range of integration challenges, organizations are better served by establishing a unified BI data integration framework. The framework’s purpose is to enable BI tools to interconnect, prepare, and aggregate data into a unified and meaningful representation for information consumers. In addition to efficiently populating data marts and warehouses, an integration framework can establish (and maintain) real-time links to production data sources, enabling IT teams to avoid the costly process of extracting, modeling, and loading data into separate reporting databases. A data integration framework can also contribute to improved data quality since it is easier to match data against third-party name, address, and location databases; merge information from disparate sources into the same information structure, and eliminate duplicate, null, and outlier values. By employing a unified metadata layer, the BI integration framework makes all required data available, accessible, and meaningful at all levels of an organization – no matter where the data exists, how it’s stored, or the latency involved in its access. Only a robust framework can overcome data availability and accessibility gaps for a complete, consistent, and dependable view of the business.
1 Friedman, Ted and Strange, Kevin H., “Architecture: The Foundation of Business Intelligence,” Gartner Research,
AV-226453, April 14, 2004.
3
Information Builders
The Framework’s Data Integration Layer
A robust data integration framework can be represented as a layered portfolio of integration software technologies and tools that reside underneath visible BI tools. Based on the overall requirements of business intelligence, the framework begins by accessing operational databases, files, operational application systems, document repositories, electronic data interchange (EDI) and other business-to-business data origins and wrapping them in unified metadata. The framework can integrate data from these sources in three ways:
I I
By direct access to the operational data stores or files servicing operational systems Using the operational systems’ own application functions to access data on behalf of the BI information consumer; this is particularly important for information consistency, for example, if the same data is being accessed for both analytical and operational purposes Using a relational staging database that refreshes and populates warehouse data as frequently as users’ needs dictate, either for real-time use or historical analysis
I
While all of these methods deliver high-quality data and metadata that supports users’ needs, a data integration framework that offers selective direct access also delivers higher efficiency and performance. Even when the framework must rely on application functions or staged data integration techniques to access data, by having these processes reside within a single infrastructure, the framework can utilize common access paths, metadata definitions, configurations, tuning, and debugging tools. A common, reusable infrastructure simplifies management and improves performance over having a collection of disparate, separate-access tools. The data integration layer will also generate a comprehensive metadata representation that can be captured and used in other framework layers to ensure proper data delivery and support system administrators and BI users. Building an efficient data integration process is a key component to quickly delivering powerful, cost-effective business intelligence solutions.
The Data Warehousing Layer
The framework’s data warehouse layer consists of analytical services that increase business users’ visibility into their areas of responsibility. Information warehouses are usually developed in phases, over time, and are designed to meet at least two basic requirements.
4
Deploying an Operational Business Intelligence Architecture
I
Enterprise data warehouses must define business variables that are true across the whole enterprise. For example, all financial information must be based on the organization’s financial year. Or all data derived from business-to-business exchanges or external sources must adhere to the organization’s overall information security policies. When this information is available, it will often identify gaps or reveal relevant facts that otherwise would have remained obscured by overlying data. They must include, within their structure, business variables and metrics that are required for answering any analysis questions posed by the different business groups they serve. For example, organizations often develop a sequence of data marts containing information related to a specific subject area – such as competitive information or pricing – or a functional discipline, such as marketing or finance.
I
A BI data integration framework can further improve BI performance in several ways. The framework can aid data warehouse tuning by enabling administrators to see who is using what data and how frequently it is accessed, which in turn facilitates update prioritization, elimination of dormant data, and real-time initiatives. A framework can also automate warehouse administration with features that include operator alerts for exceptions, automated exception handling based on predefined business rules, DBA alerts for key service-outages, and a production schedule for running appropriate tasks, such as scheduling warehouse updates and refreshes.
5
Information Builders
An Integrated Portfolio of Integration Technologies
Data access and integration capabilities are crucial to operational business intelligence architecture. Organizations planning to deploy a BI system, therefore, should take a close look at a potential system’s strengths in access and integration technologies. Information Builders’ WebFOCUS solution is a highly scalable enterprise business intelligence platform whose integration framework allows business users to effortlessly access, query, analyze, share, and disseminate information – on demand, where and when needed.
Information Resources Messages/ Objects Legacy DBMS/Files Documents/ Formats Processes/ Services
RDBMS
Applications
Adapter Manager
Adapter Framework Resource Governor Resource Analyzer Metadata Repository SQL Translator Data Manipulation Engine/Optimizer DataMigrator Communications Subsystem Data Server
Interfaces
OLE DB
JDBC
ODBC
API/SQL
WebFOCUS data integration architecture.
6
Deploying an Operational Business Intelligence Architecture
WebFOCUS – a Data Integration Framework Optimized for Business Intelligence
The WebFOCUS data integration framework comprises an entire portfolio of integration technologies, for enabling data access and manipulation in a uniform way, regardless of its form, format, or location. The framework provides comprehensive, native access to more than 280 sources – including relational and legacy data, data in enterprise applications, data staged in warehouses or marts, and real-time data from operational systems – on any platform.
WebFOCUS 7’s architecture, integration, and simplicity permeate every level of the organization and make any data available, accessible, and meaningful to everyone who needs it, when and how they need it.
To support business and technical users, WebFOCUS employs different access techniques and tools to produce high-quality data, a reusable infrastructure, and comprehensive metadata. Access tools include packaged adapters; extract, transform, and load (ETL) technologies; realtime transformation; enterprise information integration (EII); and Web services. Access
7
Information Builders
techniques, which vary according to need, can include using ETL to extract, transform, and load data directly into a data warehouse or mart; directly accessing operational data stores or the files that service an operational system’s own application functions to access data; and tricklefeeding a data warehouse to populate and refresh it, using real-time transformation.
A Scalable Platform for Ongoing Flexibility
The WebFOCUS data architecture is based upon the WebFOCUS data server, an SQL-based, multi-threaded, highly scalable data manipulation engine that serves as the configuration and run-time environment for WebFOCUS adapters. The WebFOCUS framework enables high scalability, optimizing data access and manipulation at the individual data adapter level, as well as across data resources, individual servers, or server clusters. Using universal adapters, the WebFOCUS framework can access data at different levels of latency and cost-effectively integrate it – without custom coding. Adapters are not limited to staged data platforms; instead, they provide direct access to data in packaged applications, legacy systems, EAI environments, and other technologies. They also allow dynamic drillthrough to detail data. Intelligent adapters will handle a wide variety of applications, formats, databases, and protocols to help create a highly flexible architecture that is fully configurable for both synchronous and asynchronous messages. This flexibility enables data architects to design a BI solution architecture that easily meets the unique information needs of a range of users while preserving the flexibility to adapt as these needs change. With built-in scalability, even point projects based on standards like J2EE, .NET, XML, SOAP, and EDI can be aggregated into a service-oriented architecture to accommodate needs ranging from pulling bulk data and providing full ETL capabilities to incorporating data as part of a look-up. The WebFOCUS framework also incorporates complete ETL functionality, allowing an organization to scale its capabilities while protecting production applications and preserving resources. Automated functionality significantly simplifies data warehouse creation, maintenance, and expansion, and also streamlines application loading. Support for Web services enables the framework to dynamically integrate live data as an integral part of a business process. Further scalability may be achieved through the framework’s ability to leverage server clusters and distribute workloads across many dissimilar machines.
8
Deploying an Operational Business Intelligence Architecture
A Metadata Repository That Simplifies Schema Development
The WebFOCUS data server provides a fully integrated Windows-based or Web-based metadata console for deploying and configuring adapters and defining schema information for all information resources supported by WebFOCUS adapters. Schema information is created using automated tools that introspect and generate metadata information from the underlying databases, files, copybooks, documents, or Web services. Once created, metadata information may be made generally available to all users or associated with specific application views. WebFOCUS metadata schema information may be easily cleansed, enriched, or enhanced using integrated graphic tools contained within its data management console. The ability to create user-defined and computed virtual fields within metadata schemas eliminates the need for additional external data cleansing tools or having to write complex SQL logic within applications or queries.
WebFOCUS’ metadata configuration console automatically generates schema information that may be enriched for data quality purposes.
9
Information Builders
Powerful Native Query Processing
The data manipulation engine of WebFOCUS is fully SQL-compliant and supports read/write operations against all relational data sources, VSAMs, IMS database management systems (IMS/DBs), and Adabas data sources, as well as read operations against any data source exposed through WebFOCUS adapters. At runtime, the WebFOCUS data manipulation engine generates the SQL needed for retrieving specific data. A powerful translation engine automatically translates SQL into the native data manipulation language necessary to access legacy database systems or files. An internal optimizer evaluates the best method of executing SQL operations, based on important data characteristics such as cardinality, key fields, and other aspects. As a result, the WebFOCUS data manipulation engine supports both homogeneous and heterogeneous join operations that can be performed on the fly.
A Web console enables optimization characteristics to be defined for efficient query processing.
10
Deploying an Operational Business Intelligence Architecture
SQL Translation Allows Access Optimization
At the heart of WebFOCUS’ ability to access heterogeneous data sources is its SQL translation engine, an ANSI SQL ’92-compliant data manipulation engine. The translation engine converts SQL received from applications and queries into an internal, canonical form that is supported by all WebFOCUS adapters. This form then translates SQL into specific SQL dialects supported by proprietary RDBMSs. The canonical form can also translate SQL into the native data manipulation language of legacy databases and files. Automatic pass-through (APT) or non-APT optimization decisions are made on the fly by the WebFOCUS data server at runtime. With multiple modes of operation, SQL translator enables data architects to optimize data access strategies depending on the specific source data. APT operations eliminate translation overhead, enabling SQL to be passed unaltered to an underlying RDBMS that controls all aspects of SQL operations performed against it.
Federation Keeps Data Location Transparent to Users
WebFOCUS data servers and adapters reside locally with the information resources they support. Once schema information is defined to WebFOCUS, it can be easily aggregated in one or more location for data-location transparency. An application or end-user query accesses local schema information and sees all available data as local relational tables. At runtime, data requests are disaggregated and recast into the specific SQL or native requests necessary to access and retrieve data from the appropriate data sources. Answer sets are merged as necessary and then returned to the requesting application or user. Retrieved data may also be cached for reuse.
Data Usage Analysis Enables Performance Tuning
In many cases, database and file design is not optimal for query usage. And frequently, actual data usage does not align with intended usage. WebFOCUS Resource Analyzer can be used to monitor how local or distributed data is stored, accessed, and used, giving administrators visibility into usage patterns and enabling them to better tune their data environment for
11
Information Builders
improving performance. Resource Analyzer monitors more than 200 independent processing statistics, such as I/Os, rows retrieved, CPU time, elapsed time, and others, allowing data retrieval to be tuned to meet user and application performance requirements.
Resource Analyzer provides information regarding how data is actually used.
12
Deploying an Operational Business Intelligence Architecture
Preemptive Query Governing Assures Compliant Operations
Optimization tools can enhance performance, but they do not prevent users from writing incorrect ad hoc queries or application programmers from writing SQL operations that derail database performance. WebFOCUS Resource Governor, a rules-based preemptive governor, uses the statistics gathered by Resource Analyzer to create rules for cost-basis governing. Units of cost and thresholds are defined using graphic tools and an English-like language. Resource Governor then uses patented pattern-matching techniques to learn the characteristics of queries being governed and generates governing rules based upon defined thresholds and costs. At runtime, when a specific query exceeds a threshold, its execution may be disallowed or deferred for batch execution at a later point in time.
Resource Governor’s governing-rules-creation graphical interface is easy to use.
13
Information Builders
Integrated Data Migration and Synchronization Enables Data Sharing
Unlike other data integration architectures, WebFOCUS incorporates a fully integrated ETL tool within its native architecture. Data Migrator utilizes the WebFOCUS data manipulation engine, data management console, and adapters to perform batch-oriented ETL operations between source and target data sources. In addition, WebFOCUS metadata information can be shared between applications and end users that access data, as well as shared by utilizing data warehousing or data synchronization staff implementing ETL scenarios to stage data for applications, data warehouses, or data marts.
Data Migrator enables the graphic design of data extract, transformation, and loading scenarios.
14
Deploying an Operational Business Intelligence Architecture
Non-DBMS, Business Format Transformation
Data in common e-business formats, such as EDI, SWIFT, and FIX must be transformed in SQL-accessible data before SQL-based tools and applications can use it. WebFOCUS’ adapter manager makes e-business data sources easily accessible for SQL-based tools and offers integrated graphic tools for data inspection and metadata creation. WebFOCUS’ adapter transformer enables business functions and source formats to be easily mapped as XML schemas. Application explorer is a graphic WebFOCUS tool that enables users to inspect WebFOCUS packaged application, transaction, and e-business format sources.
Application explorer is used to introspect applications and transaction systems.
15
Information Builders
Once functions and source formats are mapped as XML schemas, application explorer enables them to be published as Web services. Web services published using application explorer are exposed as relational schemas to any application, end user, or WebFOCUS ETL process.
Adapter transformer is used to graphically map and enrich messages for use as data.
16
Deploying an Operational Business Intelligence Architecture
Service Adapters Encapsulate Processes as Data Sources
In many cases, the data required by an end user or application is not stored as a unique entity within a DBMS or file, but is instead the result of aggregating the results of many operations performed in a specific sequence. For example, “customer” data, as needed by a customer service representative, might mean the total business relationship a customer has with the enterprise. To present such data, an application might contain many operations that access and aggregate data from multiple sources. This assumes, of course, that the application being written actually has access to all the necessary sources and the application implementer has the know-how to manipulate such sources to produce the necessary data. Creating and maintaining the code that generates this customer information might require accessing specific business functions with SAP and Siebel applications, combined with a lookup performed against a VSAM file, joined with specific data from an Oracle table. This would be difficult code for even an application programmer to create, and certainly beyond the skills of most end users. The WebFOCUS adapter manager enables complex business processes, such as the given example, to be graphically defined and exposed as a service adapter. Service adapters may then be reused by any application, tool, or user – saving programming time, eliminating introduced errors, and reducing the cost of new application deployments.
17
Information Builders
Once defined as a service adapter, the entire process can also be published as a Web service, then consumed and exposed as a relational data source through the WebFOCUS metadata repository.
Entire processes that gather and enrich data may be graphically defined, then published as service adapters – or Web services – for consumption by the WebFOCUS data server.
18
Deploying an Operational Business Intelligence Architecture
Choose the Right Framework for Operational Business Intelligence
Improving business performance relies on better business information and its consistent application. With a business intelligence data integration framework that encompasses business data from its source through reconciliation and preparation for use, organizations can significantly improve the quality of their information and processes. Information Builders’ WebFOCUS data integration framework underlies the BI systems of many of the world’s leading organizations, where improved business intelligence benefits the organizations’ profitability, decreased costs, improved productivity, and enhanced market agility. For more information about how Information Builders’ WebFOCUS and its data integration framework can benefit your organization, visit www.informationbuilders.com.
19
Information Builders
Sales and Consulting Offices
North America
United States I Atlanta, GA (770) 395-9913 * I Baltimore, MD Consulting: (703) 247-5565 I Boston, MA (781) 224-7660 * I Channels (800) 969-4636 I Charlotte, NC Consulting: (704) 494-2680 * I Chicago, IL (630) 971-6700 * I Cincinnati, OH (513) 891-2338 * I Cleveland, OH (216) 520-1333 * I Dallas, TX (972) 490-1300 * I Denver, CO (303) 770-4440 * I Detroit, MI (248) 743-3030 * I Federal Systems, DC (703) 276-9006 * I Hartford, CT (860) 249-7229 I Houston, TX (713) 952-4800 * I Los Angeles, CA (310) 615-0735 * I Metropolitan, NY Sales: (212) 736-7928 * Consulting: (212) 736-4433, ext. 4443 I Minneapolis, MN (651) 602-9100 * I New Jersey* (973) 593-0022 I Orlando, FL (407) 804-8000 * I Philadelphia, PA (610) 940-0790 * I Pittsburgh, PA (412) 494-9699 I St. Louis, MO (636) 519-1411 * I San Jose, CA (408) 453-7600 * I Seattle, WA (206) 624-9055 * I Washington, DC Sales: (703) 276-9006 * Consulting: (703) 247-5565 Canada Information Builders (Canada) Inc. I Calgary (403) 538-5415 I Ottawa (613) 233-0865 I Montreal* (514) 334-0404 I Toronto* (416) 364-2760 I Vancouver* (604) 688-2499 Mexico Information Builders Mexico I Mexico City 85-55-91-71-20-54
Australia
Information Builders Pty. Ltd. I Melbourne 61-3-9631-7900 I Sydney 61-2-8223-0600
I
I
Europe
I
I
I
I
I
I
I
I
Belgium Information Builders Belgium Brussels 32-2-7430240 France Information Builders France S.A. Paris 33-14-507-6600 Germany Information Builders (Deutschland) Dusseldorf 49-211-522877-0 Eschborn 49-6196-77576-0 Munich 49-89-35489-0 Stuttgart 49-711-7287288-0 Netherlands Information Builders (Netherlands) B.V. Amsterdam 31-20-4563333 Portugal Information Builders Portugal Lisbon 351-217-230-720 Spain Information Builders Iberica S.A. Barcelona 34-93-344-32-70 Bilbao 34-94-425-72-24 Madrid 34-91-710-22-75 Switzerland Information Builders Switzerland AG Wallisellen 41-1-8394949 United Kingdom Information Builders (UK) Ltd. London 44-845-658-8484
I
I
I
I
I
I
I
I
I
I
I
Representatives
I
Israel NESS A.T. Ltd. Tel Aviv 972-3-5483638 Italy Selesta G C Applications S.P.A. Genova 39-010-64201-224 Milan 39-02-2515181 Torino 39-011-5513-211 Japan K.K. Ashisuto Osaka 81-6-6373-7113 Tokyo 81-3-3437-0651 Korea Unitech Infocom Co. Ltd. Seoul 82-2-3477-4456 Norway InfoBuild Norway Oslo 47-23-10-02-80 Philippines Beacon Frontline Solutions, Inc. 63-2-750-1972 Singapore Automatic Identification Technology Ltd. 65-6286-2922 South Africa Fujitsu Services (Pty.) Ltd. Johannesburg 27-11-2335911 Sweden Cybernetics Business Solutions AB Solna 46-7539900 Taiwan Galaxy Software Services Taipei 886-22-3897722 Thailand Datapro Computer Systems Co. Ltd. Bangkok 662-684-8484 Turkey Istanbul Key Soft Ltd. 90-216-428-5933 Veripark 90-212-283-9123** Venezuela InfoServices Consulting Caracas 58-212-763-1653
I
I
I
I
I
I
Austria Raiffeisen Informatik Consulting GmbH Vienna 43-12-1136-3870 Brazil InfoBuild Brazil Ltda. São Paulo 55-11-3285-1050 China InfoBuild China, Inc. Shanghai 86-21-50805432 Finland InfoBuild Oy Espoo 358-9-7250-2250 Greece Applied Science Athens 30-210-699-8225 Guatemala IDS de Centroamerica Guatemala City 502-361-0506 Gulf States I Bahrain I Kuwait I Oman I Qatar I Saudi Arabia I Yemen I United Arab Emirates Al-Gosaibi Information Systems 973-274-090
Toll-Free Numbers
I
I
Sales and Information (800) 969-INFO ISV, VAR, and SI Partner Information (800) 969-4636
**Training facilities are located at these branches; additional locations are available. **Authorized to sell iWay Software only.
Corporate Headquarters Two Penn Plaza, New York, NY 10121-2898 (212) 736-4433 Fax (212) 967-6406 www.informationbuilders.com askinfo@ibi.com Canadian Headquarters 150 York St., Suite 1000, Toronto, ON M5H 3S5 (416) 364-2760 Fax (416) 364-6552
Copyright © 2005 by Information Builders, Inc. All rights reserved. [40] All products and product names mentioned in this publication are trademarks or registered trademarks of their respective companies.
DN7504885.0305
For International Inquiries
+1(212) 736-4433 Printed in the U.S.A. on recycled paper