Managing Multidimensional Data Marts with Visual Warehouse and DB2 OLAP Server
Thomas Groh,
Ann Valencic, Bhanumathi Dhanaraj, Hanspeter Furrer, Karl-Heinz Scheible
International Technical Support Organization
http://www.redbooks.ibm.com
SG24-5270-00
December 1998
Take Note!
Before using this information and the product it supports, be sure to read the general information in
Appendix A, “Special Notices” on page 343.
First Edition (December 1998)
This edition applies to Version 1.0.1 of IBM DB2 OLAP Server and Version 5.2 of IBM Visual
Warehouse.
Comments may be addressed to:
IBM Corporation, International Technical Support Organization
Dept. QXXE Building 80-E2
650 Harry Road
San Jose, California 95120-6099
When you send information to IBM, you grant IBM a non-exclusive right to use or distribute the
information in any way it believes appropriate without incurring any obligation to you.
© Copyright International Business Machines Corporation 1998. All rights reserved.
Note to U.S. Government Users – Documentation related to restricted rights – Use, duplication or disclosure is
subject to restrictions set forth in GSA ADP Schedule Contract with IBM Corp.
Contents
Figures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .ix
Tables. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xv
Preface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xvii
The Team That Wrote This Redbook . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xvii
Comments Welcome . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xix
Part 1. Building an OLAP Data Mart . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
Chapter 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.1 Driving Factors for Business Intelligence . . . . . . . . . . . . . . . . . . . . . . . 3
1.2 What Is OLAP and Why Is It So Successful? . . . . . . . . . . . . . . . . . . . . 5
Chapter 2. Planning a Business Intelligence Project . . . . . . . . . . . . . . 9
2.1 Who Is Needed for the Project? . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
2.1.1 Business Project Group . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
2.1.2 Development Project Group . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
2.2 The Development Process . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
2.3 Success Factors for a Business Intelligence Solution . . . . . . . . . . . . . 16
Chapter 3. Selecting the Appropriate Architecture . . . . . . . . . . . . . . . 19
3.1 Available Architectures for OLAP Data Marts . . . . . . . . . . . . . . . . . . . 19
3.2 Architecture and Concepts of DB2 OLAP Server and Essbase . . . . . . 22
3.3 The End-to-End Architecture of a Business Intelligence Solution . . . . 27
3.3.1 The Architecture Building Blocks . . . . . . . . . . . . . . . . . . . . . . . . 28
3.3.2 Additional Requirements for an End-to-End Architecture . . . . . . 35
3.4 Visual Warehouse . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
3.4.1 Data Sources Supported . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
3.4.2 Data Stores Supported . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
3.4.3 End User Query Tools . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
3.4.4 Metadata Management . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
3.5 The Architecture of Visual Warehouse . . . . . . . . . . . . . . . . . . . . . . . . 37
3.5.1 Visual Warehouse Server . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
3.5.2 Visual Warehouse Administrative Clients . . . . . . . . . . . . . . . . . . 38
3.5.3 Visual Warehouse Agents. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
3.5.4 Visual Warehouse Control Database . . . . . . . . . . . . . . . . . . . . . 38
3.5.5 Visual Warehouse Target Databases . . . . . . . . . . . . . . . . . . . . . 39
Chapter 4. Implementing a Multidimensional Model . . . . . . . . . . . . . . 41
4.1 Introduction to the TBC Sales Model . . . . . . . . . . . . . . . . . . . . . . . . . 41
4.2 Building the Database Outline Manually . . . . . . . . . . . . . . . . . . . . . . . 43
4.3 Building the Database Outline Dynamically . . . . . . . . . . . . . . . . . . . . 53
4.3.1 Building the Product Dimension (Using Level References) . . . . . 56
4.3.2 Building an Alternative Aggregation Path . . . . . . . . . . . . . . . . . . 65
4.3.3 Building the Market Dimension . . . . . . . . . . . . . . . . . . . . . . . . . . 67
4.3.4 Building the Time Dimension (Using Parent/Child References) . . 71
4.3.5 Copying Dimensions, Members, and Outlines. . . . . . . . . . . . . . . 74
4.4 Loading the Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
4.4.1 Loading Data from a Table . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76
4.4.2 Loading Data from a Flat File . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
4.5 Calculating the Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88
Chapter 5. Populating the Multidimensional Model . . . . . . . . . .. . . . . 91
5.1 Preparing Launch Tables Using Visual Warehouse . . . . . . . . .. . . . . 91
5.1.1 Initializing Visual Warehouse . . . . . . . . . . . . . . . . . . . . . .. . . . . 92
5.1.2 Defining the Data Sources . . . . . . . . . . . . . . . . . . . . . . . .. . . . . 93
5.1.3 Defining the TBC Target Data Warehouse . . . . . . . . . . . .. . . . 101
5.1.4 Defining the Business Views for the Launch Tables . . . . .. . . . 102
5.2 Visual Warehouse Hints and Tips . . . . . . . . . . . . . . . . . . . . . . .. . . . 110
5.3 Business Views Used for the TBC Sales Model . . . . . . . . . . . .. . . . 113
5.4 Automating the Process Using Visual Warehouse Programs . .. . . . 116
5.4.1 Introduction to Visual Warehouse Programs . . . . . . . . . . .. . . . 116
5.4.2 Understanding VWP Templates . . . . . . . . . . . . . . . . . . . .. . . . 117
5.4.3 Defining a Business View That Uses a VWP. . . . . . . . . . .. . . . 123
5.4.4 Developing Custom VWP Templates . . . . . . . . . . . . . . . .. . . . 133
Chapter 6. A Closer Look at Calculating the OLAP Database . .. . . . 135
6.1 Formulas . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . 135
6.2 Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . 135
6.3 Calculation Commands . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . 142
6.4 Building an Outline Formula . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . 144
6.5 Outline Calculations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . 146
6.6 Two-Pass Calculations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . 149
6.7 Intelligent Calculation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . 150
6.8 Dynamic Calculations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . 151
6.8.1 Dynamic Calculation Considerations. . . . . . . . . . . . . . . . .. . . . 152
6.8.2 Dynamic Calculation or Dynamic Calculation and Store . .. . . . 154
6.8.3 Effects of Dynamic Calculation . . . . . . . . . . . . . . . . . . . . .. . . . 154
6.9 Creating a Calculation Script . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . 156
6.9.1 Calculation Script Syntax . . . . . . . . . . . . . . . . . . . . . . . . .. . . . 156
6.9.2 Using the Calc Script Editor . . . . . . . . . . . . . . . . . . . . . . .. . . . 157
6.9.3 Grouping Formulas . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . 158
6.9.4 Substitution Variables in Calculation Scripts . . . . . . . . . . .. . . . 158
Chapter 7. Partitioning Multidimensional Databases . . . . . . . . . . . . . 161
7.1 Replicated Partitions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 163
7.1.1 Rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 164
7.1.2 Implementing a Replicated Partition for the TBC Sales Model . 164
7.1.3 Replicating the Partition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 179
7.2 Transparent Partitions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 181
7.2.1 Rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 181
7.2.2 Advantages and Disadvantages . . . . . . . . . . . . . . . . . . . . . . . . 181
7.2.3 Implementing Transparent Partitions for the TBC Sales Model . 182
7.3 Linked Partitions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 192
7.3.1 Rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 192
7.3.2 Implementing Linked Partitions for the TBC Sales Model . . . . . 192
Part 2. Managing an OLAP Data Mart . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 209
Chapter 8. Ongoing Maintenance and Updates of the Cube . . .. . . . 211
8.1 Cleaning Up before Loading the Model. . . . . . . . . . . . . . . . . . .. . . . 211
8.2 Changing the Outline with Dynamic Dimension Build . . . . . . . .. . . . 214
8.3 Considerations for Dynamic Dimension Build . . . . . . . . . . . . . .. . . . 214
8.4 Backup of the Cube . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . 216
8.5 Calculation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . 217
8.6 Partitions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . 218
Chapter 9. Performance . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . .. . . . 219
9.1 Tuning Process . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . .. . . . 220
9.2 Block Sizes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . .. . . . 220
9.3 Review of Block Sizes for the TBC Sales Model . . . . . .. . . . . .. . . . 221
9.4 Review of the Number of Stored Dimension Members .. . . . . .. . . . 226
9.5 DB2 OLAP Server Parameters . . . . . . . . . . . . . . . . . . .. . . . . .. . . . 230
9.5.1 Memory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . .. . . . 230
9.5.2 DB2-Specific Parameters . . . . . . . . . . . . . . . . . . .. . . . . .. . . . 231
9.5.3 Cache Size Tuning . . . . . . . . . . . . . . . . . . . . . . . .. . . . . .. . . . 232
9.6 The Relational Anchor Dimension and Performance . . .. . . . . .. . . . 235
9.7 Tuning the Data Load . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . .. . . . 236
9.8 Tuning the Calculation . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . .. . . . 238
9.8.1 Reviewing the Status of a Running Calculation. . .. . . . . .. . . . 239
9.8.2 Defining Members as Dynamic Calc and Store . . .. . . . . .. . . . 240
9.8.3 Time and Accounts Dimension Tags. . . . . . . . . . .. . . . . .. . . . 241
9.8.4 Formulas . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . .. . . . 241
9.8.5 Time As a Sparse Dimension . . . . . . . . . . . . . . . .. . . . . .. . . . 241
9.8.6 Large Database Outlines . . . . . . . . . . . . . . . . . . .. . . . . .. . . . 241
9.8.7 Intelligent Calculation . . . . . . . . . . . . . . . . . . . . . .. . . . . .. . . . 242
9.8.8 Cross-Dimensional Operators. . . . . . . . . . . . . . . .. . . . . .. . . . 242
9.8.9 Running REORG and RUNSTATS . . . . . . . . . . . . . . . . . . . . . . 242
9.9 Block and Cell Calculation Order . . . . . . . . . . . . . . . . . . . . . . . . . . . 242
9.10 Review of the Database Information for a Cube . . . . . . . . . . . . . . . 246
9.11 Using Partitions to Improve Performance . . . . . . . . . . . . . . . . . . . . 248
Chapter 10. Problem Determination . . . . .. . . . . .. . . . .. . . . . .. . . . 251
10.1 Windows NT . . . . . . . . . . . . . . . . . . . . .. . . . . .. . . . .. . . . . .. . . . 251
10.2 DB2 Universal Database . . . . . . . . . . .. . . . . .. . . . .. . . . . .. . . . 252
10.3 Visual Warehouse . . . . . . . . . . . . . . . .. . . . . .. . . . .. . . . . .. . . . 256
10.4 DB2 OLAP Server . . . . . . . . . . . . . . . .. . . . . .. . . . .. . . . . .. . . . 259
10.5 Other Components . . . . . . . . . . . . . . . .. . . . . .. . . . .. . . . . .. . . . 260
Chapter 11. Security. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 261
11.1 Security Layers of the OLAP Data Mart . . . . . . . . . . . . . . . . . . . . . 262
11.2 Visual Warehouse Security. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 264
Part 3. Accessing an OLAP Data Mart . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 267
Chapter 12. OLAP Analysis Using the Spreadsheet Add-in. . . . . . . . 269
Chapter 13. User-Defined Attributes . . . . . . . . . . . . . . . .. . . . . .. . . . 283
13.1 Rules for User-Defined Attributes . . . . . . . . . . . . . . . .. . . . . .. . . . 283
13.2 Creating User-Defined Attributes . . . . . . . . . . . . . . . .. . . . . .. . . . 283
13.3 Using UDAs for Member Selection . . . . . . . . . . . . . . .. . . . . .. . . . 287
13.4 Using UDAs during Data Load to Flip the Sign . . . . . .. . . . . .. . . . 292
13.5 Using UDAs in Calculation Scripts . . . . . . . . . . . . . . .. . . . . .. . . . 293
Chapter 14. SQL Drill-Through . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 295
14.1 Installation Tips . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 295
14.2 A Brief Description of the Architecture . . . . . . . . . . . . . . . . . . . . . . 296
14.2.1 Server SQL Drill-Through . . . . . . . . . . . . . . . . . . . . . . . . . . . . 296
14.2.2 Client SQL Drill-Through. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 296
14.2.3 The Initialization File (SQLDRILL.INI) . . . . . . . . . . . . . . . . . . . 296
14.3 Enabling SQL Drill-Through for the TBC Sales Model. . . . . . . . . . . 297
14.4 Using a Hierarchy Table for SQL Drill-Through . . . . . . . . . . . . . . . . 311
Chapter 15. Using SQL to Access the DB2 OLAP Server Data Store 313
15.1 DB2 OLAP Server Storage . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 313
15.2 DB2 OLAP Server Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 313
15.3 Views Created by DB2 OLAP Server for SQL Access . . . . . . . . . . . 315
15.3.1 Querying the Cube Catalog . . . . . . . . . . . . . . . . . . . . . . . . . . 316
15.3.2 Querying the Cube View . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 318
15.3.3 Querying the Dimension Views . . . . . . . . . . . . . . . . . . . . . . . . 319
15.3.4 Querying Fact View and Star View . . . . . . . . . . . . . . . . . . . . . 323
15.3.5 Querying the UDA Views . . . . . . . . . . . . . . . . . . . . . . . .. . . . 329
15.3.6 Other Views . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . 330
15.4 Advanced SQL against the DB2 OLAP Server Star-Schema .. . . . 332
15.4.1 Traversing a Dimension Hierarchy . . . . . . . . . . . . . . . . .. . . . 332
15.4.2 Tracking Outline Changes . . . . . . . . . . . . . . . . . . . . . . .. . . . 333
15.4.3 Drill-Across from Aggregated to Detailed Data . . . . . . . .. . . . 335
Chapter 16. OLAP Analysis over the Web Using Wired for OLAP . . . 337
16.1 Setup . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 337
16.2 Concepts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 338
16.2.1 Views and View Groups . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 338
16.2.2 Template Views . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 340
16.2.3 Corporate Report Groups . . . . . . . . . . . . . . . . . . . . . . . . . . . . 340
16.2.4 User Access to View Groups and Corporate Report Groups . . 340
16.2.5 Printing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 341
Appendix A. Special Notices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 343
Appendix B. Related Publications . . . . . . . . . . . . . . . . . . . . . . . . . . . . 347
B.1 International Technical Support Organization Publications . . . . . . . . . 347
B.2 Redbooks on CD-ROMs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 347
B.3 Other Publications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 347
How to Get ITSO Redbooks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 349
IBM Redbook Fax Order Form . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 350
List of Abbreviations. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 351
Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 353
ITSO Redbook Evaluation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 357
Figures
1. Iterative Data Mart Development Approach. . . . . . . . . . . . . . . . . . . . . . . . 16
2. Architecture Building Blocks of an OLAP Solution. . . . . . . . . . . . . . . . . . . 19
3. Star-Schema with Primitive Fact Table . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
4. Star-Schema with Fact Table and Anchor Dimension . . . . . . . . . . . . . . . . 24
5. Hyperion Essbase and DB2 OLAP Server Architectures. . . . . . . . . . . . . . 26
6. Architecture Building Blocks of a Business Intelligence Solution. . . . . . . . 28
7. Initial Dimensional Model Representation of TBC Sales . . . . . . . . . . . . . . 42
8. Connect to the Server . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
9. Create a New Application . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
10. Create a New Application (continued) . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
11. Create a Database. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
12. Create a Database (continued) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
13. Create a Dimension . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
14. Create a New Member. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
15. Create Measures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
16. Define Alias and Store Data. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
17. Resulting Dimensions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
18. Default of Dense Dimensions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
19. Dense and Sparse - Possible Configurations . . . . . . . . . . . . . . . . . . . . . 52
20. Saving the Outline . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
21. Level References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
22. Generation References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
23. Parent/Child References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
24. The All Products Business View (Sample Contents) . . . . . . . . . . . . . . . . . 56
25. Essbase Application Manager . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
26. Define SQL Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
27. Result in Data Prep Editor . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
28. SQL Access Error Message. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
29. Data Prep Editor Icons. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
30. Dimension Building Attributes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
31. Dimension Build Settings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
32. Data Load . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
33. Product Outline after Loading Product Rules. . . . . . . . . . . . . . . . . . . . . . . 65
34. SQL Definition for Shared Diet Members. . . . . . . . . . . . . . . . . . . . . . . . . . 66
35. Rules File Editor for Diet Product Rules . . . . . . . . . . . . . . . . . . . . . . . . . . 67
36. Outline with Inserted Diet Products . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
37. Define SQL Window for the Market Dimension . . . . . . . . . . . . . . . . . . . . . 68
38. Dimension Definition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69
39. Data Prep Editor for the Market Dimension Rules . . . . . . . . . . . . . . . . . . . 70
40. Outline for the Market Dimension of the TBC Sales Model . . . . . . . . . . . . 71
41. File Structure to Build the Parent/Child Outline . . . . . . . . . . . . . . . . . . . . . 72
42. Dimension Build Setting for the Year Dimension . . . . . . . . . . . . . . . . . . . . 73
43. Data File Attributes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74
44. Define the SQL Input Table . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76
45. Connect to the Database . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77
46. Sample Input Data to Be Loaded. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78
47. Mapping of the Year Dimension. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
48. Mapping of the Measures Dimensions. . . . . . . . . . . . . . . . . . . . . . . . . . . . 80
49. Replacing Missing Data Values with #MI. . . . . . . . . . . . . . . . . . . . . . . . . . 81
50. Rejecting Records in Data Load Rules . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
51. Associating an Outline with the Load Rules. . . . . . . . . . . . . . . . . . . . . . . . 82
52. Save the Load Rules to the Server . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83
53. Clearing Data in the Cube . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83
54. Loading Data into the Cube . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84
55. SQL Data Load . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84
56. Defining the Input File As Comma Delimited . . . . . . . . . . . . . . . . . . . . . . . 85
57. Flat File Input Showing First Record as Mapping Names . . . . . . . . . . . . . 86
58. Define the Data Load Field Names Record . . . . . . . . . . . . . . . . . . . . . . . . 86
59. Comma Delimited Flat File Data Load Rules. . . . . . . . . . . . . . . . . . . . . . . 87
60. Data Load Using Flat Files. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88
61. Identifying the Calculation Script . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
62. Visual Warehouse Initialization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93
63. Logging on to the Visual Warehouse Desktop . . . . . . . . . . . . . . . . . . . . . . 94
64. Defining a Flat File Data Source . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95
65. Defining a New File under a Flat File Data Source . . . . . . . . . . . . . . . . . . 96
66. Sample Data from OUTOR.TXT . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97
67. Defining the File Type . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 98
68. Defining a Data Field . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99
69. Setting up ODBC to Read Text Files . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101
70. Defining a Target Warehouse Database . . . . . . . . . . . . . . . . . . . . . . . . . 102
71. Creating a Business View . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103
72. Modifying the Column Definition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105
73. Specifying Table Creation Options for a Business View . . . . . . . . . . . . . 106
74. Defining Joins between the Source Tables . . . . . . . . . . . . . . . . . . . . . . . 108
75. Modifying the Autogenerated SQL. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110
76. Viewing Dependencies among the Business Views . . . . . . . . . . . . . . . . 112
77. Visual Warehouse Programs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 117
78. Available Visual Warehouse Program Templates . . . . . . . . . . . . . . . . . . 118
79. Visual Warehouse Program Definition . . . . . . . . . . . . . . . . . . . . . . . . . . . 119
80. Agent Site for VWPs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 120
81. VWP Parameter Definition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 121
82. Visual Warehouse-Supplied VWPs - Parameter Popup Window . . . . . . . 122
83. VWP Usage . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 123
84. Creating a New Business View for a VWP . . . . . . . . . . . . . . . . . . . . . . . 124
85. Creating a New Business View for a VWP (continued) . . . . . . . . . . . . . . 124
86. Business View VWP Parameter Definition. . . . . . . . . . . . . . . . . . . . . . . . 125
87. VWP Information . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 127
88. Business View Program Definition. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 128
89. Scheduling a Business View . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 129
90. Visual Warehouse Cascade Program . . . . . . . . . . . . . . . . . . . . . . . . . . . 130
91. Business View Promotion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 131
92. Executing a Business View . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 131
93. Executing a Business View (continued). . . . . . . . . . . . . . . . . . . . . . . . . . 132
94. Outline for Scenario Dimension . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 144
95. Function and Macro Template . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 145
96. Verify a Formula. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 146
97. Essbase Data Block. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 147
98. Essbase Data Block Combination . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 148
99. Outline with Dynamic Calculation Defined . . . . . . . . . . . . . . . . . . . . . . . . 152
100. Outline Dynamic Calculation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 153
101. Starting the Calc Script Editor . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 157
102. Substitution Variable Window . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 159
103. Possible Configurations of Transparent and Replicated Partitions . . . . 163
104. Data Source Outline and Data Target Outline . . . . . . . . . . . . . . . . . . . . 165
105. Opening the Partition Manager from the Data Source Server . . . . . . . . 166
106. Creating a New Partition Definition . . . . . . . . . . . . . . . . . . . . . . . . . . . . 166
107. Defining the Partition Type . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 167
108. Defining the Usernames and Passwords to Be Used in Replication . . . 168
109. Mapping the Data Source Area to Data Target Area . . . . . . . . . . . . . . 169
110. Defining the Replicated Area for the Source . . . . . . . . . . . . . . . . . . . . . 170
111. Defining the Replicated Area for the Source (continued) . . . . . . . . . . . 171
112. Defining the Replicated Area for the Source (continued) . . . . . . . . . . . 172
113. Defining the Replicated Area for the Target . . . . . . . . . . . . . . . . . . . . . 173
114. Defining the Replicated Area for the Target (continued) . . . . . . . . . . . 174
115. Defining the Replicated Area for Target (continued) . . . . . . . . . . . . . . 175
116. Mapping the Source Member Name to the Target Member Name . . . . 176
117. Validating the Partition Definition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 177
118. Summary of the Partition Definition . . . . . . . . . . . . . . . . . . . . . . . . . . . . 177
119. Saving the Partition Definition to the Servers . . . . . . . . . . . . . . . . . . . . 178
120. Partition Manager Showing Existing Definitions for Database Simprepl . 178
121. Replicating the Partition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 179
122. Selecting the Update Option for Replication . . . . . . . . . . . . . . . . . . . . . 180
123. Replication after Target Database Has Been Updated . . . . . . . . . . . . 180
124. Copying the Outline from TBC Expanded to TBC_East . . . . . . . . . . . . 183
125. Outline for the TBC_East Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 184
126. Outline for the TBC_West Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 185
127.Defining the Load Rules for Loading Data into TBC_East . . . . . . . . . . . 186
128.Defining a New Transparent Partition . . . . . . . . . . . . . . . . . . . . . . . . . . . 187
129.Defining the Area Mapping for the Transparent Partition . . . . . . . . . . . . 188
130.Area Specific Member Mapping . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 188
131.Alternate Way of Area-Specific Member Mapping for Actual . . . . . . . . . 189
132.Validating the Transparent Partition Definition . . . . . . . . . . . . . . . . . . . . 190
133.Saving the Transparent Partition Definition . . . . . . . . . . . . . . . . . . . . . . . 190
134.Testing the Transparent Partition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 191
135.Source and Target for the Linked Partitions . . . . . . . . . . . . . . . . . . . . . . 193
136.Creating a New Partition Definition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 194
137.Defining the Partition Type . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 195
138.Defining the Source and Target Area Mapping . . . . . . . . . . . . . . . . . . . 196
139.Defining the Actual Member Mapping . . . . . . . . . . . . . . . . . . . . . . . . . . 197
140.Final Member Mapping . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 197
141.Validating the Partition Definition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 198
142.Closing and Saving the Partition Definition . . . . . . . . . . . . . . . . . . . . . . 198
143.Initiating the Link between Target and Source Cube in the Spreadsheet 199
144.Selecting the Partition Link to Be Performed . . . . . . . . . . . . . . . . . . . . . 200
145.DB2 OLAP Server Error Showing Full Links Are Not in Place . . . . . . . . 201
146.Spreadsheet for the Linked-to Cube . . . . . . . . . . . . . . . . . . . . . . . . . . . 202
147.Altering the Partition Definition Member Mapping . . . . . . . . . . . . . . . . . 203
148.Importing the Member Mapping from a Text File . . . . . . . . . . . . . . . . . . 204
149.Result of Import of the Member Mapping File . . . . . . . . . . . . . . . . . . . . 204
150.Link to Cube Working Correctly . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 205
151.Area Mapping for Link Partitions for Reverse Link . . . . . . . . . . . . . . . . . 206
152.Member Mapping for Link Partitions for Reverse Link . . . . . . . . . . . . . . 207
153.Clearing Member Combination Data Values before Loading . . . . . . . . . 212
154.Defining How to Add New Data to the Cube . . . . . . . . . . . . . . . . . . . . . . 213
155.Log Showing Changes in the Outline . . . . . . . . . . . . . . . . . . . . . . . . . . . 214
156.Original TBC Sales Outline . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 221
157.Reviewing the Database Information . . . . . . . . . . . . . . . . . . . . . . . . . . . 222
158.Reviewing the Database Information (continued) . . . . . . . . . . . . . . . . . 223
159.Event Log Showing Block Sizes for Original Outline . . . . . . . . . . . . . . . . 223
160.Final TBC Sales Outline . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 225
161.Event Log Showing Block Sizes for Final Outline . . . . . . . . . . . . . . . . . . 225
162.Expanded TBC Sales Outline . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 227
163.Reviewing the Stored Members Using the Command Line. . . . . . . . . . . 228
164.Reviewing the Stored Members Using the Command Line (continued) . 229
165.Showing the Database Statistics for the Expanded Cube . . . . . . . . . . . 229
166.Fact Table with Product As Relational Anchor Dimension . . . . . . . . . . . 235
167.Fact Table with Market As Relational Anchor Dimension . . . . . . . . . . . . 236
168.The Expanded TBC Sales Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 237
169.Final Expanded TBC Sales Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 237
170.Final SQL for Data Load . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 238
171.Calculation Order for an Outline . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 244
172.Reviewing Run-Time Information for the Expanded TBC Cube . . . . . . . 246
173.Reviewing Run-Time Information for Expanded TBC Cube (continued) 247
174.Reviewing Run-Time Information for Expanded TBC Cube (continued) 247
175.A Typical Business Intelligence Security Architecture. . . . . . . . . . . . . . . 261
176.Opening the DB2 OLAP Server Add-in . . . . . . . . . . . . . . . . . . . . . . . . . 270
177.Connecting to the DB2 OLAP Server . . . . . . . . . . . . . . . . . . . . . . . . . . . 271
178.Selecting the Required Application and Spreadsheet . . . . . . . . . . . . . . 271
179.Result of Accessing the Cube . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 272
180.Drilling Down into the Cube . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 273
181.Drilling Down into the Cube, Using the Alt Key . . . . . . . . . . . . . . . . . . . 274
182.Setting the DB2 OLAP Server Spreadsheet Add-in Options . . . . . . . . . 275
183.Locking the Displayed Data before Updating the Cube . . . . . . . . . . . . . 277
184.Sending the Updates to the Cube . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 278
185.Calculating the Cube . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 279
186.Selecting the Calculation Script for Calculating the Updated Cube . . . . 280
187.Calculation of Updated Cube Completed . . . . . . . . . . . . . . . . . . . . . . . . 280
188.Retrieving the Updated and Calculated Data from the Updated Cube . . 281
189.The TBC Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 284
190.Creating a UDA for Member East in the Market Hierarchy . . . . . . . . . . . 284
191.Modified TBC Model with UDAs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 285
192.Load Rule for Building the Market Dimension Dynamically. . . . . . . . . . . 286
193.Defining the Field in the Input Data As a UDA . . . . . . . . . . . . . . . . . . . . 286
194.Load Rule for Creating a UDA. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 287
195.How to Reach the Member Selection Panel . . . . . . . . . . . . . . . . . . . . . . 288
196.Viewing Member Information from the Member Selection Panel . . . . . . 289
197.Viewing UDAs from the Member Information Panel . . . . . . . . . . . . . . . . 289
198.Selecting a Subset of Members, Using the Subset Dialog . . . . . . . . . . . 290
199.Preview of the Members Selected. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 291
200.Associate Selected Members with the Year Dimension . . . . . . . . . . . . . 292
201.Flip the Sign Based on UDAs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 293
202.The TBC Sales Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 297
203.Connecting to TBC Application from Excel . . . . . . . . . . . . . . . . . . . . . . . 298
204.Data Retrieved from the TBC Sales Model . . . . . . . . . . . . . . . . . . . . . . . 299
205.SQL Database Login Window . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 299
206.Creating a New SQL Drill-Through Profile . . . . . . . . . . . . . . . . . . . . . . . 300
207.Profile Editor Window . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 301
208.Sample Data from ALL_CUSTOMERS Table . . . . . . . . . . . . . . . . . . . . . 302
209.Sample Data from HISTORY_OF_ORDERS Table . . . . . . . . . . . . . . . . 302
210.Creating SQL Generation Rule for Market Dimension . . . . . . . . . . . . . . 303
211.SQL Generation Rules for the Product Dimension . . . . . . . . . . . . . . . . . 303
212.SQL Generation Rules for the Year Dimension . . . . . . . . . . . . . . . . . . . 305
213.Defined Columns Page of the Profile Editor . . . . . . . . . . . . . . . . . . . . . . 306
214.Defining the Column List for SQL Drill-Through . . . . . . . . . . . . . . . . . . . 307
215.Defining Table Links for SQL Drill-Through. . . . . . . . . . . . . . . . . . . . . . . 308
216.Viewing the Generated SQL . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 308
217.Generated SQL Query for SQL Drill-Through . . . . . . . . . . . . . . . . . . . . . 309
218.Setting the Output Options . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 309
219.SQL Drill-Through Output for the TBC Database . . . . . . . . . . . . . . . . . . 310
220.DB2 OLAP Server Table Schema . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 315
221.SQL Queries on the Cube Catalog View . . . . . . . . . . . . . . . . . . . . . . . . . 317
222.SQL Queries on the Cube View . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 319
223.SQL Queries on Dimension Views . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 323
224.Relational Cube (Star Schema) for the TBC Inventory Model. . . . . . . . . 324
225.SQL Queries on the Fact View . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 326
226.SQL Queries on the Star View . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 328
227.SQL to Perform an Aggregation Operation . . . . . . . . . . . . . . . . . . . . . . . 329
228.SQL Queries on the UDA View . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 330
229.Traversing a Dimension Hierarchy, Using Recursive SQL . . . . . . . . . . . 333
230.Adding a New Product Group to the TBC Inventory Model. . . . . . . . . . . 334
231.Generating an Outline Change Log File, Using SQL . . . . . . . . . . . . . . . 335
232.Wired for OLAP Browser Client with Spreadsheet View . . . . . . . . . . . . . 339
233.Wired for OLAP Browser Client with Chart View . . . . . . . . . . . . . . . . . . . 339
Tables
1. Source File Characteristics for TBC Source Files . . . . . . . . . . . . . . . . . . 100
2. Valid Combinations for Warehouse Business Views . . . . . . . . . . . . . . . . 111
3. Valid Combinations for Subject Business Views . . . . . . . . . . . . . . . . . . . 111
4. List of Business Views Created for the TBC Sales Data Mart . . . . . . . . . 113
5. Parameters for Essbase Load VWP (ESSDATA3) . . . . . . . . . . . . . . . . . 126
6. Essbase Mathematical Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 136
7. Essbase Index Functions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 137
8. Essbase Financial Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 138
9. Essbase Macro Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 139
10. Essbase Boolean Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 140
11. Essbase Data Declaration Commands . . . . . . . . . . . . . . . . . . . . . . . . . . 142
12. Essbase Control Flow Commands . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 143
13. Essbase Computation Commands . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 143
14. Two-Pass Calculation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 150
15. Calculation Order . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 154
16. Tables Created When a DB2 OLAP Database Is Created . . . . . . . . . . . 314
17. Structure of the Cube Catalog View. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 316
18. Structure of the Cube View . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 318
19. Structure of the Dimension View . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 320
20. Structure of the Fact View . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 325
21. Structure of the Star View . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 326
22. Structure of the UDA View . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 329
23. Structure of the Alias ID View . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 331
24. Structure of the LRO View . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 331
Preface
This redbook will help you understand the concepts of multidimensional
analysis and use the functions and features of IBM’s DB2 OLAP Server
together with IBM Visual Warehouse to build an end-to-end Business
Intelligence solution for online analytical processing (OLAP).
The book provides valuable information both for readers who are new to
multidimensional data marts and for readers who already have experience with them.
In Chapters 1 through 3 we establish the basic concepts and architectures
and discuss guidelines for planning a Business Intelligence project. In
Chapters 4 and 5, which are targeted to the novice, we provide step-by-step
descriptions of how to implement a simple initial dimensional model with DB2
OLAP Server and Visual Warehouse. In Chapters 6 and 7 we introduce
advanced topics, such as calculation options and partitioning techniques,
which enable more complex and scalable models. In Chapters 8 through 11
we cover the topics that are important for managing a data mart environment
for OLAP, such as updates and maintenance, performance and tuning
considerations, problem determination, and security. In Chapters 12 through
16 we deal with aspects related to accessing the information stored in a
multidimensional data mart. We cover spreadsheet access, SQL access, and
Web-based analysis capabilities.
The Team That Wrote This Redbook
This redbook was produced by a team of specialists from around the world
working at the International Technical Support Organization, San Jose
Center.
Thomas Groh is a Business Intelligence Specialist at the International
Technical Support Organization, San Jose Center. He writes extensively and
teaches IBM classes worldwide on all areas of Business Intelligence and
Data Warehousing. Before joining the ITSO in 1998, Thomas worked for IBM
Professional Services in Vienna, Austria as an IT Architect. His technical
background includes end-to-end design for traditional online transaction
processing systems in mainframe and client/server environments, as well as
designing, building, and managing data warehouse solutions ranging from
small data marts based on Windows NT servers to very large database
implementations (> 1 TB) using massive parallel processing platforms.
Ann Valencic is a Senior Systems Specialist in the Business Intelligence
group in Australia. She has 13 years of experience in database and data
warehousing and has been involved in a number of DB2 OLAP Server and
Visual Warehouse customer projects. Ann’s areas of expertise include
database design and performance tuning. Previous redbooks include Data
Modeling Techniques for Data Warehouse, and the DB2 Performance Monitor
Usage Guide.
Bhanu Dhanaraj is a Software Analyst in the Business Intelligence group in
IBM, India. She has seven years of experience in database applications and
data warehousing and has been involved in the design and development of
R-OLAP based applications. Bhanu is currently involved in proof-of-concept
projects using Visual Warehouse and DB2 OLAP Server.
Hanspeter Furrer is a Project Leader and Consultant in Switzerland. He runs
his own company, Furrer Informatik AG, which is a BesTeam member of IBM.
Hanspeter focuses on the financial services industry in Switzerland, providing
solutions exploiting massive parallel computing technology (RS/6000 SP,
DB2 UDB Extended Enterprise Edition) and Web-based database access
(including Net.Data).
Karl-Heinz Scheible is a Senior Systems Specialist in the Data Management
Software Support group in IBM Germany. He has 10 years of experience in
data management products. Karl-Heinz has worked at IBM for 32 years. His
areas of expertise include UDB, DB2 Connect, and Business Intelligence
products. Karl-Heinz has been involved in the creation of several redbooks
about data management products and distributed databases.
Thanks to the following people for their invaluable contributions to this
project:
Gary Robinson
IBM Santa Teresa Lab
Bruce Hobbs
IBM San Jose
Maggie Cutler
IBM International Technical Support Organization, San Jose Center
Paul Wilms
IBM Santa Teresa Lab
Trevor Hughes
Hyperion Solutions, Sunnyvale, California
Comments Welcome
Your comments are important to us!
We want our redbooks to be as helpful as possible. Please send us your
comments about this or other redbooks in one of the following ways:
• Fax the evaluation form found in “ITSO Redbook Evaluation” on page 357
to the fax number shown on the form.
• Use the electronic evaluation form found on the Redbooks Web sites:
For Internet users http://www.redbooks.ibm.com/
For IBM Intranet users http://w3.itso.ibm.com
• Send us a note at the following address:
redbook@us.ibm.com
Part 1. Building an OLAP Data Mart
In Part 1 we cover the topics to consider when planning and building a data
mart for multidimensional analysis.
We investigate the driving factors that lead to the launching of a Business
Intelligence project. We take a close look at the characteristics of Online
Analytical Processing (OLAP) to find out why so many companies across all
industries have adopted it. After discussing the issues related to the planning
of a Business Intelligence project, we give an overview of the available
architectures to enable you to position the OLAP tools available in the
marketplace and to select the appropriate architecture for your Business
Intelligence solution. Then we lead you through the actual implementation of
a multidimensional data mart, using IBM DB2 OLAP Server and Visual
Warehouse. The implementation is based on a simple sales analysis model
that is easy to understand. After establishing the basic model, we cover more
complex techniques and functions available with DB2 OLAP Server, such as
calculation and partitioning options that enable growth and scalability.
Chapter 1. Introduction
In this chapter we investigate why companies across all industries have
started to plan or implement Business Intelligence solutions. We identify the
business value that gives the adopters of OLAP technologies a competitive
advantage. We describe the major characteristics of a Business Intelligence
solution for OLAP.
1.1 Driving Factors for Business Intelligence
Today’s economy is characterized by open, deregulated markets and global
competition on the one hand and by the recognition of the diversity of the
customer population in search of new market opportunities on the other hand.
To be successful, companies have to adapt their products and processes
quickly and frequently to make them suitable and convenient for specific
target customer groups or individuals. They also have to tighten their control
over the efficiency and profitability of their business in order to contain their
costs and have their operations quickly respond to new market requirements.
In search of competitive advantage, companies overwhelmed by the
increasing amounts of data available from transactions with customers and
events created during internal business processes are increasingly deploying
multidimensional analysis techniques. They use Business Intelligence and
especially OLAP technology for two reasons:
• To better understand their customers
• To understand how business processes can be improved to effectively
support their needs
OLAP has finally evolved to play a strategic and vital role within the overall
information technology framework of companies across all industries.
The utilization of OLAP technology is no longer limited to a small elite of early
adopters and Fortune 500 companies. The mandate for customer centricity
and the increasing affordability of the supporting technologies have
contributed to the widespread use of Business Intelligence and OLAP
solutions as a key information technology within companies of all sizes.
In the financial services industry, for example, banks and insurance
companies are competing for the same customers on a global scale because
of ongoing deregulation and the opening of markets. Financial services such as
loans have to be increasingly flexible and targeted to the needs of very
specific groups of customers. In addition, new channels such as television,
automated teller machines (ATMs), Call Centers, and the Internet are used to
address and service larger numbers of potential customers directly without
the need for brokers, agents, or large regional organizations and structures.
This direct interaction enables financial institutions to record and analyze
more information about their customers’ demographics, behavior, and buying
patterns. In this highly competitive market, it is key to determine which
customers are the most profitable, which have the potential to become more
profitable, and which are unprofitable or high risk. OLAP solutions are
considered very useful for determining such factors.
In the manufacturing and retail industries, companies are shifting their
operations toward consumer-driven supply-chain handling. Customers can
customize the products they intend to purchase to their specific requirements,
submit the order directly to the manufacturer or retailer, and track the
availability and status of their orders. The suppliers and distributors, in turn,
are tightly integrated into the business processes of the manufacturer or
retailer (traditional roles are getting more and more blurry in this business
area). They can view the orders, the inventory levels and the schedules in
order to supply and ship the parts just in time and to adjust their own
production cycles and business processes to the demand. Forecasting is a
necessity in this type of tightly integrated, networked environment and is very
often accomplished with the help of OLAP analysis.
In fact, the utilization of Internet technologies together with Business
Intelligence solutions is reshaping the way companies do business, both
today and in the future.
Be it marketing and sales analysis, budgeting, financial reporting and
consolidation, management reporting or profitability analysis, OLAP
technology is growing in importance in all of these application areas.
As markets become more and more competitive, companies are forced to
streamline their organizational structures, flatten their hierarchies, and
empower a greater number of their employees to make informed decisions.
Therefore OLAP tools are being used today on a daily basis by a dramatically
larger user community than they were just a few years ago.
Consequently Business Intelligence environments have to comply with the
same requirements as traditional online transaction processing (OLTP)
systems in terms of scalability, manageability, availability, and security.
Due to the high attention paid to OLAP in almost all industries, the market for
OLAP solutions has become one of the fastest growing markets in the
information technology (IT) business, with year-to-year growth rates of about
40% and an estimated volume of six billion U.S. dollars by the year 2001.
The high market potential has attracted a large number of vendors (30+) to
supply OLAP solutions. Because these vendors come from very different areas
within the IT industry, it is no surprise that they have chosen different
approaches and architectures for their solutions. We sort out the fundamental
differences between the different OLAP architectures in Chapter 3, “Selecting
the Appropriate Architecture” on page 19, but first we want to take a closer
look at the characteristics of an OLAP solution.
1.2 What Is OLAP and Why Is It So Successful?
The term online analytical processing was initially coined by E. F. Codd in
1993. In his white paper for Arbor Software entitled "Providing OLAP to User
Analysts: An IT Mandate," Codd established 12 rules that are the foundation
of OLAP today.
The term was meant to clearly distinguish the new paradigm from traditional
OLTP. Whereas OLTP focuses on handling and storing all of the information
in an operational system needed to run the daily business quickly and
efficiently, OLAP focuses on analyzing corporate data from different
viewpoints and levels of granularity and aggregation, as well as analyzing
historical and projected data for the purpose of decision making.
The 12 rules for characterizing OLAP products are:
1. Multidimensional conceptual view
2. Transparency
3. Accessibility
4. Consistent reporting performance
5. Client/server architecture
6. Generic dimensionality
7. Dynamic sparse matrix handling
8. Multiuser support
9. Unrestricted cross-dimensional operations
10.Intuitive data manipulation
11.Flexible reporting
12.Unlimited dimensions and aggregation levels
Let’s have a closer look at these rules to better understand them.
1. Multidimensional Conceptual View
Business analysts naturally view the enterprise’s universe in a
multidimensional way. For example, they look at the revenues of sales
related to customers, sales areas, products or product groups, and certain
time periods. Accordingly, the multidimensional paradigm tries to mirror
this perception of the business as closely as possible in its models. This
enables business analysts to understand, navigate, and manipulate OLAP
models more easily and intuitively than they can with traditional database
models (for example, entity-relationship models). In addition to the slice and
dice capabilities that enable users to navigate to, select, and focus on
specific parts of the information in the multidimensional model, hierarchical
structures are provided, allowing users to analyze the key business measures
at different levels of detail and aggregation. These operations are known as
drill down (to see more detailed information) and roll up (to look at aggregated
information at an overview level).
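The slice, roll up, and drill down operations can be illustrated with a minimal Python sketch. It assumes a toy cube held as a dictionary of cells; the dimension names, member names, and sales values are invented for illustration and are not taken from any product or model. It is not how an OLAP engine stores or navigates data, only a way to see what the operations compute.

```python
from collections import defaultdict

# Hypothetical fact cells: (market, product, quarter) -> sales revenue.
# All names and values are invented for illustration.
CELLS = {
    ("East", "Cola", "Q1"): 100,
    ("East", "Cola", "Q2"): 120,
    ("East", "Root Beer", "Q1"): 80,
    ("West", "Cola", "Q1"): 90,
    ("West", "Root Beer", "Q2"): 70,
}
DIMENSIONS = ("market", "product", "quarter")


def slice_cube(cells, **fixed):
    """Slice: keep only the cells whose members match the fixed values."""
    positions = {DIMENSIONS.index(dim): member for dim, member in fixed.items()}
    return {
        key: value
        for key, value in cells.items()
        if all(key[pos] == member for pos, member in positions.items())
    }


def roll_up(cells, *keep):
    """Roll up: sum the measure over every dimension not listed in keep."""
    positions = [DIMENSIONS.index(dim) for dim in keep]
    totals = defaultdict(int)
    for key, value in cells.items():
        totals[tuple(key[pos] for pos in positions)] += value
    return dict(totals)


# Overview level: sales per market, aggregated over product and quarter.
by_market = roll_up(CELLS, "market")

# Drill down into East: the same measure at product and quarter level.
east_detail = roll_up(slice_cube(CELLS, market="East"), "product", "quarter")
```

Here by_market holds one total per market, while east_detail shows the cells that make up the East total, which is the step a user takes when drilling down from an overview.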
2. Transparency
The multidimensional analysis capabilities should be provided seamlessly
regardless of the choice of the user-interface layer and the physical data
store of the solution. This gives users the flexibility to choose a presentation
tool that suits their needs (for example, a spreadsheet, a traditional reporting
tool, or a desktop OLAP tool) and thereby benefit from multidimensional
analysis without having to deal with the specifics of the data storage
technology (for example, a file system or relational database).
3. Accessibility
This rule demands that OLAP solutions be able to access and combine
information from the various data sources of the enterprise and present all
the information in a multidimensional structure to users. This establishes
some sort of functionally rich middleware role for OLAP solutions, with the
OLAP engine sitting between the heterogeneous data sources and the OLAP
presentation layer.
4. Consistent Reporting Performance
Consistent reporting performance is key to maintaining the ease of use and
speed-of-thought navigation and analysis capabilities required in bringing
OLAP to the end user. This rule should also hold for large databases and
increasing numbers of dimensions. To ensure consistent, linear response
times, OLAP solutions usually use precalculations during the population of
the multidimensional model. This has an impact on the time needed to
populate the model and the storage space used. Usually a trade-off has to be
made between response time, time needed for population, and storage
requirements.
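The trade-off between response time, population time, and storage can be made concrete with a small Python sketch. It assumes a two-dimensional toy model with invented values and a hypothetical "*" marker for an aggregated dimension; real OLAP engines use far more selective precalculation strategies, so this only illustrates the principle.

```python
from itertools import combinations

# Hypothetical input cells: (market, product) -> sales. Values are invented.
BASE = {
    ("East", "Cola"): 100,
    ("East", "Root Beer"): 80,
    ("West", "Cola"): 90,
}
ALL = "*"  # hypothetical marker meaning "aggregated over this dimension"


def precalculate(base):
    """Store every aggregate once, so that later queries are plain lookups."""
    n_dims = len(next(iter(base)))
    cube = {}
    for size in range(n_dims + 1):
        # Each choice of rolled-up dimensions yields one aggregation level.
        for rolled in combinations(range(n_dims), size):
            for key, value in base.items():
                agg_key = tuple(
                    ALL if i in rolled else member for i, member in enumerate(key)
                )
                cube[agg_key] = cube.get(agg_key, 0) + value
    return cube


cube = precalculate(BASE)            # population takes longer than a plain load
total_sales = cube[(ALL, ALL)]       # answered without scanning BASE
east_sales = cube[("East", ALL)]
extra_cells = len(cube) - len(BASE)  # storage paid for the faster answers
```

In this toy case three input cells grow to eight stored cells. With more dimensions and hierarchy levels the number of aggregation levels multiplies, which is why population time and storage have to be weighed against response time.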
5. Client/Server Architecture
The client/server architecture is a key enabler of scalability in terms of model
size, number of users, and workload. Especially with Business Intelligence
solutions like data marts, which are iterative and subject to growth, this rule is
very important. In order to scale the solution according to the evolving
business requirements, the architecture of the OLAP solution should provide
components (or building blocks) that support well-defined interfaces and can
be placed on different computing platforms. The more advanced OLAP
solutions support many different client products (including other vendors’
products) through a standard application programming interface (API) to
handle the presentation and navigation. Thin-client architectures, like
Web-enabled OLAP solutions, also show the benefits of good client/server
architectures in terms of deployability and easy systems management,
especially for solutions with large user communities.
The client/server architecture is not limited, however, to the differentiation of
the presentation layer from the processing and calculation layer. It also
makes a lot of sense when applied to the calculation engine itself. Good
OLAP architectures, for example, allow for seamless partitioning of the
multidimensional model into more than one physical model, to utilize the
capacity of multiple processors to satisfy user requests and to load and
calculate the model in a client/server computing paradigm.
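As a rough illustration of partitioning, the following Python sketch assumes a logical model split by market into two hypothetical physical partitions. The partition names, members, and values are invented, and the routing shown is a simplification for illustration, not a description of how any OLAP server implements partitions.

```python
# Hypothetical physical partitions, each owning the cells of one market.
# All names and values are invented for illustration.
PARTITIONS = {
    "East": {("East", "Cola"): 100, ("East", "Root Beer"): 80},
    "West": {("West", "Cola"): 90},
}


def query(market, product):
    """Route a cell request to the partition that owns the market.

    The caller addresses one logical model and never names a partition.
    Returns None when the cell holds no data.
    """
    return PARTITIONS[market].get((market, product))


def total(product):
    """Combine partial results from all partitions for one product."""
    return sum(
        cells.get((market, product), 0) for market, cells in PARTITIONS.items()
    )
```

Because each partition can be loaded and calculated independently, the work can be spread over several processors or machines while users continue to see a single model.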
6. Generic Dimensionality
With this rule, Codd postulates that each dimension should be equivalent in
both its structure and operational capabilities. However, he does allow
additional operational capabilities to be granted to selected dimensions (for
example, the Time dimension). Thus OLAP tools can be used for multiple
purposes and many different application areas, not just for a specific
business subject area.
7. Dynamic Sparse Matrix Handling
It should be possible to adjust the physical schema of an OLAP solution
based on the actual distribution of the data values in the input data. If the
physical layout of the OLAP data store cannot be altered in terms of density
and sparseness (missing cells as a percentage of possible cells) of the data
to be analyzed, models that appear to be practical, based on the number of
dimensions and consolidation paths, or the size of the enterprise source data,
may, in practice, be needlessly large and hopelessly slow. This characteristic
also has an impact on the performance and access speed that can be
achieved.
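Sparseness, defined above as missing cells as a percentage of possible cells, is easy to compute. The Python sketch below assumes invented dimension sizes and a handful of invented loaded cells; it only illustrates the arithmetic.

```python
from math import prod

# Hypothetical dimension sizes and loaded cells; all values are invented.
DIMENSION_SIZES = {"market": 4, "product": 5, "month": 12}
LOADED = {
    ("East", "Cola", "Jan"): 100.0,
    ("East", "Cola", "Feb"): 110.0,
    ("West", "Root Beer", "Jan"): 70.0,
}


def sparseness(dimension_sizes, loaded_cells):
    """Return missing cells as a fraction of all possible cells."""
    possible = prod(dimension_sizes.values())
    return (possible - len(loaded_cells)) / possible


# Cells a fully dense array would have to allocate: 4 * 5 * 12.
possible_cells = prod(DIMENSION_SIZES.values())
ratio = sparseness(DIMENSION_SIZES, LOADED)
```

A dense layout would allocate all 240 possible cells to hold 3 values, while a layout keyed on the existing member combinations stores only those 3. This is the kind of difference that makes a model practical or needlessly large.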
8. Multiuser Support
To be regarded as strategic, OLAP tools must provide concurrent access
(retrieval and update), integrity, and security in a multiuser environment.
9. Unrestricted Cross-Dimensional Operations
To enable more complex calculations (as needed, for example, in profitability
analysis applications) the rule requires OLAP solutions to provide all forms of
calculations across all dimensions, not just the Measures dimension.
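The following Python sketch illustrates calculations that run along dimensions other than Measures. It assumes a toy model with a Scenario and a Product dimension; the member names and values are invented, and the two functions are hypothetical examples rather than functions of any product.

```python
# Hypothetical cells: (scenario, product) -> sales. Values are invented.
CELLS = {
    ("Actual", "Cola"): 120,
    ("Budget", "Cola"): 100,
    ("Actual", "Root Beer"): 80,
    ("Budget", "Root Beer"): 100,
}


def variance(product):
    """Calculation along the Scenario dimension: Actual minus Budget."""
    return CELLS[("Actual", product)] - CELLS[("Budget", product)]


def share(scenario, product):
    """Calculation along the Product dimension: share of the scenario total."""
    scenario_total = sum(
        value for (scen, _), value in CELLS.items() if scen == scenario
    )
    return CELLS[(scenario, product)] / scenario_total
```

Both results combine cells that differ in a non-measure dimension, which is the kind of calculation that profitability analysis typically relies on.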
10. Intuitive Data Manipulation
The highly intuitive presentation layer, which enables speed-of-thought
navigation through the multidimensional model without the need for queries
or complex operations, is one of the key strengths of OLAP solutions.
11. Flexible Reporting
This rule states that the reporting functions must support rows, columns, and
page headings capable of containing and displaying any number of
dimensions, in any combination users require, to enable easy visual
comparison during analysis.
12. Unlimited Dimensions and Aggregation Levels
This rule was probably intended to provide maximum flexibility in terms of the
models that can be built, and it seems to emphasize general-purpose OLAP
solutions. In practice, however, few applications need more than 8 to 10
dimensions in a single model. Our recommendation is to design for smaller,
dependent models that can be linked together according to common
dimensions (using the drill across operation). This approach reduces
sparseness and makes the solutions more manageable and scalable.
These rules build a sound foundation of the features and functions needed to
provide OLAP capabilities. Business analysts have adopted this technology
on such a broad scale because it offers enormous analysis flexibility and
provides a data model that so closely resembles their perception of the
real-world business environment that they can understand, navigate, and
manipulate the model in an intuitive way. This is the single most important
characteristic that has contributed to the continuing success of OLAP
Business Intelligence solutions across all industries.
Chapter 2. Planning a Business Intelligence Project
At first glance, one might expect that Business Intelligence projects are very
similar to any other IT project, with the typical phases of requirements
analysis, design, development, test, rollout, production, and ongoing
maintenance. Basically this is true, because all of these phases are also
found in the lifecycle of Business Intelligence projects. However, there are
some characteristics that distinguish Business Intelligence projects from
other IT projects.
First of all, it is very important to have the business departments involved in
the project because business analysts will directly access the data models,
without an application layer that hides the complexity of the model (as is the
case in traditional OLTP systems). To enable business analysts to navigate
and manipulate the model, the structure of the data mart solution must be
closely related to their perception of the business objects and processes. This
requires that groups of business specialists and IT specialists work together.
Cultural issues between the business and IT departments may influence the
project more than is usually the case in other IT projects.
The many different skills and resources required may be widely dispersed
throughout the company. Some skills may not be available within the
company or may be limited and have to be brought in from outside
(consultants and technical and tool specialists) because of the strong
involvement of the business side of the house and the fact that a Business
Intelligence project is, from a technical perspective, comparable to a systems
integration project. Typically more than one platform is involved, and many
vendors and tools, multiple interfaces, integration with legacy systems, and
client/server and Web technologies have to be dealt with. The appropriate
selection and coordination of the project team are key to the success of a
Business Intelligence project.
The requirements for a Business Intelligence project are usually fuzzy and
incomplete. The potential for additional requirements that emerge late in
the development lifecycle is very high, because users will recognize the
capabilities of the technology when they are presented with and start working
with the first preliminary models. That is why the development and delivery
process for Business Intelligence solutions has to be iterative and designed
for change. Each individual business subject area should be targeted
separately, to shorten the delivery cycle and provide business value to the
company within a meaningful time frame. Plan for the delivery of business
results as quickly as possible (for example, using a rapid development
approach in a pilot phase) and define the scope of the solution to fit into a
time frame of no more than six months. Starting with the pilot phase, work in
short iteration cycles, to continuously enhance the solution and deliver
business value to the users throughout the project, and align the solution as
closely as possible to the business. Then pick the next business subject area
and, again, scope the project for not more than six months. Do not try to
incorporate all aspects of the business into one model. For a more detailed
discussion of this topic see 2.2, “The Development Process” on page 14.
Business Intelligence projects tend to be cross-departmental. Therefore even
if only a specific business subject area is covered by the project, the business
definitions and business rules must be standardized to be understood and
valid on an enterprise level and to ensure consistency and enable reuse. This
characteristic could lead to lengthy discussions on how the business is
looked at and interpreted among different business departments and could
have an impact on the way the performance of the company is measured.
The management of the Business Intelligence project must ensure that there
is at least an official, agreed-on definition for those measurements that
are part of the deliverables (that is, data models, reports, and metadata
catalog).
Business Intelligence solutions have to consolidate data from many different
sources in different lines of business throughout the company. The
planning for the population subsystem that maps and transforms the data into
the corporate-wide context needed in a Business Intelligence environment
must consider data quality issues, which are usually discovered during this
process. Resolving data quality issues and ensuring that only 100% correct,
meaningful, and unambiguous data is delivered to the analysts can be a very
complex and time-consuming process. However, it is of utmost importance to
the success of the Business Intelligence project that the data in the analysis
environment is correct, clean, validated and trusted by the business analysts.
A major reason for the failure of Business Intelligence projects is the lack of
trust in the analysis results due to data quality problems or ambiguous
interpretations.
2.1 Who Is Needed for the Project?
In this section we consider the roles and skill profiles needed for a successful
Business Intelligence project. We describe the roles of the business and
development project groups only. Not all of the project members we describe
are full-time members. Some of them, typically the Business Project Leader
and the Business Subject Area Specialist, are part-time members. The
number of people needed to accomplish a task depends on the organization
and scope of the Business Intelligence project. There is no one-to-one
relationship between the role description and the project members. Some
project roles can be filled by one person, whereas others need to be filled by
more than one person.
2.1.1 Business Project Group
The Business Project Group is mainly concerned with the business value of
the solution. The members of this group drive the project, because they are
the ultimate consumers of the information delivered by the new solution. The
business project group defines the requirements and the scope of the project.
It is responsible for the alignment of the solution to the business goals of the
company.
2.1.1.1 Sponsor
In general a Sponsor is needed for all types of projects. But in a Business
Intelligence project we particularly need a Sponsor from a business
department (for example, the Chief Financial Officer). The Sponsor plays a
very important role and must have the trust of executive management. He or
she has the business need for the new solution and the financial
responsibility for the project. The Sponsor is also involved in making the key
scoping decisions and supporting them throughout the project. He or she has
to uphold the vision related to the new solution and reinforce and encourage
the user community within the company. It is extremely important that the
project team has a direct communication path to the Sponsor.
In large Business Intelligence projects there is also a need for an IT Sponsor,
who is responsible for those parts of the project budget that are outside the
scope of the Sponsor from the business department (especially for hardware
and software installation, connectivity, and operations).
The Sponsor usually nominates a Business Project Leader who represents
the business community and works closely with the Technical Project
Manager.
2.1.1.2 Business Project Leader
The Business Project Leader should be a person from the line of business
organization. He or she will also use the new solution and should be
empowered and able to make detailed decisions from a business perspective
during the project. The Business Project Leader should have a solid
understanding of the business requirements. He or she works closely with the
Technical Project Manager.
2.1.1.3 End User
End user representatives with business responsibility will work with the
Business Intelligence solution and should therefore be part of the project as
well. It is important to find end users who are open to new technologies. They
should be able to share information about their detailed business processes
and needs.
2.1.2 Development Project Group
The Development Project Group deals with the delivery of the Business
Intelligence solution. This group works closely with the Business Project
Group to map the business requirements to a technically feasible and
manageable solution.
2.1.2.1 Technical Project Manager
The Technical Project Manager should have experience with Business
Intelligence or Decision Support projects. He or she should be able to staff the
project with qualified project members and build a team that can work
together. This is critical to the success or failure of the project, because a
Business Intelligence project needs a lot of different skills and a lot of
different people who speak different business languages.
The Technical Project Manager is responsible for such tasks as coordinating
resources, managing the project activities, tracking the project status, and
setting up a communication structure for the project.
The Technical Project Manager should have strong communication skills and
a technical background. He or she should know which processes are
necessary in an end-to-end solution. The Technical Project Manager must be
able to establish the link between the technical and the business part of the
project and navigate through the political environment of the organization.
2.1.2.2 Business Intelligence Solution Architect
The Business Intelligence Solution Architect is in charge of the technical
solution. He or she is knowledgeable about the architectures and products
available to design the solution. He or she has to ensure that the different
platforms, tools, and products can be integrated in an end-to-end solution
that fits the requirements, is manageable, and can grow with increasing
business demands. The Business Intelligence Solution Architect is involved
in the design of all major components of the solution (that is, the data staging
and population subsystem, databases and connectivity, information catalog,
warehouse management subsystem, analysis applications and tools, and
archiving solution). He or she drives the work of the various Platform and Tool
Specialists.
2.1.2.3 Business Subject Area Specialist
The Business Subject Area Specialist should have knowledge of the
business processes, applications, and data related to the specific business
problem that the solution addresses. He or she also should know who is
responsible for the definition of a key business measure, or who can decide
which definition is correct. Because the Business Subject Area Specialist is
also responsible for the quality of the information, he or she is
heavily involved in validating the information provided by the solution. The
Business Subject Area Specialist has the trust of the Business Project Leader
and the Sponsor and his or her opinions are very important to them. This role
is usually a very critical resource in the project, because it has to be filled
from among the few key business analysts in the company, who cannot
withdraw completely from their day-to-day business duties for the duration of
the project.
2.1.2.4 Database Administrator
In cooperation with the Business Subject Area Specialist, the Database
Administrator knows where to find and how to interpret the source data. He
or she knows the structure of the data and the data relationships. The
Database Administrator provides access to the source and target data. He
or she is usually also responsible for security. The Database Administrator
is the only person who can handle security for the Business Intelligence
environment from a single point of control.
The Database Administrator should also be involved in validating the data
model for the new Business Intelligence solution.
2.1.2.5 Platform Specialists
Usually more than one Platform Specialist is needed in a Business
Intelligence project. For each legacy system (for example, OS/390 hosts,
AS/400, and/or UNIX systems) that acts as a source for the Business
Intelligence solution, a Specialist will be needed to provide access and
connectivity. If the Business Intelligence environment will be multitiered (for
example, UNIX massively parallel processing (MPP) platforms or symmetric
multiprocessing (SMP) servers, and Windows NT departmental systems),
Platform Specialists are needed for those platforms as well.
Usually, people from the IT department of the company are involved, as far as
the legacy environment is concerned. Due to the day-to-day operational
duties of these people, the Technical Project Manager has to make sure to
plan for and inform them as early as possible.
2.1.2.6 Tool Specialists
Many different tools are usually needed to build a Business Intelligence
solution, from extraction, transformation, and cleansing tools to data access
middleware for the population subsystem to data warehouse management
and analysis tools for standard query and reporting, OLAP analysis, or data
mining. The Tool Specialists must know how to install, implement, and tune
these tools.
Very often, the Tool Specialists are provided by the vendors of the tools.
Service packages offered by the vendors could also include an education
package to transfer the necessary skills to the future administrators and/or
users of the tools.
2.1.2.7 Extract Programmers
It is often necessary to plan for an additional Extract Programmer, even if
extraction, transformation, and replication tools are going to be used,
because the tool may not support a data source or it may not be capable of
the complex transformations needed to extract certain business rules hidden
in some of the data items. Perhaps a temporary (prototype) solution is
needed to allow the validation and quality assessment of the extracted source
information by end users in the context of the solution. But be careful with this
approach! You could end up with a maintenance nightmare if the programs
are not designed, documented, and managed as carefully as those in
traditional application development projects.
2.2 The Development Process
Basically, a Business Intelligence project has to deal with three major topics:
• Infrastructure
• Data
• Application
The Infrastructure topic includes all the tasks necessary to provide the
technical basis for the Business Intelligence environment. This includes the
installation and implementation of new hardware and software, the
connectivity between the legacy environment and the new Business
Intelligence environment on a network as well as on a database level, and the
implementation of a population subsystem, an administration subsystem, and
a management subsystem. Establishing the infrastructure for the first
Business Intelligence solution is time consuming, but with the selection of
scalable hardware and software components, the effort will decrease
dramatically for the next project or delivery cycle.
The Data topic deals with data access, mapping, derivation, transformation,
and aggregation according to the requirements and business rules, as well as
with the proper definition of the data items in business terms (metadata). It
also contains the tasks necessary to ensure the consistency and quality of
the information being transferred to the Business Intelligence environment.
The effort for the tasks involved in the data topic should decrease with each
new Business Intelligence project, depending on the amount of data that can
be reused from previous projects (or iterations).
The Application topic includes the gathering of the business requirements,
the design of the model, and the implementation, visualization, and
publication of the analysis results in terms of, for example, queries, reports,
and charts. The effort needed for the tasks within the application topic is
heavily dependent on the selected scope of the project.
The scope of a Business Intelligence project should be selected in such a
way that a complete solution (that is, infrastructure, data, and application) for
the business analysis domain selected can be offered and valuable results
can be delivered to the business analysts within a reasonable timeframe (no
longer than six months).
The Business Intelligence solution is then enhanced in an evolutionary and
iterative way, as shown in Figure 1 on page 16.
[Figure 1 shows three consecutive delivery cycles of six months each along a
time axis, one per business subject area (1., 2., 3., ...). Each cycle
consists of Infrastructure, Data, and Application parts. The second and
later cycles reuse and enhance the infrastructure and common data from the
previous delivery cycle.]
Figure 1. Iterative Data Mart Development Approach
As you can see in Figure 1, each consecutive delivery cycle leaves more
room for application-related efforts by reusing as much of the infrastructure
and data of the previous cycles as possible.
To learn more about an architecture that supports this iterative development
process, refer to 3.3, “The End-to-End Architecture of a Business Intelligence
Solution” on page 27.
2.3 Success Factors for a Business Intelligence Solution
In this section we summarize the success factors we consider essential for
Business Intelligence projects, in addition to the technical issues and
challenges:
• Scope the project to be able to deliver within at most six months.
• Select a specific business subject area; do not try to solve all business
requirements within one project.
• Find a sponsor from the upper management of the business side of the
company.
• Involve the sponsor throughout the project.
• Establish a sound information and communication structure that includes
business and technical staff inside and outside the project.
• Define the contents and type of the deliverables of the project as early and
in as much detail as possible.
• Together with the end users validate the results of the analysis phase (that
is, the initial dimensional models) against the deliverables definition.
• Deploy the solution quickly to a limited audience and iterate development.
• Establish commonly agreed-on business definitions for all items within the
scope of the project.
• Validate the quality and correctness of the information before making it
available to the end-user community.
• Keep the end users involved and informed throughout the project.
• Be prepared for political and cultural obstacles between business
departments or between business and IT departments.
Chapter 3. Selecting the Appropriate Architecture
In this chapter we provide an architectural framework for OLAP solutions,
which enables you to understand the advantages and disadvantages related
to each of the architectures available in the marketplace. We also discuss the
architecture of IBM DB2 OLAP Server and Hyperion Essbase in more detail
and position the OLAP components within an end-to-end architecture of a
Business Intelligence solution for multidimensional analysis.
3.1 Available Architectures for OLAP Data Marts
In general the architecture of an OLAP solution consists of the building blocks
depicted in Figure 2.
[Figure 2 shows four building blocks in sequence: Presentation Component,
Application Component, OLAP Engine, Data Store.]
Figure 2. Architecture Building Blocks of an OLAP Solution
The Presentation Component provides a rich set of easy-to-use visualization,
navigation, and reporting capabilities that have contributed to the success of
OLAP tools in the marketplace.
The OLAP Engine is responsible for the multidimensional representation of
the model, the complex calculations, and the aggregations along the paths of
the dimension hierarchies.
The Data Store takes care of the persistency of the data represented in the
multidimensional model. It arranges the data elements of the model in a
physical structure for fast and efficient storage and retrieval.
Some solutions include an additional layer, the Application Component, which
typically resides between the presentation component and the
multidimensional calculation component (the OLAP engine). The application component
provides business logic that is targeted to a specific subject area (for
example, financial consolidation, marketing and sales support) and makes
use of the general functions provided by the multidimensional calculation
component. The added value of the vertical industry applications is usually
provided as optional modules or by third-party vendors (for example,
Hyperion with Enterprise and Pillar, or Oracle Financials).
The major differences among OLAP architectures are related to the different
options for placing the components within a client/server infrastructure and to
the different storage technologies used for the data store component.
If relational databases are leveraged as the core component for storing,
managing, and calculating the data, the solutions are known as R-OLAP
(Relational OLAP) tools. Solutions centered around a powerful, functionally
rich OLAP engine implementation, usually using specialized files based on
the file system of the underlying operating system, are called M-OLAP
(Multidimensional OLAP) tools.
Some solutions combine the two approaches seamlessly to store parts of the
model (for example, higher aggregated data) in fast and efficient proprietary
files, and to manage other parts (for example, the detailed data) in relational
databases. These solutions are known as H-OLAP or Hybrid OLAP.
Examples of hybrid OLAP solutions are Seagate Holos and Microsoft SQL
Server OLAP Services.
Single-tier architectures place all components, that is, presentation,
calculation engine, and data store, on the client. Usually these
implementations use the client’s file system for making the data persistent.
These architectures store the data in an efficient, proprietary, very often
compressed structure, optimized for quick access and performance. Products
in this category are known as Desktop OLAP. The most obvious restriction of
this architecture is the capacity of the client machine in terms of available
storage, memory, and processing power. All the data has to be downloaded to
the client machine and calculated and stored locally to be available for
multidimensional analysis. This puts some constraints on the size of the
models these tools can handle efficiently. Usually the models are in the tens
to low hundreds of megabytes. Another drawback of this architecture is the
weak functions available for managing multiuser environments in terms of
data sharing, security, and writeback of data. Vendors providing desktop
OLAP tools include Cognos with PowerPlay, Business Objects, Brio and
Andyne, to name the major players in this area. Many vendors in this area
have also enabled their tools to work as clients utilizing the provided APIs of
the leading server-based OLAP engines (for example, of Hyperion Essbase
and DB2 OLAP Server). Recently, most of the vendors named above have
moved their presentation component to the Web, providing either HTML-based
or more powerful pure Java-based implementations (for example,
Hyperion with Wired for OLAP).
Two-tier solutions either combine presentation and calculation on the client
(fat client), while storing the data on a server, or follow the thin-client
paradigm, hosting only the presentation functions on the client and
concentrating the multidimensional calculation and data storage on the
server.
The drawback of the fat-client architecture is that all of the data has to travel
over the network in order to be calculated and aggregated on the client.
Therefore, even if only a summary-level result consisting of a few numbers
is requested, all of the detailed data needed to compute the result has to be
transferred. Examples of products implementing this architecture are Oracle
Discoverer and Cognos PowerPlay Server Edition. Both use relational
databases on the server to store the data.
The more popular and efficient two-tier architecture is the thin-client
approach, where the client communicates with a powerful multidimensional
engine that is combined with an efficient data store, which is completely
managed and controlled by the calculation engine based on the model
definition. With this approach, only small amounts of data have to travel over
the network between server and client, which usually makes the solutions
more scalable and accessible to a larger number of end users. The solutions
often provide a standard API for communication between the client and
server layer. This also enables third parties to provide clients for accessing
the OLAP server functions. The architecture is perfectly suited for Web-based
solutions due to the thin client paradigm. In contrast to three-tier solutions
(described below), the complexity of administering and managing the data
store is totally handled by the server engine based on the metadata
definitions for the multidimensional model. Usually the calculation capabilities
and options are much richer than those provided by the other architectures.
Hyperion Essbase, for example, implements this architecture.
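The difference between the fat-client and thin-client variants can be
illustrated with a minimal sketch. It is not taken from any of the products
named above; the data and the function names are invented for illustration,
and the "network" is only simulated by counting how many items would have to
be shipped to the client.

```python
# Made-up detail data held on the server: (market, product, sales).
SERVER_ROWS = [
    ("Denver", "100-10", 2340),
    ("Denver", "100-20", 1760),
    ("Denver", "400-30", 1890),
    ("Phoenix", "100-10", 900),
]

def fat_client_total(market):
    """Fat client: the server ships the detail rows; the client aggregates."""
    shipped = [row for row in SERVER_ROWS if row[0] == market]  # crosses the network
    total = sum(sales for _, _, sales in shipped)               # computed on the client
    return total, len(shipped)

def thin_client_total(market):
    """Thin client: the server aggregates; only the result crosses the network."""
    total = sum(sales for m, _, sales in SERVER_ROWS if m == market)  # computed on the server
    return total, 1

# Each call returns (result, number of items sent to the client).
print(fat_client_total("Denver"))   # (5990, 3)
print(thin_client_total("Denver"))  # (5990, 1)
```

Both variants return the same total, but in the fat-client case the number of
items sent grows with the amount of detail data, whereas in the thin-client
case it stays constant.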
A more detailed description of Hyperion Essbase and how it relates to DB2
OLAP Server can be found in 3.2, “Architecture and Concepts of DB2 OLAP
Server and Essbase” on page 22.
Three-tier architectures take advantage of a standard data management
subsystem (a relational database) to implement the data store component.
The calculation layer depends heavily on the relational database, too,
mapping the requests to complex SQL. In addition to the function of providing
the multidimensional capabilities, a metadata layer is used to map the
multidimensional view to the physical layout of the relational database to
generate the complex SQL statements for the calculation. The obvious
advantage of this architecture is that it benefits from the strong data
management capabilities of standard relational database systems, which
enable products in this category to deal with large amounts of data (up to
several hundred gigabytes). The drawback, however, is the enormous administrative
effort to design, build, update, and manage the database, which tends to be
very complex in order to represent the multidimensional relationships and
aggregation levels. In addition to administering the database, the mapping
information for the calculation engine has to be maintained. Usually the
calculation capabilities of three-tier R-OLAP solutions are not as advanced as
those provided by the M-OLAP server solutions. An example of a three-tier
R-OLAP solution is MicroStrategy DSS Suite.
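To make the role of the metadata layer more concrete, the following sketch
shows how a multidimensional request might be translated into SQL. It is a
simplified illustration only and does not reflect how any particular R-OLAP
product works. All table names, column names, and the build_sql function are
hypothetical.

```python
# Hypothetical metadata: maps dimensions and levels of the multidimensional
# view to tables and columns of a relational star-schema. All names are invented.
METADATA = {
    "fact_table": "sales_fact",
    "measure_column": "amount",
    "dimensions": {
        "Market": {
            "table": "market_dim",
            "key": "market_id",
            "levels": {"State": "state", "City": "city"},
        },
        "Year": {
            "table": "time_dim",
            "key": "time_id",
            "levels": {"Quarter": "quarter", "Month": "month"},
        },
    },
}

def build_sql(meta, group_by):
    """Translate a request such as {'Market': 'State'} into an aggregate query."""
    fact = meta["fact_table"]
    select_cols, joins = [], []
    for dim, level in group_by.items():
        d = meta["dimensions"][dim]
        select_cols.append(f'{d["table"]}.{d["levels"][level]}')
        joins.append(
            f'JOIN {d["table"]} ON {fact}.{d["key"]} = {d["table"]}.{d["key"]}'
        )
    cols = ", ".join(select_cols)
    return (
        f'SELECT {cols}, SUM({fact}.{meta["measure_column"]}) AS total '
        f"FROM {fact} " + " ".join(joins) + f" GROUP BY {cols}"
    )

sql = build_sql(METADATA, {"Market": "State"})
print(sql)
```

Even in this small sketch, every change to the physical tables has to be
mirrored in the mapping information, which hints at the administrative
effort described above.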
3.2 Architecture and Concepts of DB2 OLAP Server and Essbase
Because the Business Intelligence solution discussed throughout this book is
based on DB2 OLAP Server and Hyperion Essbase, we now want to take a
closer look at the architecture of these products.
Whereas Hyperion Essbase clearly belongs to the thin-client, two-tier,
M-OLAP category of available architectures, the positioning of DB2 OLAP
Server within the defined categories is not so straightforward.
DB2 OLAP Server’s core component is also the Essbase OLAP engine,
which, using the classification criteria we introduced in 3.1, “Available
Architectures for OLAP Data Marts” on page 19, puts DB2 OLAP Server
closer to the M-OLAP than to the R-OLAP architecture.
Both products shield the complexity of managing the data storage (that is the
physical representation of the multidimensional model) from administrators
and users. The OLAP engine takes care of building and managing the actual
data store, based on the definition of the multidimensional model. Remember,
this is not the case for three-tier R-OLAP solutions, where the complex,
time-consuming, and error-prone tasks of managing the data store and defining
the mapping information for the multidimensional calculation component must
be done by the administrator!
Whereas Hyperion Essbase is a pure M-OLAP solution (that is, it utilizes the
file system of the underlying operating system to store the data), DB2 OLAP
Server uses a standard DB2 Universal Database (UDB) to manage the
persistent model, and is, in this sense, an M-OLAP solution with relational
storage.
DB2 OLAP Server structures the data in a relational star-schema within DB2
UDB. The table structure in a star-schema is especially well suited for
multidimensional analysis of relational data. A star-schema consists of a
so-called fact table, which holds the numerical measures and key figures of the
business subject area. The fact table relates these measures to a number of
dimension tables, which contain the information to establish a context for the
recorded facts and hold the aggregation hierarchies to enable drill-down and
roll-up operations. The elements of the dimensions are called members. They
provide the meaning for the numerical facts in the model and enable analysts
to select certain areas of interest by putting constraints on the attributes of
the dimension members. The members of a Market dimension are, for
example, all cities where a company has sales outlets, such as Phoenix, San
Francisco, Los Angeles, and San Diego. Members of the Market dimension
belonging to higher aggregation levels are, for example, states such as
California and Arizona, and sales regions such as East and West. The members of these
higher levels summarize the facts related to the lower level members of the
dimension hierarchy.
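As a minimal sketch of how higher-level members summarize the facts of
lower-level members, consider the Market dimension described above. The
sales figures and the assignment of states to the West region are
assumptions made for illustration, and the roll_up function is a
hypothetical helper, not part of any product.

```python
from collections import defaultdict

# Hypothetical parent links for a Market dimension: city -> state -> region.
# The assignment of both states to the West region is illustrative only.
PARENT = {
    "Phoenix": "Arizona",
    "San Francisco": "California",
    "Los Angeles": "California",
    "San Diego": "California",
    "Arizona": "West",
    "California": "West",
}

# Made-up sales facts recorded at the lowest level (cities).
city_sales = {
    "Phoenix": 120,
    "San Francisco": 300,
    "Los Angeles": 450,
    "San Diego": 130,
}

def roll_up(leaf_values, parent):
    """Return a total for every member; each higher-level member sums its descendants."""
    totals = defaultdict(int)
    for member, value in leaf_values.items():
        node = member
        while node is not None:
            totals[node] += value
            node = parent.get(node)  # None once the top of the hierarchy is reached
    return dict(totals)

totals = roll_up(city_sales, PARENT)
print(totals["California"])  # 880
print(totals["West"])        # 1000
```

A drill-down from West to California to San Diego simply reads the totals
in the opposite direction, from the summary member to its children.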
If we take a closer look at how star-schemas can be built, we learn that there
are basically two approaches:
One is to design the central fact table as shown in Figure 3, associating a
single fact with foreign keys for all dimensions of the model.
Reference to Dimension Tables
Measures  Scenario  Year    Customer  Market  Product  Value
COGS      Actual    Jan 97  00001380  Denver  100-10   2340
COGS      Actual    Jan 97  00001380  Denver  100-20   1760
COGS      Actual    Jan 97  00001380  Denver  ...      ...
COGS      Actual    Jan 97  00001380  Denver  400-30   1890
Figure 3. Star-Schema with Primitive Fact Table
The second approach is to expand all members of one of the dimensions
within the fact table, as shown in Figure 4.
Reference to Dimension Tables                   Product (rel. anchor)
Measures  Scenario  Year    Customer  Market    100-10  100-20  ...  400-30
COGS      Actual    Jan 97  00001380  Denver    2340    1760    ...  1890
COGS      Actual    Feb 97  00001380  Denver    3100    1850    ...  1720
Figure 4. Star-Schema with Fact Table and Anchor Dimension
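DB2 OLAP Server implements the second approach, which is more efficient in
terms of data storage and retrieval because a single row holds the values for
all members of the expanded dimension, so the keys of the other dimensions
are stored only once per row. The dimension that is
expanded within the fact table is called the relational anchor dimension.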
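The relationship between the two fact table layouts can be sketched in a few
lines. The example uses only the values visible in Figures 3 and 4 (the
products hidden behind "..." are left out) and treats Product as the anchor
dimension. The pivot_on_anchor function is a hypothetical helper for
illustration and says nothing about how DB2 OLAP Server builds its tables
internally.

```python
# Rows in the layout of Figure 3: one row per single fact.
primitive_rows = [
    ("COGS", "Actual", "Jan 97", "00001380", "Denver", "100-10", 2340),
    ("COGS", "Actual", "Jan 97", "00001380", "Denver", "100-20", 1760),
    ("COGS", "Actual", "Jan 97", "00001380", "Denver", "400-30", 1890),
    ("COGS", "Actual", "Feb 97", "00001380", "Denver", "100-10", 3100),
    ("COGS", "Actual", "Feb 97", "00001380", "Denver", "100-20", 1850),
    ("COGS", "Actual", "Feb 97", "00001380", "Denver", "400-30", 1720),
]

def pivot_on_anchor(rows):
    """Group rows by the non-anchor keys and spread the Product members into columns."""
    wide = {}
    for *keys, product, value in rows:
        wide.setdefault(tuple(keys), {})[product] = value
    return wide

# Rows in the layout of Figure 4: one row per combination of the other dimensions.
wide_rows = pivot_on_anchor(primitive_rows)
for keys, values in wide_rows.items():
    print(keys, values)

print(len(primitive_rows), len(wide_rows))  # 6 2
```

Six primitive rows collapse into two rows, and the keys for Measures,
Scenario, Year, Customer, and Market are stored once per row instead of once
per product.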
For a detailed description of the structure of the DB2 OLAP Server tables and
schema, see Chapter 15, “Using SQL to Access the DB2 OLAP Server Data
Store” on page 313.
The files that are used by Hyperion Essbase directly store the Data Blocks,
which are used by the multidimensional calculation engine. Data blocks are
described in 9.2, “Block Sizes” on page 220.
As stated earlier in this section, both Hyperion Essbase and DB2 OLAP
Server share the exact same OLAP engine, originally developed by Arbor
Software Corporation. The OLAP engine is accessed through an API, the
Essbase API, which is also identical between the two products. The Essbase
API has established itself as a common standard in the Business Intelligence
industry. Many vendors (more than 30, including Cognos, Business
Objects, and Brio) have implemented this API to provide access from their
OLAP query products to Hyperion Essbase and of course to DB2 OLAP
Server.
The multidimensional calculation component retrieves and returns data
blocks from or to the data store component to satisfy the requests from the
client application. Therefore, in order to be able to use the same calculation
component for DB2 OLAP Server, the storage management component of
Hyperion Essbase, which deals with data blocks, had to be replaced by a
component to map the data blocks to the relational tables of the star-schema
and vice versa. This component is called the Relational Storage Manager
(RSM) in DB2 OLAP Server. It is also responsible for creating and
maintaining the tables, views, and indexes of the star-schema in DB2 UDB.
The architecture of DB2 OLAP Server combines the advantages of DB2 UDB,
the industrial-strength relational database management system, with the
power of the widely accepted Essbase OLAP engine without putting the
burden of building and maintaining the relational data store on the
administrator. Furthermore, it enables database administrators to manage the
multidimensional data in the same way they manage relational data in the
data warehouse and in operational systems, thereby leveraging the skills,
utilities, and procedures already in place. The solution also integrates
smoothly into the existing backup strategy for relational data without
additional effort (for example, using ADSM - ADSTAR Distributed Storage
Manager).
Because the multidimensional data is stored in relational tables, it is also very
easy to do analysis across traditional relational data marts or data
warehouses and OLAP data marts, enabling a staged transition to OLAP
technology without losing analysis capabilities or introducing unnecessary
duplication of data.
Many standard reporting requirements can be solved with traditional
SQL-based decision support tools by directly accessing the star-schema of
DB2 OLAP Server without having to go through the Essbase calculation
engine.
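As a minimal sketch of such direct SQL access, assume a star-schema with hypothetical table and column names (product_dim, market_dim, sales_fact); the names that DB2 OLAP Server generates are not shown here. The example uses Python with an in-memory SQLite database instead of DB2 UDB only so that it is self-contained.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical star-schema: two dimension tables and one fact table.
conn.executescript("""
    CREATE TABLE product_dim (product_key INTEGER PRIMARY KEY,
                              product TEXT, product_group TEXT);
    CREATE TABLE market_dim  (market_key INTEGER PRIMARY KEY,
                              city TEXT, state TEXT, region TEXT);
    CREATE TABLE sales_fact  (product_key INTEGER, market_key INTEGER,
                              month TEXT, sales REAL);
    INSERT INTO product_dim VALUES
        (1, 'Cola', 'Colas'), (2, 'Diet Cola', 'Colas'),
        (3, 'Vanilla Cream', 'Cream Sodas');
    INSERT INTO market_dim VALUES
        (1, 'New York', 'New York', 'East'),
        (2, 'San Jose', 'California', 'West');
    INSERT INTO sales_fact VALUES
        (1, 1, 'Jan', 100.0), (2, 1, 'Jan', 50.0),
        (3, 2, 'Jan', 70.0),  (1, 2, 'Feb', 30.0);
""")

# A standard report: sales by region and product group, computed with
# plain SQL joins and GROUP BY, without any OLAP calculation engine.
report = conn.execute("""
    SELECT m.region, p.product_group, SUM(f.sales)
    FROM sales_fact f
    JOIN product_dim p ON p.product_key = f.product_key
    JOIN market_dim  m ON m.market_key  = f.market_key
    GROUP BY m.region, p.product_group
    ORDER BY m.region, p.product_group
""").fetchall()

for region, product_group, total in report:
    print(region, product_group, total)
```

Any SQL-based decision support tool that can join the fact table to its dimension tables can produce this kind of report.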
Typically, in large Business Intelligence environments, both Hyperion
Essbase and DB2 OLAP Server are utilized. Consistency of the models and
data and metadata exchange are managed by Visual Warehouse, which
integrates with both OLAP solutions.
Visual Warehouse is the backbone of the IBM Business Intelligence
framework. It is the integration platform for all other products that contribute
to the solution. It manages the exchange of information between the products
and controls their execution in the context of the end-to-end system. See 3.4,
“Visual Warehouse” on page 36 for a detailed description.
Figure 5 compares the architectures of Hyperion Essbase and DB2 OLAP
Server.
[Figure 5: side-by-side architecture diagram. Both products share the same
presentation and application components, the Essbase API, and the Essbase
OLAP engine, which exchanges data blocks with the storage layer. Hyperion
Essbase uses the Multidimensional Storage Manager with binary files in the
NT or UNIX file system; DB2 OLAP Server uses the Relational Storage Manager,
which uses SQL against a relational star-schema in DB2 UDB.]
Figure 5. Hyperion Essbase and DB2 OLAP Server Architectures
As you can see in Figure 5, the OLAP Server functions can be accessed by
Essbase-ready front-end tools that implement the Essbase API (such as
Hyperion Wired for OLAP or Cognos PowerPlay) and can be seamlessly integrated
into standard spreadsheets, such as Microsoft Excel and Lotus 1-2-3. For a
detailed description of the Essbase Spreadsheet Add-in, refer to Chapter 12,
“OLAP Analysis Using the Spreadsheet Add-in” on page 269.
The administration interface for the OLAP Server is provided by the Essbase
Application Manager, which allows the definition of the models and the
initiation of the load and calculation processes.
Note that Figure 5 also includes the optional Essbase modules: Essbase
Adjustment Module, Essbase Currency Module, and Essbase Objects in the
Presentation and Application Components section. The Essbase Adjustment
Module provides a complete application for financial consolidation. The
Essbase Currency Module contains the functions for currency conversions.
The Essbase Objects provide ActiveX components for all major Essbase
functions, which can easily be integrated in custom applications,
implemented in, for example, Visual Basic or C++.
3.3 The End-to-End Architecture of a Business Intelligence Solution
To maximize the benefits of a Business Intelligence solution and to provide
for a stable, consistent, manageable, and scalable environment, it is not
enough to just implement an OLAP tool. The solution has to be integrated
with the existing legacy systems and provide a means of integrating and
automating as many of the processes involved as possible, such as:
• Accessing heterogeneous source data
• Combining the information from different sources
• Cleansing the data
• Transforming and enriching the data
• Staging intermediate results
• Aggregating the data and calculating derived data
• Providing and managing history (for example, several editions of data)
• Building the dimension and hierarchy definitions dynamically
• Building database structures suitable for multidimensional analysis (for
example, star-schemas)
• Loading the data
• Storing the information in the target environment
• Calculating the multidimensional models
• Capturing metadata about all items, elements, and processes that can be
found in the Business Intelligence environment
• Ensuring consistency of the information within a data mart and across data
marts
• Scheduling the processes involved in building and/or updating the
Business Intelligence environment
• Controlling the execution of the processes involved and providing detailed
information about success or failure
• Providing failure recovery and restart mechanisms
• Providing backup
• Providing security (user and access management)
• Periodic housekeeping of the databases (for example, REORG or
RUNSTATS)
• Publishing analysis results
As you can see from the list of tasks involved from an end-to-end perspective,
the scope of a Business Intelligence project is much broader than just
selecting an appropriate analysis tool.
3.3.1 The Architecture Building Blocks
Figure 6 on page 28 shows the architecture building blocks for an end-to-end
Business Intelligence solution. (The area inside the dashed line represents
the parts of the architecture covered by the OLAP solution, corresponding to
Figure 2 on page 19 and Figure 5 on page 26.)
[Figure 6: layered block diagram. From top to bottom: Navigation /
Visualization / Publishing; Business Intelligence Applications (Query /
Reporting / DSS / OLAP / Data Mining); Structured Query Language and OLAP
Engine; Data Store (Backup/Archive Data, Common Data, Business Subject Data,
Multidimensional Data, Staged Data); Data Access / Transformation /
Cleansing; Data Sources. Administration / Automation / Operation / Control
and the Metadata Repository span the layers.]
Figure 6. Architecture Building Blocks of a Business Intelligence Solution
Data Sources
Data sources can be operational databases, historical data (usually on
tapes), external data (for example, from market research companies or from
the Web), or information from the already existing data warehouse
environment. The data sources can be relational databases of line of
business applications or of Enterprise Resource Planning (ERP) applications,
such as SAP and PeopleSoft; hierarchical databases, such as IMS; or files,
such as VSAM or flat files. They can reside on many
different platforms, such as OS/390, AS/400, AIX, UNIX, or Windows NT.
Data sources can contain structured information, such as tables or
spreadsheets, or unstructured information, such as plain text files or pictures
and other multimedia information. See 3.4.1, “Data Sources Supported” on
page 36, for the data sources supported by IBM Visual Warehouse.
Data Access, Transformation, Cleansing
To access all heterogeneous data sources and combine information from
them, a middleware layer is used. This middleware layer is usually also used
to extract or define the metadata, which describes the data sources. This
information is used in the mapping and transformation process. The data
access middleware also enables data from different sources to be combined.
This key feature enables the consolidation of information across different line
of business applications and business subject areas.
The transformation of the data usually involves code resolution with mapping
tables (for example, changing 0 to female and 1 to male in the gender field)
and the resolution of hidden business rules in data fields, such as account
numbers. Also the structure and relationships of the data are adjusted to the
analysis domain. Transformations occur throughout the population process,
usually in more than one step. In the early stages of the process, the
transformations are used more to consolidate the data from different sources,
whereas in the later stages the data is transformed to suit a specific analysis
problem and/or tool. Usually the data has to be staged in between.
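Code resolution with a mapping table can be sketched as follows. This is a minimal illustration in Python, not a Visual Warehouse function: the names GENDER_CODES and resolve_codes are hypothetical, and the "unknown" fallback for unmapped codes is an assumption about how such values might be handled.

```python
# Mapping table from the example in the text: 0 -> female, 1 -> male.
GENDER_CODES = {"0": "female", "1": "male"}


def resolve_codes(records, field, mapping, unknown="unknown"):
    """Return copies of the records with the coded field resolved.

    Codes that are missing from the mapping table are replaced by the
    `unknown` marker so that they can be found during cleansing.
    """
    resolved = []
    for record in records:
        row = dict(record)  # copy, so the source rows stay untouched
        row[field] = mapping.get(str(row[field]), unknown)
        resolved.append(row)
    return resolved


source_rows = [
    {"id": 1, "gender": 0},
    {"id": 2, "gender": 1},
    {"id": 3, "gender": 9},  # code with no entry in the mapping table
]
clean_rows = resolve_codes(source_rows, "gender", GENDER_CODES)
```

In an SQL-based transformation the same effect is typically achieved by joining the source table to the mapping table.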
In addition to transforming the data, it has to be cleansed in order to provide
valid and meaningful analysis results. Assuring data quality is one of the most
important tasks of a Business Intelligence solution.
During data access, transformation, and cleansing, a huge amount of
valuable meta information needed by administrators as well as business
analysts is captured. The meta information should be made available in the
metadata repository.
IBM Visual Warehouse provides a rich set of transformation functions based
on SQL, (built-in) stored procedures, and user defined functions (UDFs). In
addition it integrates with IBM and vendor products (such as IBM DataJoiner,
IBM DataPropagator, ETI Extract, and Vality Integrity) to enable additional
data sources, complex transformations, and data quality enhancements.
Staged Data
The data store of a Business Intelligence solution consists of a number of
different layers of data, ranging from detailed, historic transaction records, to
aggregated and summarized, subject-area-specific information. In order to
provide for consistent instances of the different individual layers, the data has
to be staged in a working area of the data store, where it can be processed
without compromising the integrity of the other layers. The staging area also
contains the tables to define and load the facts and dimensions of the
multidimensional models derived from the common dimension repository (see
below).
Common Data
The common data repository contains information that is of interest on an
enterprise level. It consolidates the data from the different line of business
applications and from other sources to provide a single, consistent view of the
business entities involved (for example, consolidating customer records from
the marketing and billing applications). It is also useful to promote data that is
shared between several business subject areas to the common data store.
The main driving factors for a common data repository are:
• Providing a single, consolidated view of the data
• Enforcing consistency of shared data across analysis domains
• Enabling reuse of data already extracted from the data sources
In order to reuse already extracted data for the implementation of a new
analysis application or data mart, the data in the common data store should
be as fine-grained as possible. Calculations based on business rules and
aggregations or summaries should be introduced only in the business subject
area specific data storage layer. This also ensures that the data is extracted
only once from the source systems, providing for a more economic design of
the extraction process and less impact on the operational systems and the
batch execution window.
In a Business Intelligence solution for multidimensional analysis, the common
data repository should also be used to provide a standardized set of the
major dimensions used throughout the enterprise, including all aggregation
paths and hierarchy levels. All the dimensions shared between business
subject areas (or data marts) should then be derived from these standardized
dimensions to ensure consistency. It is possible to derive specific subsets
of a standardized dimension, but whenever a certain hierarchy level of a
dimension is used in two different models, it has to be exactly the same in
both. Changes to the definition of a standardized dimension should be made
only in the common data layer and then propagated to the derived instances
of that dimension in order to implement a single point of control. This
technique supports consistent analysis across different business subject
area solutions, enabling small, easy-to-manage models instead of monolithic,
all-in-one solutions. Business analysts can navigate between the linked
models, using drill-across operations. For an example of implementing
drill-across, see 7.3, “Linked Partitions” on page 192.
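The consistency rule for shared hierarchy levels can be expressed as a simple check. The following Python sketch is only an illustration under stated assumptions: a dimension is modeled as a dictionary from hierarchy level to member set, and the names STANDARD_MARKET, derive, and inconsistent_levels are hypothetical.

```python
# Standardized dimension kept in the common data layer (hypothetical data).
STANDARD_MARKET = {
    "Region": {"East", "West"},
    "State": {"New York", "California"},
    "City": {"New York", "San Jose"},
}


def derive(standard, levels):
    """Derive a dimension that contains only the requested levels."""
    return {level: set(standard[level]) for level in levels}


def inconsistent_levels(dim_a, dim_b):
    """Return the levels used by both models whose members differ."""
    shared = dim_a.keys() & dim_b.keys()
    return sorted(level for level in shared if dim_a[level] != dim_b[level])


# Two models derive different subsets of the standardized dimension.
sales_market = derive(STANDARD_MARKET, ["Region", "State"])
budget_market = derive(STANDARD_MARKET, ["Region"])

# A dimension maintained by hand outside the common layer drifts apart.
local_copy = {"Region": {"East", "West", "Central"}}

ok = inconsistent_levels(sales_market, budget_market)
drifted = inconsistent_levels(sales_market, local_copy)
```

Dimensions derived from the standardized definition pass the check by construction, which is the practical benefit of a single point of control.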
A common dimension repository with a single point of control can be
implemented by using the relational dimension tables provided by DB2 OLAP
Server in conjunction with Visual Warehouse Business Views to automatically
build the launch tables for dynamic dimension definition of the derived
dimensions, or by using a product called Hyperion Integration Server to
define and keep track of the common dimensions.
The common data layer can be accessed directly from a decision support
application, or by SQL drill-through from the multidimensional environment.
Due to the requirement for detailed data and the introduction of history
modeling, the common data area tends to grow large over time. Therefore a
scalable relational database technology (such as DB2 UDB) needs to be
deployed.
Business Subject Data
Whereas the common data store is designed to provide consolidation,
consistency, and reusability, the business subject data store is targeted to
providing analysis capabilities for a specific business domain or business
process. It is structured for easy access and contains aggregations, derived
information, and precalculations based on the specific business rules that
apply to the subject area. The business subject data can be accessed directly
from a decision support application, or by SQL drill-through from the
multidimensional environment.
Business subject data should also be kept in a relational environment,
because it allows for flexible combinations, transformations, and subsetting
as required by business analysts.
Multidimensional Data
Multidimensional data is derived from the business subject data. Its structure
is optimized for multidimensional analysis. It can be organized in a relational
star- or snowflake-schema, or it can be implemented as binary files. The
dimensions should be derived from the common dimension repository
residing in the common data area.
Within the IBM Business Intelligence framework, the multidimensional data is
managed by Hyperion Essbase or DB2 OLAP Server. Visual Warehouse
Business Views and Visual Warehouse Programs are used to populate the
multidimensional data from the business subject layer and the common data
layer through the staged data layer. For a detailed description of this process,
see Chapter 5, “Populating the Multidimensional Model” on page 91.
Backup/Archive Data
The Business Intelligence environment also has to include a solution for
backing up and archiving of data. Even if fully redundant hardware (such as
uninterruptible power supplies, RAID disk arrays, and standby processors) is
used, errors can occur during operation, such as unintended deletion of
tables or databases. To get back to full production quickly, a backup has to be
available. Archiving serves a different purpose, that is, it moves older,
historical data that is not subject to frequent access to cheaper storage media
(like tapes), to contain the amount of active data in the warehouse, saving
disk space and processing time.
If all the data in the analysis environment is in relational format (as, for
example, with DB2 OLAP Server), the complexity of the backup and archiving
solution can be significantly reduced. All of the data elements from the
operational environment, the common data store, the business subject data
store, and the multidimensional data store can be managed in the same way,
for example, by using ADSM, which integrates seamlessly with DB2 on all
major platforms.
Structured Query Language
SQL is the standard interface between the relational data store and Business
Intelligence applications. However, not only is it used by the Business
Intelligence applications or users to query the data store, it is also used to
define the structure (tablespaces, tables, views, indexes) of the database.
Simple transformations, derivations, and aggregations are also performed
with SQL.
Visual Warehouse Business Views enable the use of SQL to manage parts of
the population process.
OLAP Engine
The OLAP engine provides the OLAP capabilities discussed in 1.2, “What Is
OLAP and Why Is It So Successful?” on page 5. It also provides an interface
between the Business Intelligence applications and the data store, offering a
standard API to the applications.
Within the IBM Business Intelligence framework, this function is provided by
Hyperion Essbase and/or DB2 OLAP Server.
Business Intelligence Applications
A broad range of Business Intelligence applications for all kinds of analysis
purposes is available. There are general-purpose tools as well as solutions
for specific industry segments, offering one or more of the following functions:
• SQL-based query and reporting
• Statistical analysis capabilities
• Multidimensional capabilities utilizing the power of OLAP servers
• Discovery-driven data mining algorithms and techniques such as
clustering, association detection, and classification (for example, IBM
Intelligent Miner)
Usually several different applications are implemented, each targeted to
specific analysis requirements.
Administration, Automation, Operation, Control
This architecture building block ties all the functions and processes of the
Business Intelligence environment together and enables the management of
the end-to-end environment from a single point of control, maximizing the
automation of the workflow and minimizing manual tasks. It enables
administrators to check the status of the population process and the data
stores at any given time, and it provides the mechanisms for failure recovery,
to ensure a consistent state of the analysis environment. It also provides for
monitoring and controlling user activities and workload.
Ideally, management of the Business Intelligence processes should not only
be documented by but also driven by the information in the metadata
repository, providing a tight integration between those two components.
Within the IBM Business Intelligence framework, Visual Warehouse
implements these important functions and acts as the integration and control
point, providing robust operations control, logging and restart facilities, and
statistics on data retrieval.
Metadata Repository
The metadata repository combines and manages all the information that is
available about the items in the Business Intelligence environment. It is one
of the most important components of the solution, because it allows end
users to search for information and navigate through the system on the basis
of business terms. It ensures consistency of the analysis results and
interpretations, by providing common business definitions and rules of all the
items available for analysis.
It also contains all the information about the population process, that is, the
mapping rules, business formulas, and derivation rules. Ideally, it should be
possible to drive the complete population process from the metadata
definitions by establishing a tight integration of the
administration/automation/operation and control component and the
metadata repository. Administrators make use of the information regarding
the state of the different data stores and/or processes (for example, to find
out when a specific multidimensional model was loaded and calculated
successfully the last time).
The metadata repository usually has to derive and synchronize information
from a lot of different sources (for example, database catalogs, modeling
tools, CASE tools). In fact, managing all the metadata in a single, centralized
repository is too complex, because all the metadata producers would depend
on the interface and the availability of the central repository during the time
they process the information. It is more likely, and also easier to achieve, that
each tool manages its own metadata in a decentralized approach. In
essence, the approach consists of a metadata interchange facility as well as
a metadata integration hub used to validate the objects being interchanged
and provide specialized services within the Business Intelligence
environment.
In the case of IBM Visual Warehouse, metadata for both administrative and
business users is tightly integrated into the solution as the Visual Warehouse
Information Catalog. The metadata is made available through an interface
tailored to end users, including the ability to navigate and search using
business terms, to present data lineage and currency information, and to
automatically launch applications associated with the data. Additionally, the
Visual Warehouse Information Catalog spans a breadth of informational
objects allowing Web pages, spreadsheets, presentations, and other objects
to be represented along with information about data in the data mart or
warehouse.
The Visual Warehouse Information Catalog is designed to integrate with a
wide variety of products. It comes with extraction technology for a variety of
database management systems and end user tools. Source and target
metadata can be imported directly from relational database management
system (RDBMS) catalogs. Additionally, Visual Warehouse can exchange
metadata with any system that conforms to the Meta Data Interchange
Specification (MDIS) adopted by the Meta Data Coalition. In future releases,
Visual Warehouse will also support the exchange of metadata using XML
Metadata Interchange (XMI), which is in the process of being standardized by
the Object Management Group (OMG).
Navigation, Visualization, Publishing
This architecture building block is responsible for enabling users to specify
their requests and navigate through the analysis environment in an intuitive
way based on their perception of the business and using common business
terms. It is also responsible for the presentation of the analysis results and for
the publishing of the results in a workgroup environment. Recently Internet
technology has been adopted for these purposes, enabling all of the
operations described above from standard Web browsers. HTML and Java
are used to enable a low-maintenance thin-client presentation layer. Ideally,
monitoring the operation of the Business Intelligence environment and
searching for metadata should be possible over the Web.
Visual Warehouse provides Web interfaces for the operation of the
environment, such as work in progress information, as well as for searching
and navigating through the integrated metadata repository.
Analysis front-end tools, such as Cognos PowerPlay, Business Objects, Brio
Enterprise, and Hyperion Wired for OLAP, offer Web browser clients for
integrating the Business Intelligence solution into the corporate intranet or the
extranet.
Groupware, such as Lotus Notes, can be used to distribute analysis results
and parts of the analysis process within a workgroup, using Object Linking
and Embedding (OLE) technologies in conjunction with the familiar mailing
function.
3.3.2 Additional Requirements for an End-to-End Architecture
In addition to the topics listed above, the solution has to be able to grow with
the business. Because Business Intelligence projects are iterative and
evolutionary, the architecture of the solution has to be scalable in terms of
data volumes, number of users, and workload. Therefore it has to be able to
utilize additional processing power and/or different platforms, and it has to
provide options for data partitioning, placement of data and processes, and
efficient storage management.
IBM’s Business Intelligence framework is one of the most comprehensive
offerings today. It enables the design of scalable end-to-end Business
Intelligence solutions with a very strong emphasis on integrating
state-of-the-art technology from different vendors (for example, Hyperion,
ETI, Vality, Cognos, Business Objects, Brio and IBM). The core of this
integration framework for Business Intelligence solutions is IBM’s Visual
Warehouse.
3.4 Visual Warehouse
Visual Warehouse is an integrated product for building and maintaining a data
warehouse or data mart in a LAN environment. Visual Warehouse does not
simply create a data warehouse or an informational database; it provides the
processes to define, build, manage, monitor, and maintain an informational
environment. It integrates many of the Business Intelligence component
functions into a single product. It can be used to automate the process of
bringing data together from heterogeneous sources into a central, integrated,
informational environment.
Visual Warehouse can be managed either centrally or from the workgroup
environment. Therefore, business groups can meet and manage their own
information needs without burdening information systems resources, enjoying
the autonomy of their own data mart without compromising overall data
integrity and security in the enterprise.
3.4.1 Data Sources Supported
Visual Warehouse provides the capability to extract and transform data from
a wide range of heterogeneous data sources, either internal or external to the
enterprise, such as the DB2 family, Oracle, Sybase, Informix, Microsoft SQL
Server, VSAM, IMS, and flat files (for example, from spreadsheets). Data
from these sources is extracted and transformed based on metadata defined
by the administrative component of Visual Warehouse. The extract process,
which supports full refreshes of data, can run on demand or on an automated
scheduled basis.
3.4.2 Data Stores Supported
The transformed data can be placed in a data warehouse built on any of the
DB2 UDB platforms, including DB2 for Windows NT, DB2 for AIX, DB2 for
HP-UX, DB2 for Sun Solaris, DB2 for SCO, DB2 for SINIX, DB2 for OS/2,
DB2 for OS/400, and DB2 for OS/390, or on flat files. Visual Warehouse
provides the flexibility and scalability to populate any combination of the
supported databases.
Visual Warehouse also supports Oracle, Sybase, Informix, and Microsoft SQL
Server using IBM DataJoiner.
3.4.3 End User Query Tools
Once the data is in the target data warehouse, it is accessible by a variety of
end user query tools. Those tools can be from IBM, such as Lotus Approach
or QMF for Windows, or from any other vendors whose products comply with
the DB2 Client Application Enabler (CAE) or the Open Database Connectivity
(ODBC) interface, such as Business Objects, Cognos Impromptu, and Brio
Query. The data can also be accessed using a popular Web browser with
additional Web infrastructure components.
3.4.4 Metadata Management
Visual Warehouse stores all the metadata in its control database and is
integrated with DataGuide, IBM’s metadata management tool, which is part of
the Visual Warehouse solution. The data warehouse model, which defines the
structure and contents of the data warehouse, is stored in the metadata
repository. For each data source to be accessed, Visual Warehouse first
extracts the metadata that describes the contents of the data source and
places it in the metadata repository. This metadata is then used to extract,
filter, transform, and map the source data to the data warehouse.
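The idea of extracting source metadata from a database catalog before mapping any data can be sketched in Python. This is an illustration only and not how Visual Warehouse is implemented: it uses an in-memory SQLite database and the SQLite-specific PRAGMA table_info statement, whereas other database systems expose their own catalog tables or views. The function name extract_table_metadata and the orders table are hypothetical.

```python
import sqlite3


def extract_table_metadata(conn, table):
    """Read column names and declared types for one table from the catalog.

    PRAGMA table_info is specific to SQLite. The table name is inserted
    directly into the statement, so it must come from a trusted source.
    """
    columns = conn.execute(f"PRAGMA table_info({table})").fetchall()
    # Each row is (cid, name, type, notnull, dflt_value, pk).
    return [
        {"column": name, "type": col_type}
        for _cid, name, col_type, *_rest in columns
    ]


# A stand-in for a registered data source.
source = sqlite3.connect(":memory:")
source.execute(
    "CREATE TABLE orders (order_id INTEGER, customer TEXT, amount REAL)"
)

orders_metadata = extract_table_metadata(source, "orders")
for entry in orders_metadata:
    print(entry["column"], entry["type"])
```

Once the column descriptions are held in a repository, the extract, filter, and mapping definitions can refer to them instead of to the source system itself.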
The metadata of Visual Warehouse can then be transferred to the Information
Catalog managed by DataGuide. With DataGuide, users can create an
Information Catalog, which contains graphical representations of the
metadata. DataGuide can be integrated with DB2 CAE entitled decision
support tools, which can be used to view the metadata specific to an object of
interest in the DataGuide Information Catalog.
3.5 The Architecture of Visual Warehouse
The Visual Warehouse architecture provides a fully distributed Client/Server
system that lets users reap the benefits of network computing. The
architecture consists of the following major components:
• Server
• Administrative Clients
• Agents
• Control Database
• Target Databases
3.5.1 Visual Warehouse Server
Visual Warehouse Server, which runs on a Windows NT workstation or
server, controls the interaction of the various data warehouse components
and automates data warehousing processes through a powerful scheduling
facility, which allows calendar-based as well as event-based scheduling.
The server component monitors and manages the data
warehousing processes. It also controls the activities performed by the Visual
Warehouse agents.
3.5.2 Visual Warehouse Administrative Clients
The Administrative Client, which also runs on a Windows NT workstation or
server, provides an interface for administrative functions, such as defining the
business views, registering data resources, filtering source data, defining the
target data warehouse databases, managing security, determining the data
refresh schedules, and monitoring the execution of the data warehouse
processes. Visual Warehouse can support an unlimited number of
administrative clients and provides comprehensive security facilities to
control and manage client access to the administrative functions.
3.5.3 Visual Warehouse Agents
Visual Warehouse agents handle access to the source data, filtering,
transformation, subsetting, and delivery of transformed data to the target
warehouse under the direction of the Visual Warehouse Server.
Visual Warehouse agents run on Windows NT, OS/2, AS/400, AIX, and Sun
Solaris. Visual Warehouse supports an unlimited number of agents. Because
multiple agents can participate in the population of a data warehouse, the
throughput can significantly increase when multiple agents act
simultaneously. The agents primarily use ODBC drivers as the means of
communicating with different data sources and targets.
The Visual Warehouse agents architecture is a key enabler for scalable
Business Intelligence solutions.
3.5.4 Visual Warehouse Control Database
A control database must be set up in DB2 to be used by Visual Warehouse to
store control information used by the Visual Warehouse Server. The control
database stores all the metadata needed to build and manage the
warehouse. The information in the control database includes the mappings
between the source and target data, the schedules for data refresh, the
Business Views, and operational logs. The control database is managed by
the Visual Warehouse Administrator and used by the Visual Warehouse
agents. When a request for service is made to the Visual Warehouse Server,
the control information pertinent to that request is retrieved from the control
database and sent to the appropriate agent that actually provides the service.
Note that different warehouses can use different control databases.
Advanced DB2 features, such as triggers and stored procedures, can be
used in conjunction with the Visual Warehouse control data to provide an
advanced operating environment. For instance, DB2 triggers can be used to
monitor log inserts and to send out alert signals through DB2 stored
procedures when a certain event occurs.
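The trigger technique can be sketched as follows. The example is an assumption-laden stand-in: it uses an in-memory SQLite database driven from Python rather than DB2, the tables operation_log and alerts are hypothetical and do not reflect the real layout of the Visual Warehouse control database, and writing a row to an alerts table takes the place of the stored procedure that would send the alert signal.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical log and alert tables, plus a trigger that watches log inserts.
conn.executescript("""
    CREATE TABLE operation_log (step TEXT, status TEXT);
    CREATE TABLE alerts (message TEXT);

    CREATE TRIGGER log_failure AFTER INSERT ON operation_log
    WHEN NEW.status = 'FAILED'
    BEGIN
        INSERT INTO alerts VALUES ('Step ' || NEW.step || ' failed');
    END;
""")

# Normal operation writes log rows; only the failed step raises an alert.
conn.execute("INSERT INTO operation_log VALUES ('load_sales', 'OK')")
conn.execute("INSERT INTO operation_log VALUES ('calc_sales', 'FAILED')")

alerts = [row[0] for row in conn.execute("SELECT message FROM alerts")]
print(alerts)
```

Because the trigger fires inside the database, no polling process is needed to detect the event.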
3.5.5 Visual Warehouse Target Databases
Target databases in a data warehouse contain the Visual Warehouse data
stored in structures defined as Business Views (BVs). When Visual
Warehouse populates a BV, data is extracted from the source, transformed
according to the rules defined in the BV, and then stored in the target
database. Multiple databases can be used as target databases for a data
warehouse.
Chapter 4. Implementing a Multidimensional Model
In this chapter we describe how to implement a basic multidimensional model
by using DB2 OLAP Server or Hyperion Essbase. First we define the model
manually, using the Outline Editor. Then we show how to build the model,
using external input files or tables and so-called Data Load Rules. After
defining the model, we lead you through the process of loading and
calculating it. However, at this stage, we present only the default calculation.
For more comprehensive coverage of the available calculation options see
Chapter 6, “A Closer Look at Calculating the OLAP Database” on page 135.
4.1 Introduction to the TBC Sales Model
The example we use throughout this book is based on a hypothetical
company in the beverage industry. We refer to the company as TBC or The
Beverage Company. The company’s major products are various kinds of
drinks (for example, fruit drinks, cream sodas, and colas). These products are
sold in U.S. markets, which are categorized by city, state, and region. Our
model will be used to analyze financial data, such as sales and cost of goods
sold, and quantities sold. The data is collected monthly and is summarized by
quarter and year. An initial dimensional model is shown in Figure 7 on page
42 based on the notation technique introduced in the redbook Data Modeling
Techniques for Data Warehousing, SG24-2238.
[Figure 7 diagram: the TBC Sales fact (Product-Key, Market-Key, Time-Key)
surrounded by its dimensions:
Products: Product Group > Product
Markets: Region > State > City
Time: Year > Quarter > Month
Measures: Quantity sold, Sales, COGS (cost of goods sold),
Profit (= Sales - COGS)]
Figure 7. Initial Dimensional Model Representation of TBC Sales
From an overall perspective, to build and load a DB2 OLAP (or Essbase)
cube representing the model shown in Figure 7, there are three main steps:
1. Define the model
This can be accomplished in two different ways:
• Manually (see 4.2, “Building the Database Outline Manually” on page 43
for details)
or
• Automatically, driven by data from an external source (see 4.3, “Building
the Database Outline Dynamically” on page 53)
2. Load the data
3. Calculate the data
4.2 Building the Database Outline Manually
In this section we explain how to build the simple TBC sales model manually.
Follow these steps:
1. Start the DB2 OLAP Server
2. Open the DB2 OLAP Server Application Manager
3. Connect to the DB2 OLAP Server
4. Create a new application
5. Create a new database
6. Create the dimensions
7. Create the members in each dimension
8. Define alias and member definitions
9. Save the model
To start DB2 OLAP Server, either open a command window and enter:
Essbase
followed by the DB2 OLAP Server password, and press Enter, or select
Programs => DB2 OLAP Server => Start Essbase Server from the
Windows NT Start menu.
(Note that DB2 OLAP Server can also be started automatically as a Windows NT
service.)
Then open the DB2 OLAP Administration window by selecting Programs =>
DB2 OLAP Server => Essbase Application Manager from the Windows NT
Start menu and connect to the DB2 OLAP Server by choosing Server =>
Connect... (see Figure 8).
Figure 8. Connect to the Server
Click on File => New => Application... in the Application Manager window to
create an application called TBC on the server (see Figure 9 and Figure 10).
Figure 9. Create a New Application
Figure 10. Create a New Application (continued)
Note that the Location selection identifies where DB2 OLAP Server is to place
the system control files for the application. Each application has its own
subdirectory within the Essbase directory. Note that the relational tables for
the cube are not created in DB2 until we save the outline later in this section.
After creating the TBC application, we have to create a database called
Simple in TBC. This will create the database system control files on the
server. Note that we are creating a Normal Database, not a Currency
Database. A Currency Database enables multicountry currency conversions.
Click on File => New => Database... in the Application Manager window to
create a database called Simple on the server (see Figure 11 and Figure 12).
Figure 11. Create a Database
Figure 12. Create a Database (continued)
Now that we have a database, we can create the outline. DB2 OLAP calls the
structure of the cube an outline. It essentially consists of the dimensions, the
members of each dimension, and the facts as discussed in 3.2, “Architecture
and Concepts of DB2 OLAP Server and Essbase” on page 22.
To create the main dimensions, select Application TBC => Database
Simple. Select the Database Outlines icon and then Open.
Across the top of the Outline Editor are a number of icons whose meaning is
displayed in text at the bottom of the Application Manager window. Select the
Add a child to the selected member icon, as indicated by the cursor
position in Figure 13, and enter Product as the first dimension name.
Figure 13. Create a Dimension
After you press Enter, you can add additional dimensions at the same level.
So enter Market, Year, and Measures as three additional dimensions. These are
then the main dimensions in our cube.
By default, the structure and operational capabilities of the dimensions are
treated the same in DB2 OLAP Server (see also OLAP rule, “6. Generic
Dimensionality” on page 7). However, DB2 OLAP Server has two commonly
used specific dimension types, a Time dimension to enable time series type
calculations of data, and an Accounts dimension for numerical facts. Specify
the Year dimension as time-based by highlighting Year => Time Dimension
Type on the second row of icons, and the Measures dimension as
account-based by highlighting Measures => Account Dimension Type on
the second row of icons (see Figure 13). You should see Time and Accounts
in red appear next to Year and Measures respectively (see Figure 14).
Now we need to enter each member within a dimension. This is effectively
building the hierarchies within the dimensions. Highlight Product, and click
on the Add a child to the selected member icon as indicated in Figure 14.
Enter product groups of 100, 200, 300, and 400 as members of the Product
dimension (see Figure 14).
Figure 14. Create a New Member
Then create the next level of the hierarchy under product group 100 by
highlighting product group 100 and clicking on the Add a child to the
selected member icon. Enter product classes of 100-10, 100-20, and 100-30
as children of product group 100 in a similar fashion.
Additionally, for the Measures dimension hierarchy, enter Quantity Sold, and
then in a hierarchy enter Profit, Sales, and COGS (cost of goods sold) as
indicated in Figure 15. Note that in loading data into the cube, it is the
Measures dimension into which numerical data is loaded, effectively
equivalent to the measures in a star-schema fact table.
As we add members, notice that, as a default, they are consolidated as
additions in the hierarchies. Thus, at this point we will have defined Profit as
Sales + COGS (which is wrong from a business perspective). We will change
this shortly.
Figure 15. Create Measures
What we have entered so far is a very simple model where the values for all
members in a hierarchy are aggregated to a total number at the top level
member of the hierarchy.
However, there are other attributes of members that we need to enter, such
as aliases for the product groups and product classes. Aliases are a useful
feature for making the information in the model understandable and readable
across user communities in the enterprise. Although the users of one
department might prefer to work with product numbers, such as 100 and
100-20, this may not be very meaningful to users of another department, to
whom Colas and Root Beer make more sense. Specify attributes for each
member by highlighting the member and selecting the Define the attributes
for the selected member icon.
In the Outline Editor, select product group 100 and click the Define the
attributes for the selected member icon.
Figure 16. Define Alias and Store Data
As you can see in Figure 16, there are many attribute definitions. Enter Colas
as the Alias for product group 100. Some of the most common settings used
for attributes in this panel are: for Consolidation, addition, subtraction, and
ignore, and for Data Storage, Store Data, Dynamic Calc and Store, Dynamic
Calc, Label Only, and Shared Member. These attributes are described below.
• Addition - Adds the member to the result of previous calculations
performed on other members of the hierarchy
• Subtraction - Multiplies the member by -1 and then adds it to the sum of
previous calculations performed on other members of the hierarchy
• Ignore - Does not use the member in consolidation to its parent
• Store Data - Stores the data value with the member
• Dynamic Calc - Does not calculate the data value until a user requests it.
After presenting the result to the user, the value is discarded.
• Dynamic Calc and Store - Does not calculate the data value until a user
requests it. Then it stores the data value.
• Label Only - Creates members for navigation only, that is, members
contain no data values
• Shared member - Shares values between members. This enables one
member to be consolidated in two or more different hierarchies.
Now, we can also alter the COGS attributes and change the consolidation
from addition to subtraction, so that Profit is now Sales - COGS.
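The effect of the consolidation attribute can be sketched in a few lines. This is a simplified model of the addition, subtraction, and ignore settings described above, not the server's calculation engine; the function name and the sample values are hypothetical.

```python
def consolidate(children):
    """Roll child values up to their parent.

    children: list of (name, operator, value) tuples, where operator is
    '+' (addition), '-' (subtraction), or '~' (ignore).
    """
    total = 0.0
    for _name, op, value in children:
        if op == "+":
            total += value
        elif op == "-":
            total += -1 * value  # multiply by -1, then add
        elif op == "~":
            continue  # member is not used in the parent's total
        else:
            raise ValueError(f"unsupported operator: {op!r}")
    return total


# Hypothetical sample values.
default_profit = consolidate([("Sales", "+", 1000.0), ("COGS", "+", 400.0)])
fixed_profit = consolidate([("Sales", "+", 1000.0), ("COGS", "-", 400.0)])
print(default_profit, fixed_profit)
```

With the default addition attribute, Profit is overstated as Sales + COGS. After COGS is switched to subtraction, Profit becomes Sales - COGS.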
Note that the panel shown in Figure 16 also provides a quicker way of
manually entering dimension members, through the Add Sibling... and Add
Child... buttons, and the Prev... and Next... buttons.
See the resulting dimensions defined in the outline in Figure 17.
Figure 17. Resulting Dimensions
In 1.2, “What Is OLAP and Why Is It So Successful?” on page 5, we discuss
defining dimensions as dense and sparse. As a default, when you create a
new dimension, the dimension is defined as a dense dimension, and for this
simple model, this is all we have done.
To review the dense and sparse settings of the dimensions and set them
according to the design requirements, select Settings => Data Storage from
the Essbase Application Manager window, as indicated in Figure 18.
Figure 18. Default of Dense Dimensions
On the Data Storage pop-up window, we can review the recommended
configurations by selecting the Recommend>> button. This shows the number
of potential blocks and the bytes per block for different dense and sparse
configurations of the defined dimensions (see Figure 19).
Note that DB2 OLAP Server, however, knows only our outline, and not the
input data we will be loading or the characteristics of that data. Thus any
recommendations or ratings provided should not be taken at face value. A full
requirements and design process in designing the cube should be performed,
and it is this process that is the basis of the dense and sparse settings for the
cube.
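The arithmetic behind such figures can be approximated as follows. This sketch assumes that one data block holds a cell for every combination of members of the dense dimensions, that each cell occupies 8 bytes, and that one block can exist for every combination of members of the sparse dimensions. The member counts are hypothetical, and the server's own numbers may differ (for example, because some members are not stored).

```python
from math import prod

CELL_BYTES = 8  # assumption: one 8-byte value per cell


def storage_estimate(member_counts, dense):
    """Estimate bytes per block and potential blocks for a dense/sparse split."""
    dense_counts = [n for dim, n in member_counts.items() if dim in dense]
    sparse_counts = [n for dim, n in member_counts.items() if dim not in dense]
    return {
        "bytes_per_block": prod(dense_counts) * CELL_BYTES,
        "potential_blocks": prod(sparse_counts),
    }


# Hypothetical member counts per dimension.
counts = {"Year": 17, "Measures": 4, "Product": 20, "Market": 30}

all_dense = storage_estimate(counts, dense=set(counts))
mixed = storage_estimate(counts, dense={"Year", "Measures"})
print(all_dense)
print(mixed)
```

Leaving every dimension dense yields a single very large block, whereas making Product and Market sparse yields many small blocks. Which split is appropriate depends on how the data is actually distributed, which is exactly what the outline alone cannot tell us.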
The whole area of configuring dense and sparse appropriately and the impact
of dense and sparse on performance are covered in Chapter 9,
“Performance” on page 219.
Figure 19. Dense and Sparse - Possible Configurations
Select File => Save As... from the Application Manager window to save the
outline to the server, as shown in Figure 20.
It is at this stage that the relational tables are created in DB2, with a separate
table for each dimension. The member names are inserted into these tables,
along with their corresponding attributes, such as their alias
name and whether they are additive or subtractive in consolidation. For a full
description of the underlying DB2 tables, see 15.2, “DB2 OLAP Server
Tables” on page 313.
Figure 20. Saving the Outline
As you can see, the process of manually entering the outline can be time
consuming and subject to typing errors. It is therefore not suitable for large
dimensions with many members.
4.3 Building the Database Outline Dynamically
DB2 OLAP Server provides the ability to load dimensions from external
sources, like database tables or flat files. Most of the member attributes we
defined in 4.2, “Building the Database Outline Manually” on page 43 can be
loaded dynamically.
When considering dimensions that represent, for example, products, markets
or customers, which may have hundreds or thousands of members, manual
creation is no longer feasible. In addition the structure and the hierarchies of
such dimensions are subject to change, for example, new products are
introduced or new product groups are defined. It is therefore necessary to be
able to build and maintain the outline dynamically.
In this section we explain how to create an OLAP Server database outline
dynamically with the help of Visual Warehouse Business Views and flat files
by using Data Load Rules.
The external tables or files contain the information necessary to define the
structure of each of the dimensions. Each single member of the dimension is
associated with its corresponding position within the hierarchy. This mapping
can be done using three different mechanisms:
• Each column in the input data set is associated with its corresponding
level number within the dimension hierarchy, starting bottom-up with 0 for
the most detailed level (or leaf-level) and counting up until the highest
aggregation level is reached, as shown in Figure 21 on page 54. This
reference method is called level reference.
• Each column in the input data set is associated with its corresponding
generation number within the hierarchy, starting top-down with 1 for the
most aggregated level of data and counting until the most detailed level of
data is reached as shown in Figure 22 on page 55. This reference method
is called generation reference.
• The records of the input file contain two columns describing direct
parent/child references as shown in Figure 23 on page 55.
Input data set:
West    California    San Francisco
West    California    Los Angeles
West    California    San Diego
West    Arizona       Phoenix
2       1             0                Level-Reference Number
Hierarchy:
West                  2
  California          1
    San Francisco     0
    Los Angeles       0
    San Diego         0
  Arizona             1
    Phoenix           0
Figure 21. Level References
Input data set:
West    California    San Francisco
West    California    Los Angeles
West    California    San Diego
West    Arizona       Phoenix
1       2             3                Generation-Reference Number
Hierarchy:
West                  1
  California          2
    San Francisco     3
    Los Angeles       3
    San Diego         3
  Arizona             2
    Phoenix           3
Figure 22. Generation References
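Level and generation references describe the same hierarchy from opposite ends. The following sketch, which is not the server's build logic, reads the sample rows of Figure 21 and Figure 22 using generation numbers and then derives the level numbers. The function names are hypothetical, and the level computation assumes a balanced hierarchy in which every leaf sits at the same depth.

```python
rows = [
    ("West", "California", "San Francisco"),
    ("West", "California", "Los Angeles"),
    ("West", "California", "San Diego"),
    ("West", "Arizona", "Phoenix"),
]


def build_from_generations(rows):
    """Return {member: (parent, generation)}; the first column is generation 1."""
    members = {}
    for row in rows:
        parent = None
        for generation, name in enumerate(row, start=1):
            # The first occurrence of a member defines its position.
            members.setdefault(name, (parent, generation))
            parent = name
    return members


def to_levels(members):
    """Convert generation numbers to level numbers (leaf level = 0)."""
    depth = max(generation for _parent, generation in members.values())
    return {name: depth - generation
            for name, (_parent, generation) in members.items()}


members = build_from_generations(rows)
levels = to_levels(members)
print(levels)
```

West comes out at level 2, the states at level 1, and the cities at level 0, which matches the numbering in Figure 21.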
Input data set:
Parent        Child
West          California
West          Arizona
California    San Francisco
California    Los Angeles
California    San Diego
Arizona       Phoenix
Hierarchy:
West
  California
    San Francisco
    Los Angeles
    San Diego
  Arizona
    Phoenix
Figure 23. Parent/Child References
Dimension descriptions and attributes can also be defined using this
technique.
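For parent/child references, the hierarchy of Figure 23 can be reconstructed from the pairs alone. The sketch below is illustrative only; the helper names build_tree and render are hypothetical.

```python
from collections import defaultdict

pairs = [
    ("West", "California"),
    ("West", "Arizona"),
    ("California", "San Francisco"),
    ("California", "Los Angeles"),
    ("California", "San Diego"),
    ("Arizona", "Phoenix"),
]


def build_tree(pairs):
    """Group children under their parents and find the top-level members."""
    children = defaultdict(list)
    for parent, child in pairs:
        children[parent].append(child)
    all_children = {child for _parent, child in pairs}
    roots = [parent for parent in children if parent not in all_children]
    return roots, children


def render(node, children, depth=0):
    """Return the hierarchy below node as a list of indented lines."""
    lines = ["  " * depth + node]
    for child in children.get(node, []):
        lines.extend(render(child, children, depth + 1))
    return lines


roots, children = build_tree(pairs)
outline = render(roots[0], children)
print("\n".join(outline))
```

Unlike the level and generation methods, the input does not need one column per hierarchy level, so the same two-column layout works for hierarchies of any depth.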
Ideally, there is one input data set (flat file or table) for each dimension of the
model. We will refer to an input data set used to build a dimension as a
launch table, and we will use Visual Warehouse Business Views to prepare
launch tables for our multidimensional data mart in DB2.
The mapping of the columns to the corresponding references is established
through the Data Load Rules. They are maintained by using the Data Prep
Editor.
4.3.1 Building the Product Dimension (Using Level References)
In this example, we assume that the outline already contains an entry for the
base Product dimension. Therefore we have to associate the data in the
launch table with this dimension in the outline.
We use an SQL table, defined as a Visual Warehouse Business View within
the relational storage area of our data mart as the launch table for the
Product dimension. The SQL table is called IWH.ALL_PRODUCTS and is a
kind of product master table that contains one row for each product (see
Figure 24). For a description of the Business Views that relate to the
ALL_PRODUCTS table, refer to 5.3, “Business Views Used for the TBC Sales
Model” on page 113.
PROD_CLASS_CODE  PROD_CLASS_DESC  PROD_GRP_CODE  PROD_GRP_DESC  ...
100-10           Kool Cola        100            Cola           ...
100-20           Diet Cola        100            Cola           ...
...              ...              ...            ...
300-10           Dark Cream Soda  300            Cream Soda     ...
Figure 24. The All Products Business View (Sample Contents)
To create a Data Load Rule for the Product dimension:
1. Connect to the server.
2. Select the database and application.
3. Click the Data Load Rules button.
4. Click the New button on the server window.
This opens the Data Prep Editor.
5. Click File in the Application Manager window and select the Open SQL
entry.
6. Verify the correct entries in the Select Server, Application and Database
window and click OK (see Figure 25).
Figure 25. Essbase Application Manager
7. Use the Define SQL Window to define the database and the select
statement to retrieve the relevant columns from the launch table for the
Product dimension.
Each selected row from the ALL_PRODUCTS table corresponds to a member
in the Product dimension of the outline. To avoid duplicate rows, which are
rejected during the dimension build, we included the distinct clause in our
SQL statement (see Figure 26).
Figure 26. Define SQL Statement
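The statement itself appears only in the figure, so the following is a guess at its shape rather than a copy. The column names are taken from Figure 24; the PKG_TYPE column is a hypothetical extra column added to produce duplicates, and an in-memory SQLite database stands in for DB2.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Attach a second in-memory database so the IWH qualifier can be used.
conn.execute("ATTACH DATABASE ':memory:' AS IWH")
conn.execute("""
    CREATE TABLE IWH.ALL_PRODUCTS (
        PROD_CLASS_CODE TEXT, PROD_CLASS_DESC TEXT,
        PROD_GRP_CODE TEXT, PROD_GRP_DESC TEXT,
        PKG_TYPE TEXT  -- hypothetical column, not taken from the book
    )""")
conn.executemany(
    "INSERT INTO IWH.ALL_PRODUCTS VALUES (?, ?, ?, ?, ?)",
    [
        ("100-10", "Kool Cola", "100", "Cola", "Can"),
        ("100-10", "Kool Cola", "100", "Cola", "Bottle"),
        ("100-20", "Diet Cola", "100", "Cola", "Can"),
        ("300-10", "Dark Cream Soda", "300", "Cream Soda", "Bottle"),
    ],
)

# DISTINCT collapses rows that differ only in columns we do not select.
rows = conn.execute("""
    SELECT DISTINCT PROD_CLASS_CODE, PROD_CLASS_DESC,
                    PROD_GRP_CODE, PROD_GRP_DESC
    FROM IWH.ALL_PRODUCTS
    ORDER BY PROD_GRP_CODE, PROD_CLASS_CODE
""").fetchall()
for row in rows:
    print(row)
```

The two Kool Cola rows collapse into one, so each member reaches the dimension build exactly once.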
When you click the OK/Retrieve button, you will get the SQL Connect
window. If already logged on with the correct user ID and password, you can
simply click the OK button; otherwise fill in the appropriate user ID and
password and click OK. The Data Prep Editor should now contain the result
of the query (see Figure 27 on page 59).
Figure 27. Result in Data Prep Editor
Note: The error message you may get (see Figure 28) can be misleading: the
connection to the server may be working while the SQL statement contains an
error. Check the DB2 OLAP Server Application Log File to investigate the
cause of the error (refer to 10.4, “DB2 OLAP Server” on page 259).
Figure 28. SQL Access Error Message
Having retrieved the layout of our Product dimension, we will define the
rules that apply for loading the data into the outline.
First we have to associate an outline with our Data Load Rules.
In the Data Prep Editor, click the Associate Outline icon (see Figure 29) and
select the correct Server, Application and Database from the list boxes. The
object type Outlines should automatically be selected. A double-click on the
object brings us back to the Data Prep Editor.
Figure 29. Data Prep Editor Icons
At this stage, also make sure that the View Dimension Building Fields
button is clicked.
Next we have to associate field attributes with the input file. This is where we
associate the columns of the SQL result with the corresponding structure in
the dimension definition of the outline. We use level references, as described
above. Level 0 is always the lowest level of the hierarchy or the leaf level.
That is also the level where the data is loaded during the load operation. At
the higher levels the data is aggregated during the calculation phase of the
cube building process. In our example we have the PROD_CLASS_CODE
(for example, 100-10) as the lowest level and the PROD_GRP_CODE (for
example, 100) as the next higher aggregation level of the Product hierarchy.
Click the Define Attributes icon (see Figure 29 on page 60). This will open
the Field Attributes window (Figure 30 on page 61). In our case we can use
the default setting for the Global Attributes. Click the Dimension Building
Attributes tab to define the type and the corresponding dimension for each
column in the input table. Make sure to select the appropriate dimension from
the outline. The aliases (descriptions) for a level or member can be
associated by using the same level reference number. As we have defined no
consolidation attribute for the fields, the default attribute, +, is used, which
adds the measures to the current total during the calculation of the
dimension.
Figure 30. Dimension Building Attributes
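Conceptually, the field attributes form a mapping from each input column to a field type and a level number. The sketch below models that mapping for the columns of Figure 24 and applies it to one record. The FIELD_ATTRIBUTES structure and the function name are hypothetical; the rules file stores this information in its own format.

```python
# (column name, field type, level number): a hypothetical encoding of the
# settings made on the Dimension Building Attributes tab.
FIELD_ATTRIBUTES = [
    ("PROD_CLASS_CODE", "level", 0),
    ("PROD_CLASS_DESC", "alias", 0),
    ("PROD_GRP_CODE", "level", 1),
    ("PROD_GRP_DESC", "alias", 1),
]


def apply_field_attributes(record, dimension="Product"):
    """Turn one input record into outline members, lowest level first."""
    names, aliases = {}, {}
    for (_column, field_type, level), value in zip(FIELD_ATTRIBUTES, record):
        (names if field_type == "level" else aliases)[level] = value
    top = max(names)
    members = []
    for level in sorted(names):
        # The highest level hangs directly under the dimension itself.
        parent = names[level + 1] if level < top else dimension
        members.append({
            "name": names[level],
            "alias": aliases.get(level),
            "parent": parent,
            "level": level,
            "consolidation": "+",  # default when no attribute field is given
        })
    return members


members = apply_field_attributes(("100-10", "Kool Cola", "100", "Cola"))
for member in members:
    print(member)
```

The alias is attached to the member that carries the same level number, which is why PROD_CLASS_DESC and PROD_CLASS_CODE share level 0.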
Now use the Dimension Build Settings dialog box to set options that apply
to all the dimensions we build dynamically.
The Global Settings tab should already be selected. As we do not use an
alias table to store the aliases for member names, we can use the default
setting. If the list box is empty, click the Outline button to associate the
dimension build rules file with an outline. In the Data Configuration option box
we define whether to use our own settings to specify that the data be dense
or sparse, or let Essbase choose these settings. For our TBC Product
dimension we set it to dense. For a detailed discussion of setting dimensions
dense and sparse see Chapter 9, “Performance” on page 219.
Figure 31. Dimension Build Settings
With the Dimension Build Settings, we specify which of the methods
described above we want to use to map the columns of the launch table to
specific levels within the dimension hierarchy during dynamic dimension
build.
To open this dialog box, click the Dimension Build Settings tab. If the
Dimension list box is empty, click the Outline button and associate the rules
file with an outline. Essbase populates the Dimension list box with the
dimensions of the associated outline. Select the corresponding build method
(in our case we use the level reference method) for the Product dimension.
In the Existing Members option box we can specify how Essbase changes
existing members in the outline. In our example we activate the Allow
Attribute Changes checkbox to allow attribute changes of existing members
if new attributes are specified in the data source.
As the data may not always be sorted in the source and we may merge the
source with existing members, we can choose to sort the members in
ascending order (Member Sorting option box). In the Members Update option
box the default option merges new members with existing members in the
outline.
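The interplay of these three options can be modeled roughly as follows. This is a simplified illustration of merging, attribute changes, and sorting as described above, with a hypothetical function name and hypothetical sample aliases; it does not reproduce the server's exact behavior.

```python
def merge_members(existing, incoming, allow_attribute_changes=True, sort=True):
    """Merge incoming members into the existing ones.

    existing, incoming: {member name: alias}. Returns a new dict.
    """
    merged = dict(existing)
    for name, alias in incoming.items():
        if name not in merged:
            merged[name] = alias  # new member is merged into the outline
        elif allow_attribute_changes:
            merged[name] = alias  # existing member takes the new attribute
    if sort:
        merged = dict(sorted(merged.items()))  # ascending by member name
    return merged


# Hypothetical sample data.
existing = {"100-20": "Diet Cola", "100-10": "Cola"}
incoming = {"100-10": "Kool Cola", "100-30": "Caffeine Free Cola"}

updated = merge_members(existing, incoming)
frozen = merge_members(existing, incoming, allow_attribute_changes=False)
print(updated)
print(frozen)
```

With attribute changes allowed, member 100-10 picks up the alias from the data source. With the option switched off, it keeps the alias already in the outline, while the new member 100-30 is added in both cases.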
Now click the Dimension Definition tab. Here we can specify whether the
dimension is derived from an existing outline or defined in the Load Rules.
When you create new dimensions you must define them in the Load Rules.
For our example we used the outline.
The next and last step of our Data Load Rules definition is to verify the
correct settings of all parameters in the database outline. To open the Verify
Outline dialog box click on the Verify icon (see Figure 29). If everything is
correct you will get a message box with an OK message, which means that all
mandatory information to build the dimension outline is available and
syntactically correct. If this is not the case, the Item Name list box lists all
members that are potentially in error. The Errors list box provides an error
description or warning message for the selected member.
After successful verification, save the Load Rules.
Now we are ready to load the outline. In the Essbase Application Manager
Window select Database => Load data. From the Data Load dialog box (see
Figure 32) we can load data from an external data source to an Essbase
database outline or build dimensions in an existing outline, using external
data.
Figure 32. Data Load
Fill in the appropriate information for the Server, the Application, and the
Database. In the Type option box select the kind of data source to load. In our
case, we are loading data from SQL. If necessary enter the SQL user name
and password.
When loading SQL data sources you must use Data Load Rules. The Use
Rules checkbox tells Essbase to use Load Rules to load the data source and
activates the Find... button. Click the Find... button to activate the Open
Server Rules Object window. Here you can select the appropriate Load
Rules, in our case Products. Click OK to go back to the Data Load window.
The Error Output File text box lists the location of the error log file. Any file
can be specified. If this text box is blank, errors are not captured!
In the Options group box check Modify Outline. This tells Essbase to use the
associated Load Rules to make changes to the outline during the load by
adding or changing members and dimensions found in the data source for the
outline and described in the Load Rules.
As we do not want to load data at the moment, the Load Data checkbox must
be unchecked. When the Interactive checkbox is checked, you are prompted
each time a data source fails to load. Essbase tells you which data source
failed to load and asks whether you want to continue reading the remaining
data sources.
When you click the OK button, the outline should load with the Product Load
Rules and the Visual Warehouse Business View acting as a launch table.
After a successful load, the outline for the Product dimension should look
like Figure 33 on page 65.
Figure 33. Product Outline after Loading Product Rules
4.3.2 Building an Alternative Aggregation Path
We have now dynamically created the Product dimension including all
products arranged in a hierarchical structure. However, it is often necessary
to analyze the sales data based on a different grouping for certain products
and aggregated in a different way. Therefore we want to introduce an
alternative aggregation path in addition to the hierarchy just created. To do
that we can use shared members. Shared members can be used to calculate
the same member across multiple groups, that is, one individual member can
be referenced in more than one aggregation.
In our TBC sales analysis example, we want to define a separate group
containing all of the Diet drinks. Therefore we want to add all Diet drinks
within the Product dimension to a new product group and total it on this level,
but we do not want the total to be propagated to the Product dimension.
We use the Visual Warehouse Business View called DIET_PRODUCTS to
load the outline for the shared Diet members. To do this we have to create a
new Data Load Rule. Use the same steps you need to create the Data Load
Rules for the standard Product dimension until you have to define the SQL
statement.
Figure 34 shows the SQL statement to use. After retrieving the data from the
DIET_PRODUCTS table we have to define the Dimension Building Attributes
(within the Field Attributes). As we are creating shared members at the same
level and want to use the level references build method, we have to make
sure that we have a reference to the primary level (for example, 100) and the
new product group (for example, Diet) specified in each input record (see
Figure 35).
In the DIET_PRODUCTS Business View we have a column named
PROD_GRP_ATTRIB. Attributes are used to assign a specific behavior to a
member. In our example this column contains the tilde character (~). This
attribute tells Essbase to exclude this level from consolidation when it builds
the outline. (For a list of all member codes, see Chapter 13 of the Essbase
Database Administrator's Guide, SC26-9238.)
Figure 34. SQL Definition for Shared Diet Members
We also have to add the PROD_GRP_ATTRIB field as an attribute for the
level 1 member, because we do not want the Diet products aggregated twice
on the Product level. We only want the Diet products to be summed up on
their real member level, not on the shared member level.
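The reason for the tilde can be seen in a small model of the outline. In this sketch the Diet group contains shared members that point at members stored under groups 100 and 200, and the Diet group itself carries the ~ attribute. The member values, the product codes under group 200, and the total function are hypothetical.

```python
# Hypothetical leaf values (for example, units sold).
values = {"100-10": 50, "100-20": 30, "200-10": 40, "200-20": 20}

# parent -> list of (child, consolidation operator). The children of Diet
# are shared members: they reuse the values stored under groups 100 and 200.
outline = {
    "Product": [("100", "+"), ("200", "+"), ("Diet", "~")],
    "100": [("100-10", "+"), ("100-20", "+")],
    "200": [("200-10", "+"), ("200-20", "+")],
    "Diet": [("100-20", "+"), ("200-20", "+")],
}


def total(member):
    """Consolidate a member from its children, honoring +, - and ~."""
    if member not in outline:
        return values[member]
    result = 0
    for child, op in outline[member]:
        if op == "+":
            result += total(child)
        elif op == "-":
            result -= total(child)
        # '~': the child is left out of this parent's total
    return result


diet_total = total("Diet")
product_total = total("Product")
print(diet_total, product_total)
```

The Diet group still has its own total, but the Product total counts each product only once. If Diet were consolidated with +, the Diet products would be added to Product a second time.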
Figure 35. Rules File Editor for Diet Product Rules
After you have run the SQL query and defined the Dimension Building
Attributes, the Rules File Editor should look like that shown in Figure 35, and
the resulting outline should be the same as the outline in Figure 36.
Figure 36. Outline with Inserted Diet Products
4.3.3 Building the Market Dimension
In the previous section we saw how to expand a dimension. Now we are
going to insert the Market dimension from scratch. There is no reference to
this dimension in the outline, so we first have to define our new base
dimension.
To create the Market dimension we have to follow nearly the same steps we
used to create the Load Rules for the Product dimension:
1. Open the Essbase Application Manager
2. Connect to the server
3. Select the TBC Application and the Simple Database
4. Click the Rules File Icon and click New. The Data Prep Editor opens.
5. From the Essbase Application Manager, select File =>Open SQL...
6. In the Select Server and Database window, use the selected parameters
by clicking the OK button. This brings up the Define SQL window. The
data for the Market dimension is stored in the IWH.REGIONS Business
View. The Define SQL window should look like Figure 37 to extract the
data for the Load Rules.
Figure 37. Define SQL Window for the Market Dimension
7. To define the new Market dimension, click the Dimension Build Settings
icon and select the Dimension Definition tab (see Figure 38). The dialog
box that appears is used to define new dimensions to be added to the
outline or to change existing dimensions during a dynamic dimension
build. Under Dimensions From, specify the Rules File option. In the Name
text box enter the name of the dimension you want to create.
Figure 38. Dimension Definition
8. Click the Add button to add this dimension to the list of dimensions. This
new dimension name is also present in the outline pull down menus.
9. The process of defining the remaining levels and aliases is the same as
the one we used for the Product dimension. Therefore we have to define our
SQL statement to load the data into the Data Prep Editor. Figure 39 shows the
result. We also have to define the levels according to our retrieved data
and associate the outline file.
Figure 39. Data Prep Editor for the Market Dimension Rules
After you run the Market Load Rules for dimension building, the new outline
for the Market dimension should look like the outline in Figure 40.
Figure 40. Outline for the Market Dimension of the TBC Sales Model
4.3.4 Building the Time Dimension (Using Parent/Child References)
A different method of building dimensions is the parent/child method. In the
data source there must be a parent/child relationship assigned for each
member. The parent/child data sources must contain at least two columns,
the parent column, and the child column. There can be additional columns to
describe, for example, aliases or attributes, but there cannot be more than
one relationship in a single column.
Figure 41. File Structure to Build the Parent/Child Outline
For the Year dimension of the TBC sales application we created a file called
TIMEDIM.TXT containing the parent/child structure shown in Figure 41. If the
file is placed on the client, it can be found and opened by clicking the File
System button that appears when you select the client.
To define the Data Load Rules:
1. In the server window click the Data Load Rules icon and click the New
button.
2. To load the associated data for the outline into the rules editor, select
File=>Open Data File in the Essbase Application Manager window and
select the appropriate data file (see Figure 41).
3. Select the Set Dimensions icon (see Figure 29 on page 60) and click the
Dimension Build Settings tab in the Dimension Build Settings window
(see Figure 42). The Dimension list box lists the dimension to which to
apply the option. If the Dimension list box is empty, click the Outline
button if the Year dimension is already in the outline or define a new
dimension as described in the previous section.
Figure 42. Dimension Build Setting for the Year Dimension
4. This time, we specify the Use Parent/Child References option in the
Build Method option box.
5. Select Define Attributes (see Figure 29 on page 60) =>Field Attributes
=> Dimension Building Attributes. The Field Number label displays the
number of the currently selected field.
6. In the Field Definition group box, which sets the field type, number, and
dimension for the currently selected field, select Parent for Field 1, which
indicates that the field contains names of a parent reference.
7. For Field 2 select Child, which indicates that the field contains names of a
child reference.
8. Select the Set Data File Attributes dialog box (see Figure 29 on page
60). The File Delimiter has to be set to semicolon, to match the delimiter
used in our source file (see Figure 43). Click OK.
Figure 43. Data File Attributes
9. Verify the rules definition by clicking the Verify icon (see Figure 29 on
page 60). If there are no errors, you can save the rules file.
10. To load the Year dimension into the outline, you can use the same
procedure you used to load the other dimensions, except that you must
now specify Data Files (not SQL as you specified for the preceding
dimension loads). Refer to Figure 32 on page 63.
4.3.5 Copying Dimensions, Members, and Outlines
In some cases it is useful to copy a complete outline from one database to the
other. In other cases only the definition of a single dimension is needed in an
already existing outline. DB2 OLAP Server supports both possibilities.
4.3.5.1 Copying an Outline
To copy an entire outline, open it with File => Open and then use Save
as... to save it under a different name.
4.3.5.2 Copying a Dimension or Member
To copy a single dimension or a member from one application or database to
another, follow these steps:
1. Open the outline from which to copy the dimension or member.
2. Select the dimension or member that should be copied.
3. In the Essbase Application Manager window click Edit => Copy to copy
the selected object to the clipboard.
4. Open the application or database outline to which to copy the dimension
or member.
5. Select the dimension or member after which the new object should be
placed.
6. In the Essbase Application Manager window click either Paste Sibling to
paste the clipboard contents as a sibling of the already selected member
or Paste Child to paste the clipboard contents as a child of the already
selected member.
With this method it is easy to quickly tailor outlines for new applications.
4.4 Loading the Data
In the previous section, we discussed how to dynamically build dimensions
for the DB2 OLAP or Hyperion Essbase cube. There is a similar process for
loading the data (that is, the facts) into the cube. Typically, Load Rules
are used, where each of the columns in the data input file is described. Note
that it is also possible to do free-form loading of data without Load Rules, but
this approach is rarely used, as it is complex to create manually. It is typically
used to move data between cubes, as a DB2 OLAP Server export creates the
exported file as a free format file.
Data Load Rules are, in essence, a set of operations that DB2 OLAP Server
performs on the data when it loads it into a DB2 OLAP Server database. The
operations are stored in the Load Rules that tell DB2 OLAP Server how to
load the SQL data and map it to the database outline. Loading data from an
SQL table requires the use of Load Rules.
At this stage of the process we have the cube fully defined with all the
dimensions, members, and member attributes. Follow these steps to load
data and calculate the cube:
1. Open the Data Load Rules.
2. Define the SQL input file (or flat file input).
3. Map the dimensions to the input file columns.
4. Map the Measures dimension to the input file columns.
5. Set the data file attributes.
6. Associate the outline with the Load Rules.
7. Save the Load Rules.
8. Clear the data in the cube (if necessary).
9. Load the data into the cube.
4.4.1 Loading Data from a Table
Select Data Load Rules File => New to open a new Load Rule. Select File
=> Open SQL... (as shown in Figure 44). Note that, just like in any other
development environment, we recommend using a small input set to initially
load the cube, until the cube design has been validated.
Figure 44. Define the SQL Input Table
Next we need to select the required database and the required SQL
statement that provides us with the input data to be loaded. In the Define SQL
window (Figure 45) select Data Source TBC_TGT, and enter:
SELECT ORDER_MON_YY, PROD_CLASS_CODE, CITY, QUANTITY, SALES, COGS
FROM IWH.LAUNCH_TABLE_FOR_S
Select the OK/Retrieve icon, and then select OK when the SQL Connect
dialog box appears.
Figure 45. Connect to the Database
The data that is loaded into the cube is at the lowest level in each dimension.
In our simple cube, the dimensions are Product, Market, Year, and Measures.
To load data, we therefore need input data of product class, city, and month,
along with the numeric values for the Measures members of quantity, sales,
and COGS.
With the data at this lowest level, DB2 OLAP Server can then aggregate the
higher levels in each dimension hierarchy during the calculation step of the
process of building the cube.
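The idea of aggregating lowest-level data up a dimension hierarchy can be sketched in a few lines of Python. This is only an illustration of simple additive consolidation, not a description of how DB2 OLAP Server calculates internally; the hierarchy slice, the sales values, and the function name roll_up are hypothetical.

```python
from collections import defaultdict

# Hypothetical slice of a Year hierarchy, stored as child -> parent.
PARENT = {"Jan": "Qtr1", "Feb": "Qtr1", "Mar": "Qtr1", "Qtr1": "Year"}


def roll_up(leaf_values, parent_of):
    """Add each lowest-level value to the member itself and to all ancestors."""
    totals = defaultdict(float)
    for member, value in leaf_values.items():
        node = member
        while node is not None:
            totals[node] += value
            node = parent_of.get(node)  # None once the top member is passed
    return dict(totals)


sales = {"Jan": 100.0, "Feb": 120.0, "Mar": 80.0}  # hypothetical input data
totals = roll_up(sales, PARENT)
print(totals["Qtr1"], totals["Year"])
```

Only the month values are supplied as input; the quarter and year totals are derived, which is why loading at the lowest level is sufficient for a hierarchy that consolidates additively.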
However, it is also possible to load data at higher levels of the hierarchy. To
prevent the aggregation of the missing lower level data from overwriting the
higher level data during calculation, set the AGGMISSG parameter to OFF in
the corresponding calculation script or in the Database Settings window.
As you can see in Figure 46, the SQL table column names are not names that
DB2 OLAP Server recognizes. We need to map the input columns
ORDER_MON_YY, PROD_CLASS_CODE, and CITY to the dimensions Year,
Product, and Market by naming the fields Year, Product, and Market.
Figure 46. Sample Input Data to Be Loaded
Select column ORDER_MON_YY and click the Define attributes for the
selected column icon. You need to associate this column with the Year
dimension. Select Outline => Year (Figure 47).
DB2 OLAP Server determines where to put the data within the
multidimensional structure by using name references. Therefore you have to
map the column names from the SQL table to the names used in the outline
definition.
Figure 47. Mapping of the Year Dimension
The Measures dimension, which is identified as an Accounts dimension type,
is slightly different, as it stores numeric values. As you can see by looking at
the one-row example in Figure 46, there is no way to identify what each
numeric value represents without explicitly naming the columns. Thus, the
three Measures data fields provided are mapped to Quantity Sold, Sales, and
COGS (see Figure 48).
Note that if the SQL table column names had been Year, Product, Market,
Quantity Sold, Sales, and COGS, the explicit mapping would not be required.
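The name-based mapping can be pictured as a simple renaming table. The following Python sketch is an illustration only: the column and outline names come from the text, while the sample row values and the function name rename_fields are hypothetical.

```python
# SQL column name -> name used in the outline (taken from the text above).
FIELD_MAP = {
    "ORDER_MON_YY": "Year",
    "PROD_CLASS_CODE": "Product",
    "CITY": "Market",
    "QUANTITY": "Quantity Sold",
    "SALES": "Sales",
    "COGS": "COGS",
}


def rename_fields(row, field_map=FIELD_MAP):
    """Return the row keyed by outline names instead of SQL column names."""
    unknown = [column for column in row if column not in field_map]
    if unknown:
        raise KeyError(f"no outline mapping for columns: {unknown}")
    return {field_map[column]: value for column, value in row.items()}


# Hypothetical input row; the real launch table contents are not shown here.
sql_row = {
    "ORDER_MON_YY": "JAN 97",
    "PROD_CLASS_CODE": "100",
    "CITY": "San Jose",
    "QUANTITY": 12,
    "SALES": 240.0,
    "COGS": 150.0,
}
print(rename_fields(sql_row))
```

If the SQL columns already carried the outline names, the mapping table would be an identity mapping, which corresponds to the case where no explicit mapping is required.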
Figure 48. Mapping of the Measures Dimensions
Numeric data for the Measures values is quite often missing or unknown.
In this case, DB2 OLAP Server requires a value of #MI or #MISSING in the
data field of the input file. An easy way to achieve this is to use the
Load Rules to replace any blank value in each Measures data field with #MI.
When the data is then viewed in the cube, the value is displayed as
#MISSING.
Note that in Figure 49, the Replace area actually has a blank in it. Once a
blank is entered, the Add button to add the replacement value will be
enabled.
Note: #MISSING is intended specifically for numeric data fields. If it is used for
dimension fields, DB2 OLAP Server will append #MI to the dimension
member and will not find it in the outline. For example, it will look for a
dimension member of 100#MI, rather than 100.
Replacement strings can be defined for data fields in data loads and in
dimension building inputs, with multiple replacement strings specified per
input field.
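The effect of the blank-to-#MI replacement, and the reason for restricting it to the Measures fields, can be sketched in Python. The sketch is an assumption-laden illustration and not the Load Rules implementation; the sample record and the function name fill_missing are hypothetical.

```python
# Names of the numeric data fields, as mapped earlier in this section.
MEASURE_FIELDS = ("Quantity Sold", "Sales", "COGS")


def fill_missing(record, measure_fields=MEASURE_FIELDS, marker="#MI"):
    """Replace blank values with the marker, but only in numeric data fields."""
    cleaned = dict(record)
    for field in measure_fields:
        value = cleaned.get(field)
        if value is None or str(value).strip() == "":
            cleaned[field] = marker
    return cleaned


# Hypothetical input record with a blank Sales value.
record = {"Year": "JAN 97", "Product": "100", "Market": "San Jose",
          "Quantity Sold": "12", "Sales": " ", "COGS": "150"}
print(fill_missing(record))
```

The dimension fields (Year, Product, Market) are deliberately left untouched, so a member name such as 100 is never altered by the replacement.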
As you can see from the examples above, Data Load Rules can be used to
carry out a lot of fairly complex data transformation and cleansing operations.
However, we suggest using the functions of the population subsystem (that
is, the Visual Warehouse Business Views) to get the data right before it is
transferred into the OLAP environment.
Figure 49. Replacing Missing Data Values with #MI
In addition to allowing you to replace data input values with another value,
DB2 OLAP Server allows you to specify record selection and rejection in the
Data Load Rules. Select Record => Reject to reject specific records.
Note that you can also use the Boolean operators AND and OR in the record
rejection process. Thus, for input data from an SQL source, you can specify
record selection and rejection in the Load Rules as well as normal SQL WHERE
clauses. Note, however, that for SQL sources selection or rejection is
typically performed in the SQL definition. The selection or rejection
process in the Load Rules is more useful for flat file input.
In the example in Figure 50, DB2 OLAP Server will not load records in the
input data file where Product includes Caffeine Free or where Product equals
100-10.
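The rejection criteria of this example, combined with OR, can be expressed as a small Python predicate. This is only a sketch of the logic; the sample records and the function name is_rejected are hypothetical.

```python
def is_rejected(record):
    """Reject when Product includes 'Caffeine Free' OR Product equals '100-10'."""
    product = record["Product"]
    return "Caffeine Free" in product or product == "100-10"


# Hypothetical input records.
records = [
    {"Product": "100-10", "Sales": "240"},
    {"Product": "Caffeine Free Cola", "Sales": "130"},
    {"Product": "100-20", "Sales": "310"},
]
loaded = [record for record in records if not is_rejected(record)]
print([record["Product"] for record in loaded])
```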
Figure 50. Rejecting Records in Data Load Rules
We have now defined our data input column definitions.
Note that in Figure 51, where the dimension name includes a space, double
quotes are required around it for the mapping to be validated, for example,
"Quantity Sold".
Figure 51. Associating an Outline with the Load Rules
The next step is to save the Load Rules definition, just as you did in the build
dimensions process. Select the TBC sales outline and then File => Save
As... and save the Load Rules to the server as LOADDATA (see Figure 52).
Figure 52. Save the Load Rules to the Server
We now have Load Rules to load our input data. If existing data in the cube
has to be cleared before loading, select Database => Clear Data => All to
remove all data in the cube, and leave the existing outline in place (see
Figure 53).
Figure 53. Clearing Data in the Cube
Now we load the data into the cube (Figure 54) by selecting Database =>
Load Data.
Figure 54. Loading Data into the Cube
Check the SQL and Load Data boxes (as shown in Figure 55). Select Use
Rules => Find. Select the LOADDATA Rules we created earlier for the Load
Rules to be used. Check that the error file is correct, and then select OK.
Figure 55. SQL Data Load
You have successfully loaded your first cube.
The "SQL Dataload successful" message in Figure 55 does not necessarily
mean that the data has been successfully loaded with no errors. If you have
errors in the dataload, DB2 OLAP Server writes the errors to the
DATALOAD.ERR error output file. If you want to stop loading data as soon as
you get an error, select Abort on Error during dataload. Otherwise, be sure
to check the error file after each load. If you do not select Abort on Error
during dataload, the load continues, and any records with errors are written
to the error file.
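The difference between the two behaviors can be sketched in Python. The validation rule used here (Sales must be numeric) is a hypothetical stand-in for whatever checks the server applies, and the names load_records and abort_on_error are illustrative, not product options.

```python
def load_records(records, abort_on_error=False):
    """Return (loaded, errors); stop at the first error if abort_on_error is set."""
    loaded, errors = [], []
    for number, record in enumerate(records, start=1):
        try:
            sales = float(record["Sales"])  # hypothetical validation rule
        except (KeyError, ValueError) as exc:
            errors.append((number, record, str(exc)))
            if abort_on_error:
                break  # nothing after the failing record is processed
            continue  # keep loading; the bad record only goes to the error list
        loaded.append({**record, "Sales": sales})
    return loaded, errors


# Hypothetical input: the second record is invalid.
records = [{"Sales": "10"}, {"Sales": "abc"}, {"Sales": "5"}]
print(load_records(records))
print(load_records(records, abort_on_error=True))
```

In the first call two records are loaded and one is reported as an error, which shows why a success message alone does not prove that every record arrived in the cube.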
4.4.2 Loading Data from a Flat File
The process of loading data from a flat file is similar to loading data from an
SQL table. However, there are a few important differences to note.
We use a comma delimited flat file in this example to illustrate the process.
Select Database Simple. Select the Load rules icon, then New to create
new Load Rules for flat file loading. Open the flat file by selecting File =>
Open Data File... and selecting the flat file.
We then need to change the data file attributes to define the file as comma
delimited. Select the Set the data file attributes icon. Select Comma for a
comma delimited file (see Figure 56).
Figure 56. Defining the Input File As Comma Delimited
Flat files have a number of options in terms of whether load definitions are
stored in the flat file itself or as part of the Load Rules. For example, within
our flat file, we created record 1 of the file as the data load field names of
Year, Product, Market, Quantity Sold, Sales, and COGS (Figure 57).
Figure 57. Flat File Input Showing First Record as Mapping Names
Within the Load Rules, we then specify record number 1 of the input file as
the header record. Select the Set the data file attributes icon, then the
Header Records tab. Set Record containing data load field names to 1 (see
Figure 58).
Note that, similarly, dimension building field names and header names can
also be predefined as part of the input file rather than explicitly through the
Data Load Rules.
Figure 58. Define the Data Load Field Names Record
As you can see in Figure 59, record 1 of the input data is now interpreted as
the column mapping names to be used in loading the data.
Figure 59. Comma Delimited Flat File Data Load Rules
As part of the process of loading the flat files, multiple flat files can be
identified to be loaded. Note that each file should have the same format as
defined in the Load Rules. Thus, if you have used the first record of the input
data as the data load field names, you need to do this for each input file. You
can select multiple files by using the Shift key.
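The requirement that every input file carries the same header record can be illustrated with a Python sketch that reads several comma delimited inputs. The field names are those used in this section; the file contents, the region names, and the function name read_flat_file are hypothetical.

```python
import csv
import io

EXPECTED_FIELDS = ["Year", "Product", "Market", "Quantity Sold", "Sales", "COGS"]


def read_flat_file(text):
    """Treat record 1 as the data load field names and map the other records."""
    reader = csv.reader(io.StringIO(text))
    header = [name.strip() for name in next(reader)]
    if header != EXPECTED_FIELDS:
        raise ValueError(f"unexpected header record: {header}")
    return [dict(zip(header, (value.strip() for value in row)))
            for row in reader if row]


# Two hypothetical regional input files with the same format.
east = "Year,Product,Market,Quantity Sold,Sales,COGS\nJAN 97,100,Boston,5,50,30\n"
west = "Year,Product,Market,Quantity Sold,Sales,COGS\nJAN 97,100,San Jose,7,70,42\n"

rows = []
for name, text in (("east", east), ("west", west)):
    rows.extend(read_flat_file(text))
print(len(rows), rows[1]["Market"])
```

A file whose first record does not match the expected field names is refused in this sketch, which is the situation to avoid when several files share one set of Load Rules.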
Figure 60. Data Load Using Flat Files
With flat files, the Interactive option is available (see Figure 60). If an error
occurs during data loading, you are asked whether you want to stop
loading that input file and continue loading the next input file. At the
completion of the load process, the success or failure of each input file
is reported. Using multiple input files may be useful when you
load regional data and each region’s data is provided in a separate input file.
4.5 Calculating the Data
So far we have loaded data only into the lowest level of the cube. Our cube
contains two types of data: values that are entered, called input data, and
values that are calculated from this input data.
Calculations can be defined both at the outline level and through a
calculation script. DB2 OLAP Server calculates the database based on the
relationships between members in the database outline (for example, the
hierarchy structures) and any formulas that have been attached to members
in the outline. A calculation script contains a series of calculation commands,
equations, and formulas.
For each database, we can build and use different calculation scripts, just as
we can build and use different Load Rules. There is also a default calculation
script that calculates the entire cube, and it is this that we will use.
Although we may have already associated calculations with specific members
in the outline, the calculation script enables more control and complexity,
such as using logic in calculations, calculating just part of the cube, and
calculating member formulas that are different from those in the outline. For
details see Chapter 6, “A Closer Look at Calculating the OLAP Database” on
page 135.
In the Essbase Application Manager window, select Database Simple =>
Database => Calculate => Default => OK (Figure 61).
Figure 61. Identifying the Calculation Script
As the calculation script runs, you see a progress indicator displayed on the
screen. At the completion of the calculation, the cube is available for analysis.
In Figure 61, the Database State indicates that data values have been
modified since the last calculation. If we did another calculation immediately
after the first calculation completed, the database state would indicate that no
data values have been modified since the last calculation. Be careful in
interpreting this message. You may think it means that nothing in the cube
requires calculation, but this is not the case. As we discuss in Chapter 6, “A
Closer Look at Calculating the OLAP Database” on page 135, a calculation
script may only calculate a portion of the cube. The message provided above
actually means that no values have changed in the database since the last
calculation, not that no values have changed since the last full cube
calculation.
We have now completed the process of building the first cube. To view the
cube, start a spreadsheet (such as Excel) with the DB2 OLAP Server Add-in,
and drill down into the cube as described in Chapter 12, “OLAP Analysis
Using the Spreadsheet Add-in” on page 269.
Chapter 5. Populating the Multidimensional Model
Now that we have implemented a multidimensional model by using DB2
OLAP Server, we want to take a closer look at populating it.
Note
In this book we focus on Visual Warehouse, DB2 OLAP Server, and
Hyperion Essbase integration. We do not cover the numerous features,
functions, and data access capabilities for heterogeneous relational and
nonrelational data sources, such as Oracle, Microsoft SQL Server,
Informix, IMS, and VSAM, that are available with IBM Visual Warehouse.
We focus on how Visual Warehouse functions, such as Business Views
and Visual Warehouse Programs, can be used to support the creation,
integration, and management of multidimensional structures within an
end-to-end data mart environment.
When we recall the overall architecture of a Business Intelligence solution for
multidimensional analysis discussed in 3.3, “The End-to-End Architecture of a
Business Intelligence Solution” on page 27, we can identify the following
major interfaces between the functions of Visual Warehouse and DB2 OLAP
Server:
• The launch tables that define the dimensions and that are used to load the
facts of the multidimensional model
• The programs used to load and calculate the multidimensional model
according to the specifications in the corresponding Data Load Rules and
calculation scripts
• The metadata that describes the structure and business meaning of the
items found in the multidimensional model
5.1 Preparing Launch Tables Using Visual Warehouse
Building a data warehouse or data mart containing the launch tables used to
dynamically define and load the dimensions and to load the facts of the
multidimensional model consists of the following major steps:
1. Initializing Visual Warehouse
2. Defining the data sources
3. Defining a target data warehouse
4. Defining Business Views
5. Promoting Business Views
6. Testing Business Views
7. Scheduling the Business Views
8. Promoting the Business Views to a production environment
5.1.1 Initializing Visual Warehouse
During the initialization process, Visual Warehouse creates control tables in
the control database. This step is optional if a control database
with control tables already exists and can be used for the warehouse or data
mart to be built.
Click Start=>Programs=>Visual Warehouse=>Visual Warehouse=>Visual
Warehouse Initialization to start the initialization. Choose Server for
Initialization Type. Enter the Control Database name, Userid, and Password
as shown in Figure 62.
Figure 62. Visual Warehouse Initialization
The prerequisites for the initialization process are:
• The control database must be registered as an ODBC system data source.
• The user ID entered in the Initialization panel must have CONNECT,
BINDADD, and CREATE TABLE privileges on the control database.
5.1.2 Defining the Data Sources
Click Start=>Programs=>Visual Warehouse=>Visual Warehouse=>Visual
Warehouse Desktop to start the Visual Warehouse Desktop. Enter the
Server Hostname, Control Database name, User ID, and Password to log on
to Visual Warehouse as shown in Figure 63. Note that the User ID and
Password are case sensitive and that the User ID you enter should have
resource definition privilege in Visual Warehouse.
Figure 63. Logging on to the Visual Warehouse Desktop
Click the Sources tab on the Visual Warehouse Desktop. Click File=>New to
define a new data source. Choose the appropriate source type from the
Resource Type window.
For the TBC scenario discussed in 4.1, “Introduction to the TBC Sales Model”
on page 41, we will use flat file LAN data sources. So, we will define a new
data source called TBC Source data and define all the source flat files under
it. On the General tab of the Flat File LAN window, enter the name of the data
source, which is TBC Source data in our example, as shown in Figure 64. Note
that the business name entered here should be unique within the control
database which is in use. On the Agent tab, choose an agent site. In our
example, we are using the Default Visual Warehouse agent site, which is the
machine where Visual Warehouse is installed.
The flat file source could be located on a local disk drive, on a LAN drive, or
on a remote machine from which the file could be read through File Transfer
Protocol (FTP) copy. Note that the Visual Warehouse agent needs access to
the drive as a system process. On the Connection tab, click the Local file
radio button. If the data source requires an FTP copy, the Host name, user ID
and password should be entered under FTP copy. Visual Warehouse also
allows you to specify a preaccess command or a postaccess command to be
processed on the file. For instance, the data file could be renamed or deleted
after Visual Warehouse extraction. We will have all the source flat files for
TBC on a local hard disk in a directory called C:\TBC_FILES on the Visual
Warehouse Server. We now go through a step-by-step procedure to add the
Order file C:\TBC_FILES\OUTOR.TXT, to the data source, TBC Source data.
Refer to Figure 64.
Figure 64. Defining a Flat File Data Source
Click the New button on the Files tab to add a new file to the flat file data
source. Enter a fully qualified file name and a brief description for the file on
the General tab as shown in Figure 65. Note that even if we are defining the
source from a remote Visual Warehouse administrative client session, the
pathname of the file should be valid for the Visual Warehouse server.
Figure 65. Defining a New File under a Flat File Data Source
Now, we need to choose an appropriate file type for OUTOR.TXT on the
Parameters tab of the File: New window. The following data file types are
supported in Visual Warehouse:
• Comma - where the fields are delimited by a comma (,)
• Fixed - where the fields have a fixed length and a known offset
• Tab - where the fields are separated by tabs
• Character - where the fields are separated by a certain special character
Figure 66 shows the layout of OUTOR.TXT. As shown, three fields in this file
are separated by a pipe character (|).
Figure 66. Sample Data from OUTOR.TXT
So, the appropriate file type for file OUTOR.TXT would be Character, where
the fields are delimited by |. Choose Character from the File Type list box
and enter | as the delimiter as shown in Figure 67.
If the first row of the data file contains column names, check the First row
contains column names option. For FTP copy, the file transfer mode should
be specified on this panel.
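To make the Character file type concrete, the following Python sketch reads pipe-delimited records with three fields. It only mimics the file definition; the field names are derived from the column descriptions for OUTOR.TXT, and the sample records and the function name parse_orders are hypothetical.

```python
import csv
import io

# Field names derived from the OUTOR.TXT column descriptions.
FIELDS = ("customer_code", "order_date", "order_number")


def parse_orders(text, delimiter="|", first_row_has_names=False):
    """Split each record on the delimiter and label the three fields."""
    rows = [row for row in csv.reader(io.StringIO(text), delimiter=delimiter) if row]
    if first_row_has_names:
        rows = rows[1:]  # skip the record that holds the column names
    orders = []
    for number, row in enumerate(rows, start=1):
        if len(row) != len(FIELDS):
            raise ValueError(f"record {number}: expected {len(FIELDS)} fields")
        orders.append(dict(zip(FIELDS, (value.strip() for value in row))))
    return orders


# Hypothetical records; the real file contents are not reproduced here.
sample = "C001|01/15/1997|500001\nC002|02/03/1997|500002\n"
orders = parse_orders(sample)
print(orders[0])
```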
Figure 67. Defining the File Type
To define a new field in the file, click the New button on the Fields tab. Enter
the field name, description, native data type, and length of the field for each
of the fields that need to be defined as shown in Figure 68. In Visual
Warehouse, only VARCHAR and NUMERIC data types are accepted.
The NUMERIC data type supports both integer and decimal values.
Note
For decimal values, please ensure that the length includes enough space
for the decimal point and the sign.
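As a small worked example of this note, the following Python sketch computes the field length needed for a decimal value. The function name required_length and its parameters are hypothetical and only illustrate the arithmetic.

```python
def required_length(integer_digits, fraction_digits, signed=True):
    """Digits plus one position for the decimal point and one for the sign."""
    length = integer_digits + fraction_digits
    if fraction_digits > 0:
        length += 1  # the decimal point
    if signed:
        length += 1  # a leading sign
    return length


# A value such as -1234.56 has 4 integer digits and 2 fraction digits.
print(required_length(4, 2), len("-1234.56"))
```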
Date values have to be read in as VARCHAR type and can then be converted
to a date value on their way to the target warehouse. For the Fixed file type,
the offset of the field in the file must also be entered.
Figure 68. Defining a Data Field
At the end of this process of defining fields for OUTOR.TXT, the complete
table definition for the data file is created in Visual Warehouse.
Similarly, we create file definitions for all the source files, as shown in
Table 1.
Table 1. Source File Characteristics for TBC Source Files
Name           File Type                    Columns
REGIONS.TXT    Tab Type                     Region, State, City, Zip Code
PRODGRP.TXT    Tab Type                     Product group code, Product group description
OUTUPC.TXT     Tab Type                     Product class code, Product class description
PRODUCT.TXT    Tab Type                     Product size code, Product size description
PRODCOST.PRN   Comma Type                   Product size code, Unit cost
OUTCUSTI.TXT   Tab Type                     Institutional customer code, City, State, and Zip code
OUTCUSTW.TXT   Tab Type                     Wholesale customer code, City, State, and Zip code
OUTCUSTR.TXT   Tab Type                     Retail customer code, City, State, and Zip code
OUTOR.TXT      Character Type, Delimiter: | Customer code, Order date, and Order number
OUTIT.TXT      Character Type, Delimiter: | Order number, Product size code, Dollar amount of purchase
DATE.TXT       Character Type, Delimiter: / Month, Day, and Year for all dates in calendar year 1997
Note that the ODBC driver, IWH_TEXT, should be set up to read the text files
as shown in Figure 69.
Figure 69. Setting up ODBC to Read Text Files
We now create a target data warehouse called TBC, create Business Views
to extract data from the data source TBC Source data, and populate the TBC
target data warehouse.
5.1.3 Defining the TBC Target Data Warehouse
Click on the Warehouses tab of the Visual Warehouse Desktop and select
File=>New to create a new target data warehouse. On the General tab, enter
the name for the target data warehouse, which is TBC. Next, choose an agent
on the Agent Sites tab. In our example, we use the Default Visual
Warehouse agent site for the target data warehouse as well.
On the Database tab, enter the name of the target warehouse database
(TBC_TGT), the database type, and a User ID and Password to access the
database as shown in Figure 70. The database specified here should have
been configured as an ODBC system data source name (DSN). The User ID
specified should have CONNECT, BINDADD, and CREATE TABLE privileges
on the target database.
Figure 70. Defining a Target Warehouse Database
Now that you have created a target warehouse database, TBC_TGT,
double-click warehouse TBC, and Visual Warehouse takes you to the Business
Views window.
5.1.4 Defining the Business Views for the Launch Tables
A Business View is a Visual Warehouse object that contains the definitions of
the transformations to be applied on the data as well as the scheduling
information specifying when the defined transformations have to be executed.
The output of a Business View may be a table in the data warehouse or a flat
file produced by a Visual Warehouse program. A Business View may also
produce no new object at all; for example, a Business View definition
could consist of a program that invokes the DB2 RUNSTATS utility on an
existing table. Note that the term Business View can mean both the
transformation process and the output of the transformation process.
It is important to know that Visual Warehouse maintains three distinct stages
of Business View development, namely, development, test, and production.
When the Business Views are created, they are always in the development
stage. Business Views can be promoted from development to test or from test
to production. They can also be demoted from production to test or from test
to development. No modifications are allowed when a Business View is in the
production stage. Minimal changes are allowed when it is in the test stage.
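The stage transitions described above form a small state machine, sketched here in Python. The sketch is not Visual Warehouse code; the function names are hypothetical, and the label "full" for the development stage is an assumption, since the text only states the restrictions for test and production.

```python
STAGES = ("development", "test", "production")

# Degree of modification allowed per stage. "full" for development is an
# assumption; the text states only the test and production restrictions.
MODIFICATIONS = {"development": "full", "test": "minimal", "production": "none"}


def promote(stage):
    """Move one stage forward: development -> test -> production."""
    index = STAGES.index(stage)
    if index == len(STAGES) - 1:
        raise ValueError("a Business View in production cannot be promoted")
    return STAGES[index + 1]


def demote(stage):
    """Move one stage back: production -> test -> development."""
    index = STAGES.index(stage)
    if index == 0:
        raise ValueError("a Business View in development cannot be demoted")
    return STAGES[index - 1]


stage = "development"      # every new Business View starts here
stage = promote(stage)     # now in test
stage = promote(stage)     # now in production
print(stage, MODIFICATIONS[stage])
```

Transitions are always one step at a time in this sketch, so a Business View in production has to pass through test on its way back to development.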
We now go through a step-by-step procedure to create a Business View that
reads data from the flat file OUTOR.TXT and creates a target table in the
target warehouse database, TBC_TGT.
Click File=>New from the Visual Warehouse Desktop to create a new
Business View. Enter a business name for the Business View, which is
Customer Orders Current Month, and select a data source, which is TBC
Source data. Now all the available tables or files in that data source show up.
Select C:\TBC_FILES\OUTOR.TXT and click the Add button to add it to the
Business View definition as shown in Figure 71.
Figure 71. Creating a Business View
Note that you can only add one flat file as source for a Business View
because it is not possible to join the flat files. Each of the input files would be
extracted by different Business Views, and later these Business Views could
be joined to create derived Business Views. This constraint does not apply if
you are using relational tables as data sources.
When the Business View definition panel opens, click on the Add button,
choose OUTOR.TXT, select all the columns, and add them to the Business
View definition. Note that in our example all the source data files have
columns relevant only for the target warehouse to be built. If the file contains
columns that do not have any business relevance, those columns usually are
not added to the target warehouse.
You can create additional derived columns by using the Insert button, if
required. Visual Warehouse supports all DB2 data types. If the CHAR data
type is chosen, Visual Warehouse allows you to use a column length that is
different from the source.
You can also change the column definition by clicking the button next to the
column in the column mapping panel. For example, in the TBC
multidimensional model for sales analysis, discussed in 4.1, “Introduction to
the TBC Sales Model” on page 41, we are interested in analyzing only the
monthly data and not the daily data. The Order Date column in OUTOR.TXT
has a format of MM/DD/YYYY. The Year dimension in the TBC sales model
expects the date values to be in MON YY format. So, click the column
definition button for Order Date and redefine the column as shown in Figure
72. You can also change the name of the Order Date column to
ORDER_MON_YY to indicate that it no longer stores the day part of the date.
Visual Warehouse also allows you to define primary keys and foreign keys for
the target table.
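The date conversion described above (dropping the day part of an MM/DD/YYYY value) can be sketched in Python. In Visual Warehouse the conversion is defined in the column definition; this sketch only illustrates the intended result. It assumes that MON YY means an uppercase three-letter month abbreviation followed by a two-digit year, and the function name to_mon_yy is hypothetical.

```python
from datetime import datetime

# Explicit abbreviations avoid any dependence on the locale settings.
MONTHS = ("JAN", "FEB", "MAR", "APR", "MAY", "JUN",
          "JUL", "AUG", "SEP", "OCT", "NOV", "DEC")


def to_mon_yy(order_date):
    """Convert an MM/DD/YYYY string to MON YY, discarding the day."""
    parsed = datetime.strptime(order_date, "%m/%d/%Y")
    return f"{MONTHS[parsed.month - 1]} {parsed.year % 100:02d}"


print(to_mon_yy("03/15/1997"))
```

All order dates within the same month map to the same value, which is what allows the launch table to feed a Year dimension that holds months rather than days.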
Figure 72. Modifying the Column Definition
Now, on the Information tab (see Figure 73), we specify various parameters
for target table creation options. The target table is created when the
Business View is promoted from the development environment to the test
environment. Visual Warehouse automatically puts in the table name qualifier
and the target table name, which can be changed, if required.
Figure 73. Specifying Table Creation Options for a Business View
Visual Warehouse provides two Business View population types:
Full Refresh and Append. If Full Refresh is specified, the
existing data is cleared before every Business View execution. For the
Customer Orders Current Month Business View, we will choose the Full
Refresh option because the data is going to be refreshed every month.
If the Append option is specified, data is appended to the existing data every
time the Business View executes. With this option, Visual Warehouse also
provides automatic maintenance of editions of Business Views. In our TBC
sales example, we will create a Business View called History of Orders, which
holds cumulative information about all orders and all ordered items across
all months in a year. We will specify 12 as the number of editions for the
History of Orders Business View. Visual Warehouse automatically adds an
edition variable, VWEDITION, as the first column of the Business View when
editions are specified. When the Business View is executed, Visual
Warehouse assigns an edition ID, which is unique within the Visual
Warehouse control database. When the maximum number of editions
specified is reached, Visual Warehouse cycles through and starts from
edition number 1 again. Specifying 0 as the number of editions is the same as
specifying Full Refresh. Note that the edition number is not the same as the
edition ID. The edition ID is Visual Warehouse’s internal representation of an
edition number.
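The wrap-around of edition numbers can be sketched in a few lines of Python. This is only an illustration of the cycling rule described above, not Visual Warehouse's implementation: the function name next_edition and the run count are hypothetical, and the internal edition ID is not modeled.

```python
def next_edition(current: int, max_editions: int) -> int:
    """Return the edition number that follows `current`.

    Edition numbers run from 1 to max_editions and then wrap to 1,
    mirroring the cycling behavior described in the text.
    """
    if max_editions < 1:
        raise ValueError("max_editions must be at least 1")
    if not 1 <= current <= max_editions:
        raise ValueError("current must be between 1 and max_editions")
    return current % max_editions + 1


# Fourteen runs of a Business View defined with 12 editions
# (hypothetical run count): the 13th run reuses edition number 1.
edition = 1
history = [edition]
for _ in range(13):
    edition = next_edition(edition, 12)
    history.append(edition)
```

With 12 editions, the thirteenth execution lands on edition number 1 again, which is the point at which the oldest month is replaced.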
On the Information tab, you can also specify whether the target table is
Visual Warehouse created or user-defined. In the case of a Visual Warehouse
created target table, the table is automatically dropped when the Business
View is demoted from the test to the development environment, which is not
the case with the user-defined target tables. For all the Business Views in the
TBC sales example, we will use only Visual Warehouse created target tables.
You can tell Visual Warehouse to delete the target table after it is used once
by checking the Transient data option on this tab. You can also specify how
Visual Warehouse should handle certain errors while extracting data from the
source, such as No Rows Returned or SQL Warning by clicking the
appropriate radio buttons on this panel.
Set the table creation options as shown in Figure 73 on page 106. At the end
of this process, Visual Warehouse automatically generates the data definition
language (DDL). You can view it by clicking the Create DDL button.
We have now defined a Business View to extract data from OUTORDER.TXT.
Similarly, we will create other Business Views for the TBC example as shown
in Table 4 on page 113.
5.1.4.1 Defining Joins
Visual Warehouse allows you to define joins visually. In the TBC sales
example, we will define a composite Business View called All Products, which
combines information from three other Business Views, namely, Product
Groups, Product Classes, and Product Sizes.
Create a new Business View called All Products. Add the three source
Business Views. While you are adding the source columns, if two or more
columns have the same name, Visual Warehouse prompts for a unique name.
After adding all the columns, open the SQL tab to define the joins between
the sources. Visual Warehouse displays a box for each source, listing its
columns. Select PROD_CODE from the PROD_CLASSES table.
Select PROD_GRP_CODE from the PROD_GROUPS table, and click the Join
button. Similarly, join PROD_GRP_CODE and PROD_CODE from the
PROD_GROUPS and PROD_SIZES tables, respectively, as shown in Figure
74. Changes to the join options can be made by clicking the Options button.
The supported join types are Matching Values Join, Left Outer Join, Right
Outer Join, Full Outer Join, and Inner Join. For a detailed comparison of the
types of joins, see About Joins in the Visual Warehouse Online Help. You can
choose an appropriate join expression as well. The default is the equal sign
(=). At the bottom of the screen, Visual Warehouse displays the details of
the join that is marked by a dotted circle.
Visual Warehouse also has an autojoin feature that automatically joins the
fields with the same attributes from different tables. Star Join is also
supported in Visual Warehouse. However, this would work only if the source
tables have primary keys and foreign keys defined. Primary and foreign keys
can be defined in the Column Mapping tab. By default Visual Warehouse
chooses the table with the most foreign keys as the fact table. The choice for
the fact table can be changed, if required.
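The default fact table rule can be pictured with a small Python sketch. The table and key names are hypothetical, and how Visual Warehouse breaks a tie between tables with the same number of foreign keys is not stated in the text; this sketch simply keeps the first such table.

```python
def default_fact_table(foreign_keys: dict[str, list[str]]) -> str:
    """Pick the table with the most foreign keys as the fact table."""
    if not foreign_keys:
        raise ValueError("no tables supplied")
    # max() returns the first table with the highest count (tie rule assumed).
    return max(foreign_keys, key=lambda table: len(foreign_keys[table]))


# Hypothetical star schema: table name -> foreign key columns.
tables = {
    "SALES": ["PROD_KEY", "CUST_KEY", "DATE_KEY"],
    "PRODUCT": [],
    "CUSTOMER": ["REGION_KEY"],
}
fact = default_fact_table(tables)
```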
Figure 74. Defining Joins between the Source Tables
5.1.4.2 SQL Enhancements
Visual Warehouse automatically generates the SQL according to the column
definitions and the joins. Click the SQL button to display the SQL that Visual
Warehouse has generated. You can also change the SQL, if necessary.
For example, in the All Products Business View, we defined joins between
three Business View target tables, namely, IWH.PROD_GROUPS,
IWH.PROD_CLASSES, and IWH.PROD_SIZES. However, the product group
code, product class code, and product size code are not in the same format.
The product group code is in the format 100, 200, and so forth, whereas the
product class code is in the format 100-10, 100-20, and so forth, and the
product size code is in the format 100-10-01, 100-10-02, and so forth.
Therefore when you join PROD_GROUPS with PROD_CLASSES, you want
Visual Warehouse to compare only the first three characters. Similarly, when
you join PROD_CLASSES and PROD_SIZES, you should compare only the
first six characters.
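The prefix comparison can be tried out independently of Visual Warehouse. The following Python sketch uses an in-memory SQLite database as a stand-in for DB2, omits the IWH qualifier, and uses hypothetical column names (GROUP_CODE, CLASS_CODE, SIZE_CODE) and made-up sample codes in the formats described above. It is not the SQL shown in Figure 75.

```python
import sqlite3


def all_products(conn: sqlite3.Connection) -> list[tuple[str, str, str]]:
    """Join the three product tables by comparing code prefixes."""
    conn.executescript(
        """
        CREATE TABLE PROD_GROUPS  (GROUP_CODE TEXT);
        CREATE TABLE PROD_CLASSES (CLASS_CODE TEXT);
        CREATE TABLE PROD_SIZES   (SIZE_CODE  TEXT);
        INSERT INTO PROD_GROUPS  VALUES ('100'), ('200');
        INSERT INTO PROD_CLASSES VALUES ('100-10'), ('100-20'), ('200-10');
        INSERT INTO PROD_SIZES   VALUES ('100-10-01'), ('100-10-02'),
                                        ('200-10-01');
        """
    )
    query = """
        SELECT g.GROUP_CODE, c.CLASS_CODE, s.SIZE_CODE
        FROM PROD_GROUPS g
        JOIN PROD_CLASSES c
          ON SUBSTR(c.CLASS_CODE, 1, 3) = g.GROUP_CODE  -- first 3 characters
        JOIN PROD_SIZES s
          ON SUBSTR(s.SIZE_CODE, 1, 6) = c.CLASS_CODE   -- first 6 characters
        ORDER BY s.SIZE_CODE
    """
    return conn.execute(query).fetchall()


rows = all_products(sqlite3.connect(":memory:"))
```

Each returned row pairs a product size with its class and group. Class 100-20 does not appear because the sample data contains no sizes for it and the joins are inner joins.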
Click the SQL button to view or modify the SQL. Change the SQL as shown in
Figure 75. Visual Warehouse also provides various operators, functions, and
constants to be used in the SQL statements. Once you change the SQL, you
are responsible for keeping it up to date. For instance, if you change the
column names in the Business View after modifying the SQL, the changes are
not automatically reflected in the SQL.
You can, however, choose to go back to the original, Visual Warehouse
generated SQL by clicking the AutoGen button in the Modify SQL window.
To specify a WHERE clause that defines data filtering conditions, click the
Where button. To specify a GROUP BY clause that aggregates the data, click
the Group by button. Visual Warehouse provides the standard SQL aggregation
functions, such as AVG, COUNT, MAX, MIN, and SUM, which can be used in the
column definitions.
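As a rough illustration of filtering and aggregation, the Python sketch below runs a WHERE and GROUP BY query against an in-memory SQLite database standing in for DB2. The ORDERED_ITEMS table, its columns, and the sample rows are hypothetical. The '%-20' pattern is the one Table 4 gives for selecting Diet product classes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE ORDERED_ITEMS (CLASS_CODE TEXT, QUANTITY INTEGER, AMOUNT REAL);
    INSERT INTO ORDERED_ITEMS VALUES
        ('100-10', 5, 10.0),
        ('100-20', 2, 4.0),
        ('100-20', 3, 6.0),
        ('200-20', 1, 2.5);
    """
)

# Keep only class codes ending in -20, then total quantity and amount
# per class and count the contributing rows.
diet_totals = conn.execute(
    """
    SELECT CLASS_CODE, SUM(QUANTITY), SUM(AMOUNT), COUNT(*)
    FROM ORDERED_ITEMS
    WHERE CLASS_CODE LIKE '%-20'
    GROUP BY CLASS_CODE
    ORDER BY CLASS_CODE
    """
).fetchall()
```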
Figure 75. Modifying the Autogenerated SQL
5.2 Visual Warehouse Hints and Tips
Consider the following hints and tips when you work with Visual Warehouse
Business Views:
• The standard SQL functions shown in Visual Warehouse are specific to
DB2 UDB. Some of these functions may not work for some source
environments. For example, DB2 for OS/390 does not support some of the
functions that are supported in DB2 UDB.
• To quickly verify a Business View’s output, copy the SQL that is
autogenerated by Visual Warehouse and paste it into the DB2 Command
Center or a DB2 Command Window to execute. This may help to avoid
several iterations of promotions and demotions of the Business View. Note
that a Business View is executable only when it is in the test or production
environment and that major changes to the Business View are not allowed
when it is in test status.
• To copy a Business View to a new Business View use the Edit=>Copy...
option of Visual Warehouse. Business Views can be copied to the same
data warehouse or a different data warehouse. If the data source is
different from the original Business View, you can change the data source,
using the Edit=>Migrate option. However, Visual Warehouse does not
retain the column definitions once the source is changed.
• When a primary key is defined for a Business View, it is defined in the
target database. Foreign keys, however, are not defined in the target
database, because referential integrity (RI) constraints in the database
would force Visual Warehouse to execute the Business Views in a particular
order. Foreign keys are primarily used to define Star Joins in Visual
Warehouse.
• When a Business View is marked as transient, it does not show up in the
Operations Work In Progress window. Visual Warehouse only shows the
Business View that uses this transient data, but it executes the transient
Business View before it executes the nontransient Business View.
• You can also create Business Views from the Subjects tab of the Visual
Warehouse desktop by creating a new subject and defining Business
Views under that subject. However, there are some differences between
creating a Business View under a data warehouse and creating a
Business View under a subject. Table 2 and Table 3 depict the possibilities
that could exist with these two types of Business Views.
Table 2. Valid Combinations for Warehouse Business Views
Name   Source   Target   Visual Warehouse Program   SQL
X X X X
X X X X
X X X
Table 3. Valid Combinations for Subject Business Views
Name   Source   Target   Visual Warehouse Program   SQL
X X X X
X X X
X X
X X X
• To quickly find out the dependencies between the Business Views, click
View=>Tree=>Contribute To That Business View from the Business
Views tab as shown in Figure 76. This view clearly shows the source
Business Views that contribute to a particular Business View. You can use
this view to trace the source of a particular piece of information in the
target warehouse and to view the transformations applied to the information
before it reaches the target warehouse.
Figure 76. Viewing Dependencies among the Business Views
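The same kind of dependency question can be answered outside the tool. The Python sketch below transcribes the Source entries of Table 4 into a dictionary and walks them to list every Business View that contributes, directly or indirectly, to a given one. The names SOURCES and contributors are hypothetical, and the sketch does not read anything from the Visual Warehouse control database.

```python
# Business View number -> numbers of its source Business Views,
# transcribed from Table 4 (views fed only by files have no entry).
SOURCES = {
    5: [2, 3, 4],
    9: [6, 7, 8],
    10: [6, 7, 8],
    14: [12, 13],
    15: [11, 13, 14],
    16: [1, 5, 9, 15],
    18: [17],
    19: [1, 5, 15, 18],
}


def contributors(view: int) -> list[int]:
    """Return every Business View that feeds `view`, directly or indirectly."""
    seen: set[int] = set()
    stack = list(SOURCES.get(view, []))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(SOURCES.get(current, []))
    return sorted(seen)
```

For example, contributors(15) lists Business Views 11 through 14, which are the views that must have run before History of Orders can be populated.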
5.3 Business Views Used for the TBC Sales Model
Table 4 lists all of the Business Views that were used to build the launch
tables for the TBC sales model, ready to be loaded into the DB2 OLAP Server
or Hyperion Essbase database. The launch tables for the facts contain all the
measures grouped by the lowest level members in each of the dimensions
that are defined in the database outline.
Table 4. List of Business Views Created for the TBC Sales Data Mart
No.  Business View Name
     Description

1    Regions
     Source: REGIONS.TXT
     Business View that contains all zip codes, and the associated city,
     state, and region where the company sells products
     Maps To: Market dimension in the TBC Simple and Advanced models. This
     Business View is used as a launch table for dynamically building the
     Market dimension.

2    Institution Customers
     Source: OUTCUSTI.TXT
     A transient Business View with the address and ID of all customers
     belonging to an Institution. The target table for this Business View
     is dropped once the All Customers Business View is created.

3    Retail Customers
     Source: OUTCUSTR.TXT
     A transient Business View with the address and ID of all retail
     customers. The target table for this Business View is dropped once the
     All Customers Business View is created.

4    Wholesale Customers
     Source: OUTCUSTW.TXT
     A transient Business View with the address and ID of all wholesale
     customers. The target table for this Business View is dropped once the
     All Customers Business View is created.

5    All Customers
     Source: Business Views 2, 3, and 4
     Business View with zip code, Customer ID, and category of all
     customers. This Business View combines information about Institution,
     Retail, and Wholesale customers.
     Maps To: Customer dimension in the TBC Expanded model. This Business
     View is used as a launch table for dynamically building the Customer
     dimension.

6    Product Groups
     Source: PRODGRP.TXT
     Business View with product group codes and their descriptions

7    Product Classes
     Source: PRODUCT.TXT
     Business View with product classification codes and their descriptions

8    Product Sizes
     Source: OUTPC.TXT
     Business View with product size codes and their descriptions

9    All Products
     Source: Business Views 6, 7, and 8
     Business View with all product sizes with their corresponding product
     group and product class information
     Maps To: Product dimension in the TBC Simple and Expanded models. This
     Business View is used as a launch table for dynamically building the
     Product dimension. The TBC Simple model does not use the product size
     codes and descriptions in this Business View.

10   Diet Products
     Source: Business Views 6, 7, and 8
     Business View with product sizes, product class, and product group
     information for all Diet products (all product class codes with
     ’%-20’) of TBC.
     Maps To: Product dimension in the TBC Simple and Expanded models. This
     Business View is used as a launch table for dynamically building the
     Diet Products hierarchy in the Product dimension. The TBC Simple model
     does not use the product size codes and descriptions in this Business
     View.

11   Customer Orders Current Month
     Source: OUTORDER.TXT
     Business View listing all purchases made by each customer, and the
     date the purchase took place

12   Ordered Items Current Month
     Source: OUTITEM.TXT
     Business View with detailed sales records of price amount for each
     delivery

13   Product Cost
     Source: PRODCOST.PRN
     Business View with product size codes and their unit costs

14   Ordered Items Expanded
     Source: Business Views 12 and 13
     Business View with detailed sales records with quantity, unit cost,
     and price amount for each ordered item

15   History of Orders
     Source: Business Views 11, 13, and 14
     Business View with all customer orders and all ordered items with 12
     editions, for each of the months

16   Launch Table for Simple
     Source: Business Views 1, 5, 9, and 15
     Business View with Sales, COGS, and Quantity grouped by Order
     Month/Year, Product Class, and City
     Maps To: Measures dimension in the TBC Simple model. This Business
     View is used as a launch table for loading data into the Simple model.

17   Dates
     Source: DATE.TXT
     A transient Business View with month, day, and year for all days in
     1997. This Business View target table will be dropped when the Time
     dimension Business View is created.

18   Time Dimension
     Source: Business View 17
     Business View with formatted month and day information for all days in
     1997
     Maps To: Time dimension in the TBC Expanded model. This Business View
     is used as a launch table for building the Time dimension dynamically
     in the TBC Expanded model.

19   Launch Table for Advanced
     Source: Business Views 1, 5, 15, and 18
     Business View with Sales, COGS, and Quantity grouped by Order
     Month/Year, Product Size code, zip code, and Customer code
     Maps To: Measures dimension in the TBC Expanded model. This Business
     View is used as a launch table for loading data into the Expanded
     model.
5.4 Automating the Process Using Visual Warehouse Programs
Up to this point, we have produced and structured all the necessary input for
launching the dimension build and for loading the data into the OLAP cube by
using SQL queries and result tables within Visual Warehouse Business
Views. Usually, this is not enough to build a population subsystem that is
capable of handling and automating all the tasks involved in building an
OLAP data mart from an end-to-end perspective.
That is why we want to take a closer look at another powerful concept
provided by Visual Warehouse, the Visual Warehouse Programs (VWPs).
5.4.1 Introduction to Visual Warehouse Programs
In general, the major purpose of VWPs is to provide an interface to integrate
and control the execution of all kinds of external programs. These external
programs could be user-written access logic to legacy data sources;
cleansing routines generated by, for example, Vality Integrity; ETI Extract
generated transformation programs; or database load routines.
VWPs are the most important feature of Visual Warehouse in that they enable
Visual Warehouse to be the control hub of all processes involved in building
the warehouse or the data mart(s).
VWPs are used in the following areas:
• Complex extractions and transformations
• Automation of the dynamic dimension building for the OLAP server
(Hyperion Essbase or DB2 OLAP Server)
• Automation of the periodic data load and calculation for the OLAP server
• Invocation of relational data loads
• Automation of routine database administration tasks (for example, the
DB2 REORG and RUNSTATS utilities)
• Dynamic metadata synchronization
VWPs can be implemented in various ways:
• Batch files, such as Windows NT command files, AIX scripts, or MVS JCL
• Programs written in an interpretive language like REXX
• Compiled programs written in a programming language such as C (.EXE)
• Dynamic link library functions (.DLL)
In our example, we show how to automate the process of loading the OLAP
cube from a launch table, as it would usually have to be done each time a
new reporting cycle is due. The same techniques can then be used to
automate calculation and regular maintenance activities (for example,
invoking the DB2 REORG and RUNSTATS utilities).
5.4.2 Understanding VWP Templates
Visual Warehouse comes with a set of generic VWP templates ready for use
in new Business Views. These templates cover common tasks such as
DB2 load, DB2 RUNSTATS, Essbase load, and Essbase calculation.
To see which VWP templates have been installed, select Visual Warehouse
Programs from the Definitions menu of the Visual Warehouse Desktop (see
Figure 77).
Figure 77. Visual Warehouse Programs
The Visual Warehouse Programs window opens, showing all available VWP
templates (see Figure 78).
Figure 78. Available Visual Warehouse Program Templates
To open a VWP template, select one and click Edit... from the File menu.
For a new VWP template, supply its Business Name, the Program Group,
Description, Program Executable Type, and the program name as invoked
from the command line. Some of the fields are mandatory only for certain
executable types. For example, the dynamic link library (DLL) function name
is required only for a DLL program. Refer to the online help for more
information about the required fields for each program executable type.
Figure 79. Visual Warehouse Program Definition
Some pointers for the screen in Figure 79:
• The Program Group is useful in classifying and grouping similar VWP
types. The categories make it easier to choose the correct VWP template
during the Business View definition process. Think of meaningful group
names that incorporate the operating system, programming language, and
category, such as Database Administration, Essbase, or Metadata
Synchronization.
• The Fully Qualified Program Name is dependent on the operating system
of the agent site where the VWP resides. For example, a VWP on a
Windows NT agent site will have an entry of c:\VWPdir\VWPpgm.exe as
the Fully Qualified Program Name, whereas a VWP on an AIX agent site
will have an entry of /VWPdir/VWPpgm.exe.
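The difference between the two path styles can be shown with Python's pathlib, which can build Windows and POSIX style paths on any machine. The function name qualified_program_name and the "windows" and "aix" labels are hypothetical; the directory and program names are the ones used in the example above.

```python
from pathlib import PurePosixPath, PureWindowsPath


def qualified_program_name(agent_os: str, directory: str, program: str) -> str:
    """Build a fully qualified program name in the agent site's path style."""
    if agent_os == "windows":
        return str(PureWindowsPath(directory) / program)
    if agent_os == "aix":
        return str(PurePosixPath(directory) / program)
    raise ValueError(f"unsupported agent operating system: {agent_os}")


nt_name = qualified_program_name("windows", r"c:\VWPdir", "VWPpgm.exe")
aix_name = qualified_program_name("aix", "/VWPdir", "VWPpgm.exe")
```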
Next select the Agent Sites tab (see Figure 80).
Figure 80. Agent Site for VWPs
Select the agent site on which the VWP should run. Whenever possible,
choose the agent that resides close to the target, to reduce traffic on the
network.
Next click on the Parameters tab (see Figure 81). The list of parameters
expected by the VWP is displayed.
Figure 81. VWP Parameter Definition
For Visual Warehouse supplied programs for DB2 OLAP, the parameter list
cannot be changed. A pop-up window collects the parameter values for a
specific Business View during the Business View definition (see Figure 82).
Figure 82. Visual Warehouse supplied VWPs - Parameter Popup Window
However, for a user-defined VWP you can define the list of parameters by
clicking the Insert button on the Parameters tab (see Figure 81). Visual
Warehouse inserts an empty row for you to enter the Parameter Name and
the Parameter Text. Should a text string be passed as a parameter to the
VWP, it must be enclosed in quotes.
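The quoting rule can be pictured with a short Python sketch that assembles a command line string from a parameter list. This illustrates the rule only and is not how Visual Warehouse builds the string internally. The parameter names and values are hypothetical, and embedded quotation marks are not handled.

```python
def command_line_string(program: str, parameters: list[tuple[str, object]]) -> str:
    """Assemble a command line, enclosing text parameters in double quotes."""
    parts = [program]
    for _name, value in parameters:
        if isinstance(value, str):
            parts.append(f'"{value}"')  # text strings are quoted
        else:
            parts.append(str(value))    # other values are passed as-is
    return " ".join(parts)


line = command_line_string(
    r"c:\VWPdir\VWPpgm.exe",
    [("Table name", "IWH.PROD_SIZES"), ("Retry count", 3)],
)
```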
When you update the parameters, the command line string does not reflect
the new entries until you click the Show button. Clicking Show is optional,
however, because Visual Warehouse has already captured the new information
and shows the correct parameter list the next time you open this
tab. The Usage tab provides a list of Business Views that use this VWP (see
Figure 83).
Figure 83. VWP Usage
This short overview should give you an idea of how the VWP templates are
used. Generally it is not necessary to change any parameters on the
templates. You will change them when you create a Business View in which
the VWP is involved.
5.4.3 Defining a Business View That Uses a VWP
In this example we use an Essbase load program to demonstrate how to
create a Business View that uses a VWP template. The same technique can
be applied to define Business Views for the automatic Essbase calculation
program and for DB2 RUNSTATS utility activation.
To define a VWP Business View, select the Warehouses tab from the Visual
Warehouse Desktop window, click on the corresponding target warehouse,
and create a new Business View definition as shown in Figure 84 and Figure
85.
Figure 84. Creating a New Business View for a VWP
Figure 85. Creating a New Business View for a VWP (continued)
Enter a Business Name, in our example Essbase load. Select the Program
Group and the Program Name from the corresponding drop-down lists.
From the Select Source list choose the launch table for the load.
If there are many VWPs, choose the correct Program Group before selecting
the Program Name. The list of program names corresponds to the list shown
in Figure 78 on page 118. Then click OK.
A pop-up window prompts you to provide the parameters necessary for the
execution of the selected VWP (see Figure 86).
Figure 86. Business View VWP Parameter Definition
In our example, we used the VWP ESSDATA3, which loads data from the
specified launch table into the DB2 OLAP Server or Hyperion Essbase
database using the specified Data Load Rules.
Table 5 shows the parameter list for ESSDATA3. The list includes the
predefined token for a parameter if one exists.
Table 5. Parameters for Essbase Load VWP (ESSDATA3)
Description                        Predefined token   Comment
ESSBASE server name                None               Server where the DB2 OLAP Server resides
ESSBASE application name           None               Name of the Essbase application
ESSBASE database name
ESSBASE user ID
ESSBASE password
Source file name (1)               &STBN              Source Table/File Name
Source file location flag (2)
Load rules file name (1)                              The rules file created in Essbase
Load rules file location flag (2)
Essbase utility abort flag (3)
Notes:
(1) The file name must follow the Essbase convention for specifying file
names on the client or server. For more information, see the Essbase
documentation.
(2) Set this parameter to one of the following values: The file is on the
agent site, The file is on the Essbase server.
(3) Set this parameter to one of the following values: Abort the utility on
error, Do not abort the utility on error.
You do not have to enter anything in the Column Mapping, SQL, and Source
Windows. This will all be done by the VWP. Click the Information tab (see
Figure 87).
Figure 87. VWP Information
Here you can enter the Business View and the Admin Contact and choose an
Update Security Group and the agent site on which the Business View runs.
When you select the agent site, choose the agent that resides close to the
target.
The rest of the information is not available in this case. Click the Program
tab. On the Program tab (see Figure 88) you can select a Program Group
and a Program Name. However, you do not have to change anything here at
this point.
Figure 88. Business View Program Definition
The Command Line String as specified in the VWP template is displayed
initially. For a specific VWP, the string defined in the template is complete
and there should be no need to alter any of the parameters.
5.4.3.1 Postprocessing VWPs
To have Visual Warehouse activate a VWP after processing a Business View,
select the Program button in the Schedule tab (see Figure 89). This will
activate the Cascade Program window (see Figure 90).
Figure 89. Scheduling a Business View
Choose the correct Program Group and select the Program Name. The list of
program names corresponds to the list in Figure 78 on page 118.
The rest of the information about resources and outputs is for
documentation purposes, as the VWP is responsible for its data processing.
To remove a VWP from a Business View, select <none> from the Program
Group or Program Name drop-down list.
The Command Line String as specified in the VWP template is displayed
initially. For a generic VWP, this parameter list should be modified for the
Business View. Click the Edit button to modify the parameter values.
Figure 90. Visual Warehouse Cascade Program
5.4.3.2 Executing VWP Business Views
In Visual Warehouse, the Business View needs to be promoted from
development to test state before it can be executed. A cascading Business
View, as in this example, must be promoted to production; otherwise the
Business View postprocessing will not work.
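The status rules can be summarized in a small Python sketch. The enumeration and function names are hypothetical and model only the two rules stated in this chapter: a Business View is executable in test or production status, and cascading requires production status.

```python
from enum import Enum


class Status(Enum):
    DEVELOPMENT = 1
    TEST = 2
    PRODUCTION = 3


def can_execute(status: Status) -> bool:
    """A Business View is executable only in test or production status."""
    return status in (Status.TEST, Status.PRODUCTION)


def can_cascade(status: Status) -> bool:
    """Postprocessing (cascading) requires production status."""
    return status is Status.PRODUCTION
```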
Select Promote to Test from the Status menu in the Business View window
(see Figure 91).
Figure 91. Business View Promotion
Repeat the process and promote the Business View to production.
To execute the Business View, go to the Work in Progress window (Figure
92).
Figure 92. Executing a Business View
Select New... (see Figure 93). The list of available Business Views is
displayed. Select the Essbase load Business View and click OK to execute it.
Figure 93. Executing a Business View (continued)
Visual Warehouse will start the Business View as a background task and
display the Work in Progress pane, where you can monitor the execution of
the Business View. If the Business View fails, check the Work in Progress log
and/or the VWP’s own transaction logs to work out the problem and retry.
Some important points to note about testing and executing VWP Business
Views:
• A TRC###.LOG file will be created when a Business View that executes a
VWP is running. The file is written to the directory specified in the
environment variable VWS_LOGGING, which defaults to
\VWSWIN\LOGGING.
• Make sure that the VWP can get to the directories and files at the agent
site. Check on the PATH and LIBPATH environment variable settings and
the drive settings. Visual Warehouse can start these processes at the
agent sites as system or user processes.
• On the Windows NT platform, check the system variables and the user
variables in the environment section of the system settings (Control
Panel). This tends to be the key area of failure especially during VWP
testing between the Visual Warehouse manager and its agent site.
• Some programming languages set up the path statement in user variables
only. For example, Object REXX for Windows NT sets up the path
statement in the user variable section, and DB2 sets it up in the system
variable section only. Depending on whether they are started as user
processes or system processes, VWPs will fail if the program cannot find
either DB2 routines or Object REXX routines.
• Make sure that the password has not expired at the agent site.
• Ensure that the VWPs have read/write access to the required directories.
• Make sure that cascading Business Views and post-Business View VWPs
are in production status. Otherwise the Visual Warehouse scheduler will
not kick off the child Business Views or VWPs.
• Code the VWPs with a processing log such that the processing steps can
be traced during testing and production runs.
• During a DB2 load, the tablespace is locked. We recommend putting each
table into its own tablespace so that a load failure does not affect other
users of the DB2 system. This will also allow multiple load jobs to be
executed concurrently.
• Complete unit testing of the VWP outside Visual Warehouse and make
sure that it is working correctly before including it in the Business View.
Review the flow of events in the Visual Warehouse log to detect the
possible failure point.
• During testing, check on the system event logs or watch for console
messages. They may contain clues to the reason for failures.
• VWPs can also return processing codes to Visual Warehouse. Such a code
appears as a processing error (RC1=8410, with RC2 set to the error code
returned by the VWP).
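Several of the checks above (program reachable through PATH, directories present and accessible) can be scripted before a VWP is handed over to Visual Warehouse. The Python sketch below is one possible pre-flight check. The function name preflight is hypothetical, and the check reflects the environment of the process that runs it, which may differ from the environment the agent gives to a system or user process.

```python
import os
import shutil


def preflight(program: str, directories: list[str]) -> list[str]:
    """Return a list of problems that would likely make the VWP fail."""
    problems = []
    # shutil.which searches the PATH of the current process.
    if shutil.which(program) is None:
        problems.append(f"program not found on PATH: {program}")
    for directory in directories:
        if not os.path.isdir(directory):
            problems.append(f"directory does not exist: {directory}")
        elif not os.access(directory, os.R_OK | os.W_OK):
            problems.append(f"no read/write access: {directory}")
    return problems
```

An empty list means that none of the checked conditions failed. It does not guarantee that the VWP will run correctly under the agent.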
5.4.4 Developing Custom VWP Templates
Some important points to consider before developing VWPs include:
• Choice of programming language - this will depend on the availability of
the programming language for the operating environment of the agent site
and the programming skills available.
• When the data warehouse outgrows the database capacity, the target
warehouse may need to migrate to another platform. For example, you
may have to move the table from DB2 on Windows NT to DB2 on AIX. The
existing agent site can still be used, but the data has to move from the
source to the target through the agent site. For better performance, the
VWPs should also be migrated to the AIX platform. Some programming
effort is required, and the amount of effort will depend on the choice of the
programming language.
• Consider the effort to maintain the VWPs. Windows batch files require less
maintenance than a C program, but their function is limited to operating
system commands. A REXX program is a compromise: it requires no
compilation, whereas a C program must be compiled.
• The VWP’s execution speed
• The choice of the agent site to run the VWPs. A VWP requires machine
cycles at the agent site, and the movement of data has an impact on the
network traffic.
• The VWP should generate a run-time log so that operations can follow up
on run-time errors. The VWP cannot have any screen I/O processing.
Visual Warehouse only provides information about whether a VWP was
executed successfully or unsuccessfully.
• Design the VWP as a generic program so that it can be reused in multiple
Business Views. For example, a VWP that performs the DB2 RUNSTATS
operation can be supplied with the name of the table as a parameter.
Visual Warehouse comes with a set of generic C programs provided both
as samples and ready for use in new Business Views. Refer to the Visual
Warehouse product CD for the sample code as well as documentation on
how to use these programs in the Business Views.
• VWPs must be coded and tested like normal programs outside Visual
Warehouse. Include only well-tested programs in Visual Warehouse;
otherwise problems are very hard to debug.
• Decide on whether the agent site will execute the VWP as a system
process or a user process in the Windows NT environment.
• If the VWP is a Windows NT batch file that calls another Windows batch
file, the second batch file runs asynchronously, and Visual Warehouse
receives a successful VWP return code even though the second batch job
may not have completed.
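Some of these design points (a generic program driven by a parameter, a run-time log, no screen I/O, and a return code) can be sketched as follows. The sketch is written in Python purely for illustration; the implementation options named in this chapter are batch files, REXX, C programs, and DLLs. The function name run_vwp, the log format, and the return code 8 are assumptions, and the actual maintenance step is left as a placeholder.

```python
import datetime


def run_vwp(args: list[str], log_path: str) -> int:
    """Run one hypothetical maintenance step and return a process exit code."""
    with open(log_path, "a", encoding="utf-8") as log:

        def note(message: str) -> None:
            # All diagnostics go to the log file; nothing is written to the screen.
            log.write(f"{datetime.datetime.now().isoformat()} {message}\n")

        if len(args) != 1:
            note("ERROR expected exactly one parameter: the table name")
            return 8  # nonzero code chosen arbitrarily for this sketch
        table = args[0]
        note(f"START maintenance step for {table}")
        # Placeholder: the real work (for example, invoking RUNSTATS) goes here.
        note(f"END maintenance step for {table}")
        return 0
```

A wrapper script would pass its command line arguments to run_vwp and exit with the returned code, so that the caller can tell success from failure while the log file holds the details.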
Chapter 6. A Closer Look at Calculating the OLAP Database
We can distinguish two different kinds of data in the OLAP database:
• Input data, which is entered into the database during the load process
• Aggregated data, which is calculated from the input data
In this chapter we look at how DB2 OLAP Server and Hyperion Essbase
calculate the database and explain what happens during default calculations
and what can be achieved with calculation scripts.
The default calculation, CALC ALL, calculates the database based on the
outline. If only a specific dimension must be calculated, the CALC DIM
command does the job.
If there are calculations that differ from the logic defined in the outline,
calculation scripts can help to run customized calculations.
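The distinction between input data and aggregated data can be pictured with a much-simplified Python sketch of an outline-driven rollup for a single dimension. The outline fragment, member names, and values are hypothetical, and only additive consolidation is modeled. It is not a description of how the OLAP engine performs CALC ALL.

```python
# Hypothetical outline fragment: parent member -> child members.
OUTLINE = {
    "Year": ["Qtr1", "Qtr2"],
    "Qtr1": ["Jan", "Feb", "Mar"],
    "Qtr2": ["Apr", "May", "Jun"],
}


def roll_up(input_data: dict[str, float]) -> dict[str, float]:
    """Return the input values plus aggregated values for every parent member."""
    values = dict(input_data)

    def total(member: str) -> float:
        if member not in values:
            # Parent member: aggregate its children (addition only in this sketch).
            values[member] = sum(total(child) for child in OUTLINE[member])
        return values[member]

    for parent in OUTLINE:
        total(parent)
    return values


cube = roll_up({"Jan": 10, "Feb": 20, "Mar": 30, "Apr": 5, "May": 5, "Jun": 5})
```

The six monthly values are the input data; the Qtr1, Qtr2, and Year values exist only after the rollup has run.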
6.1 Formulas
The heart of most mathematical calculations is a formula. In DB2 OLAP
Server, formulas are used to calculate the relationship between different
members in the outline. There are two different application areas for
formulas:
1. As a member of a database outline
2. As part of a calculation script
Complex formulas can be defined with a variety of DB2 OLAP Server
functions. With the Outline and Calculation Script Editors we can construct
formulas using an expression builder.
6.2 Functions
Within the formulas we can use predefined functions to calculate the cube.
The provided functions can be categorized into the following groups:
Conditional Functions
Conditional functions are used to control the flow of events in formulas. With
IF/ELSE/ELSEIF/ENDIF functions we can control which formulas are
executed within a calculation.
Mathematical Functions
Mathematical functions (see Table 6) return a calculated value based on the
defined parameters. They are used to perform specific mathematical and
statistical calculations.
Table 6. Essbase Mathematical Functions
Function Description
@ABS absolute value
@AVG average of all values
@AVGRANGE average value across a specified range
@FACTORIAL returns the factorial of expression
@INT returns the next lowest integer value of an expression
@MAX returns the maximum value among the results of the
expressions
@MAXRANGE returns the maximum value of mbrName across a range of
members
@MIN returns the minimum value among the results of the
expressions
@MINRANGE returns the minimum value of mbrName across a range of
members
@MOD calculates the modulus of a division operation
@POWER returns the value of a specified member or expression raised
to power
@REMAINDER returns the remainder value of an expression
@ROUND rounds an expression to numDigits digits
@STDDEV returns the standard deviation of all values of the member list
@STDDEVRANGE returns the standard deviation of all values of the specified
members across the specified range
@SUM returns the summation of all values in the member list
@SUMRANGE returns the summation of all values of the specified member
across the specified range
@TRUNCATE removes the fractional part of the expression, returning the
integer
@VAR calculates the variance (difference) between two members
@VARPER calculates the percent variance (difference) between two
members
Index Functions
Index functions (Table 7) are used to look up data within a database during a
calculation based on the current cell location and a series of indexing
parameters. With index functions we can refer to another value in a data
series. The default indexing range is the Time dimension. However, other
dimensions can be used.
Table 7. Essbase Index Functions
Function Description
@ANCESTVAL returns the ancestor values of a specified member
combination
@MDANCESTVAL returns ancestor-level data from multiple dimensions
based on the current member
@SANCESTVAL returns ancestor-level data based on the shared
ancestor value of the current member
@NEXT returns the nth cell value in the sequence rangeList
from mbrName
@PARENTVAL returns the parent values of a specified member
combination
@MDPARENTVAL returns parent-level data from multiple dimensions
based on the current member
@SPARENTVAL returns parent-level data based on the shared parent
value of the current member
@PRIOR returns the nth previous cell member from mbrName
in rangeList
@SHIFT returns the nth cell value in the sequence rangeList
from mbrName
@MDSHIFT shifts a series of data values across multiple
dimension ranges
@CURGEN returns the generation number of the current member
@CURLEV returns the level number of the current member
@GEN returns the generation number of the specified member
@LEV returns the level number of the specified member
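The relative lookups performed by @PRIOR, @NEXT, and @SHIFT can be pictured with a small Python model. This is only a sketch of the idea, not Essbase code: the shift helper, the month list, and the sales figures are invented for this illustration, and None stands in for a missing value when the lookup falls outside the range (an assumption about the boundary behavior).

```python
# Illustrative model of relative lookups along a range (not Essbase code).
months = ["Jan", "Feb", "Mar", "Apr"]                      # hypothetical rangeList
sales = {"Jan": 100, "Feb": 120, "Mar": 90, "Apr": 130}    # hypothetical data


def shift(values, range_list, current, n):
    """Return the value n positions away from `current` in range_list.

    Positive n looks forward (like @NEXT), negative n looks back
    (like @PRIOR). Positions outside the range yield None, which
    stands in for a missing value here.
    """
    pos = range_list.index(current) + n
    if 0 <= pos < len(range_list):
        return values[range_list[pos]]
    return None


def prior(values, range_list, current, n=1):
    """Value n positions before the current member."""
    return shift(values, range_list, current, -n)


def next_(values, range_list, current, n=1):
    """Value n positions after the current member."""
    return shift(values, range_list, current, n)


print(prior(sales, months, "Mar"))   # value stored for Feb
print(next_(sales, months, "Mar"))   # value stored for Apr
print(prior(sales, months, "Jan"))   # None: nothing precedes Jan
```

The current member plays the role of the current cell location; in the model it is passed explicitly, whereas in a formula it is implied by the cell being calculated.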
Financial Functions
Financial functions (Table 8) perform a number of specialized financial
calculations. They never return a value; rather they calculate a series of
values internally based on the range specified.
Table 8. Essbase Financial Functions
Function Description
@ACCUM accumulates the values of mbrName within rangeList,
up to the current member in the dimension
@COMPOUND compiles the proceeds of a compounded interest
calculation based on the balances of a specified
member at a specified rate across a specified range
@COMPOUNDGROWTH calculates a series of values that represent a
compound growth of the first nonzero value in the
specified member across a range of members
@DECLINE calculates the depreciation of an asset for a specific
period, using the declining balance method
@DISCOUNT calculates a value discounted by the specified rate,
from the first period of the range to the period in which
the amount to discount is found
@GROWTH calculates a series of values that represent a linear
growth of the first nonzero value encountered in
principalMbr across the range specified by rangeList
@INTEREST calculates the simple interest in balanceMbr at the rate
specified by creditrateMbrConst if the value specified
by balanceMbr is positive, or at the rate specified by
borrowrateMbrConst if balanceMbr is negative.
The interest is calculated for each time period
specified by rangeList.
@IRR calculates the Internal Rate of Return on a cash flow
that must contain at least one investment (negative)
and one income (positive) value
@NPV calculates the Net Present Value of an investment
based on the series of payments (negative values)
and income (positive values)
@PTD calculates the period-to-date values of members in the
Time dimension
@SLN calculates the amount per period that an asset in the
current period may be depreciated, calculated across
a range of periods
@SYD calculates across a range of periods the amount per
period that an asset in the current period may be
depreciated. The depreciation method used is the sum
of the year's digits.
Macro Functions
Macro functions (Table 9) are used to generate a list of members based on a
specified member (not the current member) and the command used. When a
macro is called as part of a formula, the list of members is generated before
the calculation starts. As the list is based on the specified member, it never
varies and is independent of the current member.
Table 9. Essbase Macro Functions
Function Description
@ALLANCESTORS * expands the selection to include all ancestors of the
specified member, including ancestors of any
occurrences of the specified member as a shared
member
@ANCESTORS * expands the selection to include either all ancestors of
the specified member or those up to a specified
generation or level
@CHILDREN * expands to include all children of the specified member
@CURRMBRRANGE returns a member range list based on the relative
position of the current member being calculated
@DESCENDANTS * expands to include either all descendants of the
specified member or those down to a specified
generation or level
@GENMBRS expands the selection to include all members with the
specified generation number or generation name in the
specified dimension
@LEVMBRS expands the selection to include all members with the
specified level number or level name in the specified
dimension
@LSIBLINGS * expands to include all of the left siblings of the specified
member; left siblings are children that share the same
parent as the member and that precede the member in
the database outline.
@MATCH allows wildcard member selections
@RELATIVE selects all members at a specified generation or level
that are above or below a specified member
@SIBLINGS * expands to include all siblings of the member
@UDA selects members based on a common attribute, which
is defined as a User-Defined Attribute (UDA)
All macros marked with * exclude the specified member. To include the specified
member, use the variant of the macro with an I inserted after the @ sign; for
example: @ICHILDREN
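The difference between the exclusive and inclusive macro variants can be sketched in Python on a tiny outline fragment. The outline dictionary and its member names are made up for this illustration, and the helper functions only mimic the idea of @CHILDREN, @ICHILDREN, and @DESCENDANTS; they are not part of any Essbase API.

```python
# Hypothetical outline fragment: parent -> children, in outline order.
outline = {
    "Market": ["East", "West"],
    "East": ["New York", "Boston"],
    "West": ["Denver", "Aspen"],
}


def children(member, inclusive=False):
    """Mimic @CHILDREN, or @ICHILDREN when inclusive=True."""
    result = list(outline.get(member, []))
    return [member] + result if inclusive else result


def descendants(member, inclusive=False):
    """Mimic @DESCENDANTS, or the inclusive variant when inclusive=True."""
    result = [member] if inclusive else []
    for child in outline.get(member, []):
        # Every child is itself part of the descendant list.
        result.extend(descendants(child, inclusive=True))
    return result


print(children("East"))                  # excludes East itself
print(children("East", inclusive=True))  # starts with East
print(descendants("Market"))             # all members below Market
```

Because the list depends only on the specified member, calling a helper twice with the same argument always yields the same list, which mirrors the statement that a macro result never varies with the current member.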
Boolean Functions
Boolean functions (Table 10) perform tests on a specified member, returning
either a TRUE (1) or FALSE (0) value. These functions are generally used in
conjunction with an IF function to provide a conditional test. Because they
generate a numeric value, they can also be used as part of a member
formula. Boolean functions are useful because they can determine which
formula to use based on characteristics of the current member combination.
Table 10. Essbase Boolean Functions
Function Description
@ISACCTYPE returns TRUE if the current member has the associated
accounts tag
@ISANCEST returns TRUE if a current member is an ancestor of the
specified member
@ISCHILD returns TRUE if a current member is a child of the specified
member
@ISICHILD returns TRUE if a current member is the specified member or a
child of the specified member
@ISIDESC returns TRUE if a current member is the specified member or a
descendant of the specified member
@ISIANCEST returns TRUE if a current member is the specified member or
an ancestor of the specified member
@ISGEN returns TRUE if the current member of the specified dimension
is of a certain generation
@ISLEV returns TRUE if the current member of the specified dimension
is of the specified level
@ISMBR returns TRUE if a current member matches any one of the
specified members
@ISIPARENT returns TRUE if a current member is the specified member or
the parent of the specified member
@ISPARENT returns TRUE if a current member is the parent of the specified
member
@ISSAMEGEN returns TRUE if the current member in the same dimension is
of the same generation as the specified member
@ISSAMELEV returns TRUE if the current member in the same dimension is
of the same level as the specified member
@ISISIBLING returns TRUE if a current member is the specified member or a
sibling of the specified member
@ISSIBLING returns TRUE if a current member is a sibling of the specified
member
@ISUDA returns TRUE if the specified user-defined attribute (UDA)
exists for the current member of the specified dimension at the
time of the calculation
Range Functions
Range functions declare a range of members as an argument to another
function or command. The range is passed as rangeList, an optional
parameter that is always the last parameter.
Examples of valid range lists are:
• A single member
• A comma-delimited list of members
• A level range; this includes all members on the same level between and
including the members defining the range
• A generation range; this includes the members defining the range and all
members within the range of the same generation
• A generation range and comma-delimited list
• A macro and other range lists
6.3 Calculation Commands
Calculation scripts are used to define calculations that differ from those
defined in the database outline. They can calculate either all or parts of a
database.
After a database is created, an internal default calculation script is created
and set to CALC ALL, which means that it calculates all dimensions
based on the database outline's hierarchical relationships and formulas.
However, you can use custom calculation scripts to implement specialized
calculation requirements.
You can use such a custom calculation script to refer to calculation rules
defined in the database outline, or you can specify entirely new formulas,
calculation formats, and calculation orders.
Calculation script operations can be categorized as follows:
Data Declaration Commands
These commands are used to declare, and set the initial values of, temporary
variables (see Table 11). The values stored in a variable cannot be retrieved
directly by, for example, a spreadsheet, because they exist only while the
calculation script is being processed. If you want to report these values, you
have to create members within the database outline or assign the values from
the variables to existing members.
Table 11. Essbase Data Declaration Commands
Command Description
ARRAY declares one-dimensional array variables
VAR declares a temporary variable that contains a single value
& prefaces a substitution variable in a calculation script
Control Flow Commands
Control flow commands (Table 12) are used to iterate a set of commands or
to restrict the command’s effect to a specified subset (partition) of the
database. These commands are used to control the calculation flow of a
calculation script. The FIX...ENDFIX command can be used to restrict a
calculation to a particular member or members; the LOOP...ENDLOOP
command lets you repeat commands.
Table 12. Essbase Control Flow Commands
Command Description
FIX...ENDFIX restricts database calculations to a subset of the database. All
commands nested between the FIX and ENDFIX statements
are restricted to the specified database subset.
LOOP...ENDLOOP specifies the number of times to iterate calculations. All
commands between the LOOP and ENDLOOP statements are
performed the number of times that you specify.
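The effect of the two control flow commands can be modeled in a few lines of Python. This is a sketch of the concept only: the cells dictionary, the fix and loop helpers, and the 10 percent increase are assumptions made up for this illustration and do not correspond to Essbase syntax.

```python
# Illustrative model of FIX...ENDFIX and LOOP...ENDLOOP (not Essbase code).
# Hypothetical data: (Market, Scenario) -> value.
cells = {
    ("East", "Budget"): 100.0,
    ("East", "Actual"): 90.0,
    ("West", "Budget"): 200.0,
    ("West", "Actual"): 210.0,
}


def fix(data, markets=None, scenarios=None):
    """Return the cell keys inside the fixed subset of the database."""
    return [
        (market, scenario)
        for (market, scenario) in data
        if (markets is None or market in markets)
        and (scenarios is None or scenario in scenarios)
    ]


def loop(times, action):
    """Repeat an action a fixed number of times, like LOOP...ENDLOOP."""
    for _ in range(times):
        action()


def raise_budget():
    # Only the fixed subset (Budget in East) is touched.
    for key in fix(cells, markets={"East"}, scenarios={"Budget"}):
        cells[key] = round(cells[key] * 1.1, 2)


loop(2, raise_budget)
print(cells[("East", "Budget")])   # raised twice by 10 percent
print(cells[("West", "Budget")])   # outside the fixed subset, unchanged
```

Everything outside the fixed subset keeps its value, which is the point of restricting a calculation with FIX...ENDFIX.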
Computation Commands
Computation commands (Table 13) are used to perform operations such as
calculation, data copying, clearing data, and currency conversion.
Table 13. Essbase Computation Commands
Commands
AGG              CALC ALL          CALC AVERAGE
CALC DIM         CALC FIRST        CALC LAST
CALC TWOPASS     CCONV             CLEARBLOCK
CLEARDATA        DATACOPY          SET AGGMISSG
SET LOCKBLOCK    SET CALCHASHTBL   SET CLEARUPDATESTATUS
SET UPTOLOCAL    SET UPDATECALC    SET FRMLBOTTOMUP
SET NOTICE       SET MSG           SET CACHE
SET commands in a calculation script are procedural. The first occurrence of
a SET command in a calculation script stays in effect until the next
occurrence of the same SET command.
For a more detailed description of computation commands, see the Online
Technical Reference in the \DOCS directory of the OLAP Server installation.
6.4 Building an Outline Formula
We use the Formula Editor to build the formulas. All formulas are pure ASCII
text. Alternatively, you can write a formula in the text editor of your choice
and then paste it into the Formula Editor.
The Formula Editor has a text editing pane, customized menus, a syntax
checker, point-and-click member selection, and function and macro
templates.
In the outline for the expanded TBC sales model (Figure 94) we have the
Scenario dimension, which includes two members, Variance and Variance%.
They contain the calculated difference and the difference percentage of the
Actual and Budget measures.
Figure 94. Outline for Scenario Dimension
Follow these steps to create the corresponding formulas:
1. Connect to the DB2 OLAP Server, select the TBC application and the
expanded TBC sales database.
2. Open the outline, expand the Scenario dimension, and select the Variance
member.
3. Open the Formula Editor by clicking the Formula icon (the icon with the =
sign). With the Formula Editor we can develop member formulas in an
integrated development environment.
4. Next click the Function icon. This opens the Function and Macro
Templates dialog box, which can be used to paste commands, functions,
and operators including sample arguments into formulas and calculation
scripts.
The Categories list box contains categories of commands, functions, and
operators like Math or Macros.
The Templates list box contains the valid commands, functions, or
operators for the category selected. Choose the specific template to use.
In our case we have to choose the Math category and the @VAR
template.
The Insert Arguments checkbox pastes the argument template into the
formula or calculation script. DB2 OLAP Server displays these arguments,
along with the function or macro, above the Insert Arguments checkbox
(see Figure 95).
Figure 95. Function and Macro Template
5. In the Editor Window we will see the pasted function including the
arguments:
@VAR(mbrName1,mbrName2)
6. Now we have to substitute mbrName1 and mbrName2 with the
appropriate arguments, in our case, Actual and Budget. This can be done
by overtyping or by pasting them from the Members List into the Editor
with a double-click.
7. To verify the formula click on the Verify icon of the Formula Editor. This
checks the syntax of the member formula (see Figure 96).
When the syntax checker finishes checking the formula, it displays the first
error (if it finds any) and a brief description of it in the message box at the
bottom of the editor.
Note: The syntax checker only validates the language (or syntax) of the
formula. The formula may use the correct syntax but still contain logic
(semantic) errors. In our case (see Figure 96) the end-of-line semicolon is
missing. We can correct that and save the formula.
Figure 96. Verify a Formula
8. When you exit the Formula Editor, the formula created should appear in
the Variance member of the outline.
Use the same procedure to define the formula for Variance%.
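To make the two Scenario formulas concrete, the following Python sketch shows the arithmetic we expect from a variance and a percent variance. It is an illustration under stated assumptions, not the product's implementation: the function names var and varper, the sample figures, the sign reversal for expense items, and the use of None when the budget is zero are all assumptions made for this example.

```python
def var(actual, budget, expense=False):
    """Variance between two values.

    Assumption: for expense items the sign is reversed, so that
    spending less than budgeted gives a positive variance.
    """
    return budget - actual if expense else actual - budget


def varper(actual, budget, expense=False):
    """Percent variance relative to the second value (budget).

    Returns None when the budget is zero; None stands in for a
    missing value in this sketch.
    """
    if budget == 0:
        return None
    return var(actual, budget, expense) * 100 / budget


# Hypothetical figures for one cell.
actual_sales, budget_sales = 110, 100
print(var(actual_sales, budget_sales))      # Variance
print(varper(actual_sales, budget_sales))   # Variance%
```

With these sample figures, sales that exceed the budget by 10 give a variance of 10 and a percent variance of 10, while the same overrun on an expense item would give negative results.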
6.5 Outline Calculations
The simplest, most efficient, and most frequently used calculation method is
the outline calculation. This calculation is based on the relationship of
members within the outline including the attached formulas. This calculation
normally aggregates and rolls up the data in a hierarchical manner and can
also include computed members, which are defined by a formula.
Essbase stores all data in blocks. One block exists for every sparse member
combination where at least one input value is present. In our example we
have the Product and Market dimensions set to sparse, and the Year,
Measures, and Scenario dimensions set to dense. Figure 97 shows a data
block for this outline.
Figure 97. Essbase Data Block
Essbase creates one block for each unique combination of sparse dimension
members for which input data exists (see Figure 98). If we create, for
instance, a block for the combination 100-10 (Kool Cola) in Aspen, the
resulting data block contains all values for Measures, Year, and Scenario.
Figure 98. Essbase Data Block Combination
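The rule that one block exists per sparse combination with input data can be sketched in Python. The record layout, the sample values, and the build_blocks helper are invented for this illustration; a real data block also reserves cells for every dense combination, which the dictionary below does not model.

```python
from collections import defaultdict

# Hypothetical input records:
# (Product, Market, Year, Measures, Scenario, value)
records = [
    ("100-10", "Aspen",  "Jan", "Sales", "Actual", 120.0),
    ("100-10", "Aspen",  "Feb", "Sales", "Actual", 130.0),
    ("100-10", "Denver", "Jan", "Sales", "Actual", 80.0),
    ("200-10", "Aspen",  "Jan", "COGS",  "Actual", 40.0),
]


def build_blocks(rows):
    """Group input cells into blocks keyed by the sparse combination."""
    blocks = defaultdict(dict)
    for product, market, year, measure, scenario, value in rows:
        sparse_key = (product, market)             # sparse dimensions
        dense_key = (year, measure, scenario)      # dense dimensions
        blocks[sparse_key][dense_key] = value
    return dict(blocks)


blocks = build_blocks(records)
print(len(blocks))                      # one block per sparse combination
print(("200-10", "Denver") in blocks)   # no input data, so no block
```

Four input values produce only three blocks here, because two of the values share the same Product and Market combination.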
The OLAP engine calculates the data in a specific order when the default
calculation, CALC ALL, is used for the database. All dense dimensions within
each block are calculated first. Therefore a data block is read into memory
and all the calculations required within this block are done before the next
block is calculated.
Within the dense dimensions the calculation sequence is:
• Dimensions tagged as Accounts or Time are calculated first.
• Other dense dimensions are calculated in the order in which they appear
in the outline.
• Level 0 values are calculated first in the order in which they appear in the
outline followed by level 1. This procedure is continued until all levels are
calculated.
• If there are formulas or unary operators like *, / , or % on members in the
outline, be careful and consider the calculation order. The result of these
operations may be overwritten by subsequent calculations.
The sparse dimensions follow the same rules as the dense dimensions. But
remember, each sparse combination requires a complete data block in
memory.
6.6 Two-Pass Calculations
DB2 OLAP Server and Hyperion Essbase calculate the database in one pass
whenever possible. However, some calculations need more than one
calculation pass. In that case the calculation engine has to bring data blocks
back into memory, do some calculations, and save them again.
In some cases there may be significant performance improvements when a
formula in the outline is tagged as Two Pass rather than repeated in a
calculation script, but that applies only if all members of the Two
Pass calculation formula are contained in dense dimensions. DB2 OLAP
Server automatically recalculates all formulas tagged as Two Pass on the
dimension tagged as Accounts. The Two Pass tag can be
applied to any member in an outline, but it works only on dimensions that
have the Two Pass Calculation checkbox marked in the Database Settings.
The example in Table 14 shows why, in some cases, it is necessary to
calculate a formula twice (Two Pass). Following the calculation rules, the
Accounts dimension is calculated first. Then the Time dimension is calculated
and rolled up. This roll-up also sums the monthly Profit% results into Qtr1,
which gives 30 instead of the correct 10. Because the formula that calculates
Profit% is tagged as Two Pass, all the Profit% members are recalculated in a
second pass.
Table 14. Two-Pass Calculation
First Pass
Accounts               Jan    Feb    Mar    Qtr1
Profit                 100    100    100     300
Sales                 1000   1000   1000    3000
Profit % (Two Pass)     10     10     10      30
Second Pass
Accounts               Jan    Feb    Mar    Qtr1
Profit                 100    100    100     300
Sales                 1000   1000   1000    3000
Profit % (Two Pass)     10     10     10      10
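The numbers in Table 14 can be reproduced with a short Python sketch. It models only the order of the steps (Accounts formula, Time roll-up, second pass) using the figures from the table; the dictionaries and the profit_pct helper are made up for this illustration and say nothing about how the calculation engine is implemented.

```python
months = ["Jan", "Feb", "Mar"]
profit = {m: 100 for m in months}
sales = {m: 1000 for m in months}


def profit_pct(p, s):
    """Profit as a percentage of sales."""
    return p * 100 / s


# First pass, step 1: the Accounts dimension is calculated,
# so the Profit % formula runs for every month.
pct = {m: profit_pct(profit[m], sales[m]) for m in months}

# First pass, step 2: the Time dimension rolls every account up to Qtr1.
# Profit % is summed like any other member, which is where the error arises.
profit["Qtr1"] = sum(profit[m] for m in months)
sales["Qtr1"] = sum(sales[m] for m in months)
pct["Qtr1"] = sum(pct[m] for m in months)
first_pass_qtr1 = pct["Qtr1"]

# Second pass: the Two Pass formula is evaluated again, this time on the
# rolled-up Profit and Sales values for Qtr1.
pct["Qtr1"] = profit_pct(profit["Qtr1"], sales["Qtr1"])

print(first_pass_qtr1)   # 30.0 after the first pass
print(pct["Qtr1"])       # 10.0 after the second pass
```

The monthly values are the same in both passes; only the rolled-up Qtr1 cell changes, because a ratio cannot be obtained by adding ratios.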
6.7 Intelligent Calculation
With Intelligent Calculation activated, DB2 OLAP Server and Hyperion
Essbase calculate only those data blocks that have been modified since the
last calculation. This could have a significant impact on the performance of
the calculation process, if only a few data elements are updated or the
database is only partially loaded.
However, the performance improvement that can be achieved with Intelligent
Calculation is a trade-off between the time saved due to calculating fewer
blocks and the time spent to check which blocks have to be calculated.
By default, Intelligent Calculation is activated. You can turn it off by using the
UPDATECALC setting in the ESSBASE.CFG file or the SET UPDATECALC
command in a calculation script.
Intelligent Calculation works on the data block level, not on the cell level. If
data is changed in a block, the whole block is marked as dirty and has to be
recalculated. DB2 OLAP Server marks a data block clean when the block is
calculated by a CALC ALL or CALC DIM command. Any other calculation will
not mark the block as clean unless the SET CLEARUPDATESTATUS
command is used. When using this command to mark calculated blocks as
clean, be very careful because a two-pass calculation or a concurrent
calculation may find this block marked clean and not recalculate it.
If Intelligent Calculation is active, it also allows you to continue with
interrupted calculations, that is, calculated blocks are marked as clean and
do not have to be recalculated when an interrupted calculation is restarted.
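The clean and dirty bookkeeping behind Intelligent Calculation can be pictured with the following Python sketch. The Block class, its total attribute, and the calc_all function are hypothetical names invented for this illustration; the sketch shows only the block-level flag, not how the server stores or checks it.

```python
class Block:
    """A data block with a clean/dirty flag (illustrative only)."""

    def __init__(self, cells):
        self.cells = dict(cells)   # input cells of the block
        self.total = None          # stands in for all calculated cells
        self.clean = False         # a new block starts out dirty

    def update(self, key, value):
        self.cells[key] = value
        self.clean = False         # one changed cell dirties the whole block


def calc_all(blocks, intelligent=True):
    """Recalculate blocks and return how many were actually calculated."""
    calculated = 0
    for block in blocks:
        if intelligent and block.clean:
            continue               # unchanged since the last calculation
        block.total = sum(block.cells.values())
        block.clean = True
        calculated += 1
    return calculated


blocks = [Block({"Jan": 1, "Feb": 2}), Block({"Jan": 5})]
print(calc_all(blocks))        # first run calculates every block
blocks[1].update("Feb", 7)
print(calc_all(blocks))        # only the modified block is recalculated
```

The same flag explains the restart behavior: blocks that were already calculated before an interruption are clean and are skipped when the calculation is run again.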
For detailed information and examples, see the Essbase Database
Administrator's Guide, SC26-9238.
6.8 Dynamic Calculations
In some situations it may be more efficient for overall performance to
calculate some member combinations on the fly when the data is retrieved,
instead of precalculating them when the cube is updated or maintained.
DB2 OLAP Server lets you specify which members should be calculated
dynamically.
There are two types of dynamic calculation:
• Dynamic calculation
• Dynamic calculation and store
To use dynamic calculation, the members to be calculated dynamically have
to be tagged in the outline. These members do not have to be precalculated.
Instead the data is calculated when the data values are requested by a
Spreadsheet Add-in, a Report Script, or any application using the Essbase
API. The computed values are discarded after use.
If the member is tagged as Dynamic Calc And Store, the data value for this
member is also calculated upon request in the same way as for Dynamic Calc
members. When Dynamic Calc And Store members are retrieved for the first
time, the results are written back into the database, and subsequent retrievals
of the same data do not require recalculation.
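The difference between the two tags can be modeled as computing on every retrieval versus computing once and keeping the result. The Member class below is a hypothetical Python sketch written for this illustration; it deliberately ignores what happens to a stored value when the underlying data changes.

```python
class Member:
    """A dynamically calculated member (illustrative only)."""

    def __init__(self, formula, store=False):
        self.formula = formula     # callable that computes the value
        self.store = store         # True models Dynamic Calc And Store
        self.stored = None         # value kept after the first retrieval
        self.evaluations = 0       # how often the formula actually ran

    def retrieve(self):
        if self.store and self.stored is not None:
            return self.stored     # served from the stored result
        self.evaluations += 1
        value = self.formula()
        if self.store:
            self.stored = value    # written back on first retrieval
        return value               # plain Dynamic Calc: value is discarded


data = {"Jan": 10, "Feb": 20, "Mar": 30}   # hypothetical input values

qtr1_dynamic = Member(lambda: sum(data.values()))
qtr1_stored = Member(lambda: sum(data.values()), store=True)

for _ in range(3):
    qtr1_dynamic.retrieve()
    qtr1_stored.retrieve()

print(qtr1_dynamic.evaluations)   # formula ran on every retrieval
print(qtr1_stored.evaluations)    # formula ran only once
```

Both members return the same value; they differ only in how much work each retrieval costs and in whether the result occupies space afterwards.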
Normally any member in an outline can be tagged as Dynamic Calc or
Dynamic Calc and Store, except the following:
• Level 0 members that do not have a formula
• Label Only members
• Shared members
If a member is defined with Dynamic Calc or Dynamic Calc and Store, this is
indicated in the Application Manager Outline Editor (see Figure 99).
Figure 99. Outline with Dynamic Calculation Defined
6.8.1 Dynamic Calculation Considerations
Dynamic calculation reduces disk space usage and the processing time for
the CALC ALL command, database restructuring, and backup. However, the
data retrieval time may increase.
Dynamic calculation can be used for infrequently retrieved data or in
situations where the data changes frequently (for example, if the data is used
in budgeting applications). This option can also be used if the recalculation
time for the complete database (all dimensions and members) exceeds a
given time window.
Consider the following members as candidates for dynamic calculation:
• Dense dimension members
• Sparse members with complex formulas
• Upper-level sparse members that need frequent restructuring
• Upper-level sparse members with six or fewer children
• Two-pass members
Do not use a member tagged with Dynamic Calc or Dynamic Calc and Store
in a calculation script. For instance the following calculation script is valid
only if Qtr1 is not tagged as Dynamic Calc or Dynamic Calc and Store:
FIX (WEST,COLA)
Qtr1;
ENDFIX
Do not make a dynamically calculated member the target of a formula
calculation. Because memory is not allocated to a dynamically calculated
value, the result of the following formula is not valid if Qtr1 is tagged as
Dynamic Calc or Dynamic Calc and Store:
Qtr1 = "Jan 97" + "Feb 97" + "Mar 97";
If the same formula is included in a formula for Qtr1 in the outline, it is valid
(see Figure 100).
Figure 100. Outline Dynamic Calculation
Do not calculate a regular member from Dynamic Calc or Dynamic Calc and
Store children. DB2 OLAP Server has to calculate the child members during
the regular calculation process, in order to get the parent result, so there is no
reduction in the regular calculation time. If, for example, in Figure 100 the
Year has to be calculated and stored during regular calculation, all related
children (including Qtr1) would have to be calculated first.
To understand and estimate the impact on data retrieval time of using
Dynamic Calc or Dynamic Calc and Store, DB2 OLAP Server provides a
retrieval factor for every outline when it is saved. This retrieval factor is
the number of blocks that have to be retrieved in order to calculate the most
expensive dynamically calculated data block. For a database with only
Dynamic Calc or Dynamic Calc and Store members on dense dimensions,
this factor is 1.
The higher the retrieval factor of the outline, the higher the impact on the data
retrieval time. However, the retrieval time also depends on how many
dynamically calculated fields are retrieved. In some cases using dynamically
calculated members may even reduce the retrieval time, because the
database and the index are smaller than they would be without dynamic
calculation. So the retrieval factor can only be an indicator, not an absolute
measure.
When saving an outline with Dynamic Calc or Dynamic Calc and Store
members, DB2 OLAP Server calculates an estimated retrieval factor for the
most expensive dynamically calculated data block. This value is stored in the
application’s Event Log file. DB2 OLAP Server also keeps track of the
number of tagged Dynamic Calc or Dynamic Calc and Store members in the
outline within this Log File. This information can be accessed in the following
way:
In the Application Manager window choose Application=>View Event Log. In
the View Log File dialog box, choose Display All or Date. This shows the
log file of the selected database.
6.8.2 Dynamic Calculation or Dynamic Calculation and Store
In most cases, using Dynamic Calculation instead of Dynamic Calc and Store
reduces the calculation time and disk space usage. However, there are
occasions where Dynamic Calc and Store is the right choice.
For sparse dimension members, in most cases, it is better to use only
Dynamic Calculation. When DB2 OLAP Server calculates a value for a
member combination, for example, Cola/Denver including a Dynamic
Calculation member, only the requested values within this block are
calculated. If the same operation is performed on a Dynamic Calc and Store
member, the whole data block has to be recalculated. Only retrieved data
blocks are stored back into the database. If a calculation needs an
intermediate block for the result of the calculation, those blocks are not stored
because the block was not allocated.
Use Dynamic Calc and Store for:
• Upper-level sparse dimension members with children on a remote
(partition) database
• Sparse dimension members with a complex formula causing expensive
calculations
• Upper-level sparse dimension members that are used very frequently
6.8.3 Effects of Dynamic Calculation
When DB2 OLAP Server calculates data with Dynamic Calculation members,
the calculation order is different from the regular calculation order (see Table
15). This may have implications for the calculation results of the database.
For detailed examples of how this affects the calculation, see the Essbase
Database Administrator's Guide, SC26-9238.
Table 15. Calculation Order
Step  Regular Calculation               Dynamic Calculation
1     Dimensions tagged as Accounts     Sparse dimension tagged as time series
                                        data, if the outline has one
2     Dimensions tagged as Time         Other sparse dimensions in the sequence
                                        in which they appear in the outline
3     Other dense dimensions in the     Dense time series calculations
      sequence in which they appear
      in the outline
4     Other sparse dimensions in the    Dense dimensions tagged as Accounts
      sequence in which they appear
      in the outline
5     Two-pass calculations             Dense dimensions tagged as Time
6                                       Remaining dense dimensions in the
                                        sequence in which they appear in the
                                        outline
7                                       Two-pass calculations
Data cannot be copied into a Dynamic Calc data value.
No Dynamic Calc or Dynamic Calc and Store member can be the target for a
currency conversion (CCONV) command.
During data load DB2 OLAP Server does not store any data in member
combinations containing Dynamic Calc or Dynamic Calc and Store members.
The values are skipped without any error message.
Dynamic Calc and Store members are exported only when they have been
calculated by a previous data retrieval.
If there is an attempt to calculate a Dynamic Calc or Dynamic Calc and Store
member in a calculation script, DB2 OLAP Server returns an error message.
If a Dynamic Calc member for a dense dimension is added to a database,
space for this member is not reserved in the data block. Consequently, there
is no need to restructure the database. Saving only the outline is much faster
than restructuring the database.
DB2 OLAP Server does not restructure the database after the following
operations:
• Add, delete, or move dense dimension Dynamic Calc members
• Change a dense dimension Dynamic Calc and Store member to a regular
member
• Change a sparse dimension Dynamic Calc or Dynamic Calc and Store
member to a regular member
If a Dynamic Calc or Dynamic Calc and Store member for a sparse dimension
is added to a database, there is no change to the data block. There are only
updates of the database index. Restructuring the index is much faster than
restructuring the database.
DB2 OLAP Server restructures the database index only after the following
operations:
• Add, delete, or move sparse dimension Dynamic Calc or Dynamic Calc
and Store members
• Change a regular dense dimension member to a Dynamic Calc and Store
member
All other changes done on dimensions involving Dynamic Calc or Dynamic
Calc and Store members result in a complete restructuring of the database,
which can be an extensive and time-consuming task.
6.9 Creating a Calculation Script
For most of the database calculations the regular CALC ALL command will be
sufficient. However, there may be special cases where you need to control
how and in which sequence the data is calculated. With calculation scripts
you can define exactly how the database is calculated and customize
calculations to override the default calculations defined in the outline.
Calculation scripts can be useful to:
• Change the calculation order of dense and sparse dimensions
• Perform complex calculations that might need multiple iterations
• Perform two-pass calculations on non-Accounts dimensions
• Calculate only a subset of the model, using the FIX...ENDFIX command
(see 6.3, “Calculation Commands”)
• Use logic in the calculations (for example, IF ... ELSE ... ENDIF or LOOP
... ENDLOOP)
• Clear or copy data from a specific member
• Use and define temporary variables
6.9.1 Calculation Script Syntax
A calculation script must follow several syntax rules. If you use the
Calc Script Editor to build a calculation script, use the syntax checker of the
Calc Script Editor to validate the syntax.
The following rules apply for calculation scripts:
• Each formula or calculation script command must end with a semicolon (;);
for example:
CALC DIM(Year,Measures);
• If a member name contains spaces or is the same as an operator name,
the member name must be enclosed within double quotes; for example:
"Jan 97"
• Each IF statement must be ended by an ENDIF statement. These
statements do not need a semicolon at the end.
• If you use IF .... ELSEIF statements nested with other IF ... ENDIF
statements, an ENDIF statement is needed for each ELSEIF statement as
well as for each IF statement.
• IF statements or independent formulas have to be enclosed in
parentheses to associate them with a specific member. For instance, in
the following example the formula is associated with the Commission
member:
Commission (IF (Sales > 100) Commission = 50; ENDIF)
• Each FIX statement must be ended by an ENDFIX:
FIX (Budget,@Descendants(East))
CALC DIM (Year,Measures,Product);
ENDFIX
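The pairing rules for IF ... ENDIF and FIX ... ENDFIX lend themselves to a
mechanical check. The Python sketch below is a hypothetical helper, not the
syntax checker of the Calc Script Editor: it only verifies that these two
kinds of blocks are opened and closed in matching order, and it does not
parse member names, semicolons, or ELSEIF handling.

```python
import re

# Block keywords checked by this sketch (an assumption, not a full grammar).
PAIRS = {"IF": "ENDIF", "FIX": "ENDFIX"}


def unbalanced_blocks(script):
    """Return a list of pairing problems for IF/ENDIF and FIX/ENDFIX."""
    problems = []
    stack = []
    # Word boundaries keep ELSEIF, ENDIF, and ENDFIX from matching IF or FIX.
    for token in re.findall(r"\b(IF|ENDIF|FIX|ENDFIX)\b", script.upper()):
        if token in PAIRS:
            stack.append(token)
        elif not stack:
            problems.append(f"{token} without opening statement")
        elif PAIRS[stack[-1]] != token:
            problems.append(f"{token} closes {stack[-1]}")
            stack.pop()
        else:
            stack.pop()
    # Anything left on the stack was opened but never closed.
    problems.extend(f"{opener} without {PAIRS[opener]}" for opener in stack)
    return problems
```

An empty result means only that the blocks are balanced. The script can
still contain other syntax errors, so the editor's own checker remains the
authority.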
6.9.2 Using the Calc Script Editor
To create a calculation script we can use the Calc Script Editor, which is
launched from the Application Manager:
1. Open the Application Manager window and connect to the OLAP Server
application, in our example, TBC (see Figure 101).
Figure 101. Starting the Calc Script Editor
2. Select the appropriate application and click the Calc Scripts button to
display any existing calculation scripts. As this is our first calculation
script, this box is empty.
3. Click New to get to the Calc Script Editor window, where you create the
calculation scripts. The operation of this editor is equivalent to that of the
Formula Editor (see “Building an Outline Formula” on page 144).
6.9.3 Grouping Formulas
When a calculation script is run, DB2 OLAP Server keeps track of the
calculation order of the dimensions for each pass through the database. This
information is written into the Event Log file and can be retrieved by choosing
Applications => View Event Log in the Application Manager window.
To avoid unnecessary cycles through the database, be careful with
parentheses. For instance, use (Qtr1;Qtr2;Qtr3) instead of ((Qtr1;Qtr2;)Qtr3)
because the latter construct cycles twice through the database. Also use:
CALC DIM (Year, Measures);
instead of:
CALC DIM (Year);
CALC DIM (Measures);
6.9.4 Substitution Variables in Calculation Scripts
Substitution variables can be created on the server, application, and
database level. To define a substitution variable from the Application
Manager window, select Server => Substitution Variables. In the
Substitution Variables window (see Figure 102), you can define the variables.
Figure 102. Substitution Variable Window
When you use substitution variables in calculation scripts, the & character
has to preface the variable name. DB2 OLAP Server checks for a leading &
and substitutes the content of the variable before it is passed to the
calculation script. If, for instance, the &CurMon variable has a value of May,
the FIX(&CurMon) statement is parsed and executed as FIX(May). This
technique helps to build flexible and generic calculation scripts.
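As an illustration of the substitution step, the following Python sketch
replaces each &name in a script text with its value. The substitute function
and the dictionary of variables are assumptions made for this example; how
DB2 OLAP Server resolves variables defined at the server, application, and
database levels is not modeled.

```python
import re


def substitute(script, variables):
    """Replace every &Name in the script text with its value."""
    def lookup(match):
        name = match.group(1)
        if name not in variables:
            raise KeyError(f"substitution variable not defined: {name}")
        return variables[name]

    # A variable reference is an ampersand followed by a name.
    return re.sub(r"&(\w+)", lookup, script)


# Usage: the CurMon example from the text.
expanded = substitute("FIX(&CurMon)", {"CurMon": "May"})
```

Changing the value of CurMon to June would make the same script text expand
to FIX(June), which is what makes such scripts reusable from month to month.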
Chapter 7. Partitioning Multidimensional Databases
In this chapter we describe a useful technique for enhancing the scalability
and manageability of multidimensional databases. Both DB2 OLAP Server
and Hyperion Essbase can divide a single logical multidimensional model into
physically separate parts, which can reside on different machines if
necessary.
This technique is called partitioning. With partitioning, the size of the
individual models, in terms of the number of dimensions, aggregation
hierarchies, and the number of facts, can be kept within reason, without
limiting the analysis capabilities for the users. In addition, the processing
power of SMP or MPP machines can be applied to the parts of the database
in parallel, reducing the overall time needed for loading and calculating the
database.
In a number of instances, we may want to partition the cube.
If we have two separate departments, we may want to partition the cube
horizontally, using replicated partition definitions, so that each department
uses its own cube, in isolation from other departments’ cubes. This is
especially useful for international and geographically distributed
environments.
If we have a very large cube where the load and calculate time is lengthy,
partitioning the cube, using transparent partition definitions, allows for
multiple processors to calculate each partition in parallel.
If we have separate independent cubes that represent different parts of the
business, we may partition the cubes, using linked partition definitions, to
enable analysis across the different business subject areas or processes.
There are three distinct methods of partitioning cubes in DB2 OLAP Server:
1. Replicated partitions, where data in the data source cube is replicated to
the data target cube
2. Transparent partitions, which allow you to transparently access the remote
partitions of the cube as though it were part of the local cube
3. Linked partitions, which allow you to drill-across from one independent
cube to another independent cube through a data cell, while keeping the
established context using the matching dimension definitions.
Each of these partition methods relies on area mapping, that is, mapping a
cube area in the source to a cube area in the target. In the
case of replicated and transparent partitions, these shared areas must be
mappable. The nonshared areas do not need to be mappable. Replicated and
transparent partitions can also synchronize the outline, to ensure that
mapped areas in each outline stay consistent.
Note
It is important to distinguish between a cube and a partition. A cube, in our
notation, is the full multidimensional model, whereas a partition is one or
more areas of the cube as defined in a specific partition definition.
With linked partitions, there are no restrictions on the type of partition that can
be linked as the partitions are independent. Note that for any two cubes,
there is only ever one linked partition definition in one direction, and one
linked partition definition in the other direction.
You can see in Figure 103 that the options for creating partitions enable very
flexible and complex models. For example, the regional models of the fashion
departments contain detail level data as well as summary data. These models
could be made available to a corporate model, representing the overall
fashion sales. In this example, only the summary level data is made available
to the corporate fashion sales model, using transparent partitions (see Figure
103, part 1). To get an overall view of the fashion line of business, the
corporate fashion model could contain not only the transparent summaries of
the retail data but also the locally available fashion wholesale data (see
Figure 103, part 2).
Note that it is not possible to derive, for example, a fashion retail summaries
model from the corporate fashion model by replicating the transparent retail
fashion summaries, because replication of transparent partitions is not
allowed (see Figure 103, part 3).
However, the locally available fashion wholesale part of the corporate fashion
model could be replicated to another location as shown in Figure 103, part 4.
The new replica of the fashion wholesale data could then be the source for a
transparent partition (for example, to provide the actuals for the fashion
wholesale planning model, as shown in Figure 103, part 5), or it could be
replicated again, to provide a corporate model for wholesale analysis across
all lines of business (see Figure 103, part 6).
As you can see, there are many ways of partitioning DB2 OLAP Server
cubes. The decision criteria to identify the best partitioning alternative are:
business requirements, design, operations, and performance.
[Figure 103 diagram not reproduced. Labels recoverable from it: Fashion
Retail models (Detail and Summary) for East, Central, and West; Corporate
Fashion (Fashion Retail Summaries, Fashion Wholesale); Fashion Wholesale
Planning (Actual, Forecast); Fragrances Wholesale; Home Wholesale;
Corporate All Lines of Business; transparent and replicated partition
links numbered 1 to 6.]
Figure 103. Possible Configurations of Transparent and Replicated Partitions
Here is the basic process of creating a partition definition:
1. Define the source and target cubes.
2. Define the user IDs and passwords to be used for the source and target
cubes.
3. Define the source subcube area to be mapped to the target subcube area.
4. Define any source member name to target member name mapping within
the specified source to target subcube area, where the member meaning
is the same, but the member name in the source and target are different.
5. Validate the partition definition.
6. Save the partition definition.
Setting up partition definitions can be quite a lengthy and complicated
process, so do not assume that you can achieve a fully working partition
definition in a few minutes.
7.1 Replicated Partitions
Replicated partitioning starts by identifying an area to be replicated. An
area is a cube or an identifiable subset of a cube.
In our example, the sales data is derived from a central host application and
database and provided for analysis purposes in the central sales model at the
corporate headquarters. To allow each sales region to perform analysis
locally, the corresponding subset of the corporate sales model is replicated to
each region (for example, East).
We create a replicated partition going from the data source cube (called
Simprepl) to the data target cube (called Simpeast), which will contain just
the East Market data.
7.1.1 Rules
Replicated partitions must follow these rules:
• The shared replicated areas of the data source and data target outline
must be mappable.
• The nonshared replicated areas of the data source and data target outline
do not have to be mappable.
• It is not possible to create a replicated partition on top of a transparent
partition.
• Each cell in a data target can come from only one data source. Create one
replicated partition for each data source if multiple data sources are
required.
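The last rule implies that areas fed by different data sources must not
share any cell of the data target. The Python sketch below checks this for
two areas. It assumes, for illustration only, that an area is the cross
product of the members selected in each dimension and that a dimension not
mentioned in an area contributes all of its members; the outline contents
are hypothetical.

```python
def areas_overlap(area_a, area_b, outline):
    """Return True if the two areas have at least one cell in common."""
    for dimension, all_members in outline.items():
        members_a = area_a.get(dimension, all_members)
        members_b = area_b.get(dimension, all_members)
        # Disjoint member sets in any one dimension rule out a shared cell.
        if not set(members_a) & set(members_b):
            return False
    return True


# Hypothetical target outline and two areas fed by different data sources.
outline = {"Market": {"East", "West"}, "Product": {"100", "200"}}
from_source_1 = {"Market": {"East"}}
from_source_2 = {"Market": {"West"}}
conflict = areas_overlap(from_source_1, from_source_2, outline)
```

Here the two areas are separated along the Market dimension, so no target
cell would receive data from both sources.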
7.1.2 Implementing a Replicated Partition for the TBC Sales Model
We are already connected to the main server. Now connect to the other DB2
OLAP Server. (A new application on the same server could be used instead.)
Both servers must be available when you create the partition definitions.
Create a new application, TBC_East, on the second server (in our case, the
second server is called Valencic) and create a new database and outline
called Simpeast as seen in Figure 104. Then open the Simprepl outline on
the other server (in our example, Thallium).
Now we have a new cube that is a subset of our original cube, with the
replicated cube having Eastern Region, rather than East as the Market name,
and with the source cube of Simprepl having an extra product group of
Mineral Water, which is not replicated to the Simpeast cube.
Figure 104. Data Source Outline and Data Target Outline
Now that we have the outline in place, we have to tell DB2 OLAP Server how
the mapping is to take place. Use the Partition Manager. Select Server
Thallium => Database Simprepl => Database => Partition Manager from
the Essbase Application Manager window (see Figure 105).
Figure 105. Opening the Partition Manager from the Data Source Server
We have to define the source and target servers, applications, and
databases. Ensure that the Current Database field points to the data source
server, application, and database required, that is, Thallium, TBC, Simprepl,
respectively. Then select Partition => New (as shown in Figure 106).
Figure 106. Creating a New Partition Definition
Select the correct Data Target Server, Application, and Database, in this
case, Valencic, TBC_East, and Simpeast. Select Type Replicated for a
replicated type of partition.
Select Settings for the replication properties. Replication properties define
whether you are allowed to update the data target cube. If you update the
target cube in areas that are replicated from the source cube, those areas will
be overwritten at the next replication. Check the box The target partition
can be updated as seen in Figure 107.
Figure 107. Defining the Partition Type
Click OK. Click Next to move to the Admin tab. Enter the corresponding
Usernames and Passwords for the data source and data target (see Figure
108).
Figure 108. Defining the Usernames and Passwords to Be Used in Replication
Click Next to move to the Areas tab. Here we enter the area mapping from
the data source to the data target; that is, what subcube of the source cube
should be replicated to the target cube (see Figure 109).
In our example, the source and target cubes are different, so we have to map
the shared areas. We have to review each of the dimensions in turn:
• The Measures dimension is exactly the same in each cube, so we do not
have to explicitly map it.
• The Product dimension differs between the two cubes, so we have to
explicitly map it. We have to map the shared area of the Product member
and each of the members that map across to the target cube.
• The Market dimension differs between the two cubes, so we have to
explicitly map it. We have to map the shared area of the Market member,
which maps the East hierarchy to the Eastern Region hierarchy.
• The Year dimension is exactly the same in each cube, so we do not have
to explicitly map it.
We have mapped the area in the source to the area in the target model. Thus,
we have the same cell count in both areas.
Check the Enable the Member Selection Tool option (Figure 109) to get to
the Member Selection window (shown in Figure 110). Point and click on the
subcube you are interested in, rather than manually typing in the subcube
names. Check the Show Cell Count option to view the cell count in both the
source and target cubes once you have selected a subcube area.
Figure 109. Mapping the Data Source Area to Data Target Area
Select the first cell under Source and click Edit....
Use the drop-down for Dimension to select Market (see Figure 110). The
members in the Market dimension are displayed. Select the East member
and click Add-> to add it to the Rules area. Note that this is all that is required
here. All descendants and members are implicitly defined.
Figure 110. Defining the Replicated Area for the Source
On the same window, use the drop-down list box for Dimension to select and
add Product to the Rules.
With Product you have to identify the area of the Product dimension that you
would like to replicate to the target cube. You want to replicate all product
groups and their descendants other than product group 500 (Mineral Water).
There are a number of ways of doing this. In the case shown in Figure 111,
we explicitly selected each product group to be replicated, right-clicked on
each product group, selected All Descendants and Member, and selected
the Product member itself.
Another method would be to select Product => All Descendants and
Member => Subset and then identify members of the Product dimension for
inclusion or exclusion in replication, based on a set of constraints, such as
specific generations, levels, user-defined attributes (UDAs), or a pattern.
Figure 111. Defining the Replicated Area for the Source (continued)
Click OK. On the Areas tab of the Partition Wizard window, increase the size
of the cell by moving the horizontal line above the cursor as shown in Figure
112. As you can see, the area of the cube we are mapping is Measures, Year,
the East area of Market, and all of the Product dimension except for the
Mineral Water hierarchy.
Figure 112. Defining the Replicated Area for the Source (continued)
One usability point to note: Instead of using the Member Selection Tool, you
can use Windows NT copy and paste facilities to copy the source subcube to
the target subcube. When you do this, the Enable the Member Selection
Tool checkbox should be unchecked.
Now we have to define the area mapping for the target cube. Select the cell
under Target (see Figure 112) and click Edit.... Add Eastern Region to the
Rules (Figure 113). Note that this implicitly includes all descendants and the
member itself.
Figure 113. Defining the Replicated Area for the Target
Add Product by clicking Product => All Descendants and Member => OK.
(Figure 114). You can do this here as the Product hierarchy in the target is a
subset of the Product hierarchy in the source cube.
Figure 114. Defining the Replicated Area for the Target (continued)
For replicated partitions, the source and target areas that we have mapped
must contain the same number of cells. This information is provided in
Figure 115, as we have checked the Show Cell Count checkbox. If the mapped
source and target areas differed, the cell counts would not be the same.
Note that the DB2 OLAP Server Partition Manager picks up any inconsistency
in cell count when we validate the mapping later.
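The cell count comparison can be illustrated with a short Python sketch. It
assumes that the cell count of an area is the product of the number of
members selected in each dimension. This is an assumption about how the
Show Cell Count figures arise, not a documented formula, and the member
lists are hypothetical stand-ins for the Simprepl and Simpeast outlines.

```python
from math import prod


def cell_count(area):
    """Number of cells in an area given as {dimension: selected members}."""
    return prod(len(members) for members in area.values())


# Hypothetical, heavily shortened member lists for the mapped areas.
source_area = {
    "Market": ["East", "New York", "Boston"],
    "Product": ["Product", "100", "200"],  # product group 500 is left out
    "Year": ["Year", "Qtr1", "Qtr2"],
}
target_area = {
    "Market": ["Eastern Region", "New York", "Boston"],
    "Product": ["Product", "100", "200"],
    "Year": ["Year", "Qtr1", "Qtr2"],
}
counts_match = cell_count(source_area) == cell_count(target_area)
```

If product group 500 were added to the source area only, the source count
would grow while the target count stayed the same, which is the kind of
mismatch that validation reports.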
Figure 115. Defining the Replicated Area for Target (continued)
Click Next to move to the Mappings tab. This tab defines member mappings
for members whose names differ between the source and the target within the
areas you mapped in the area mapping step.
In our replication scenario, the East hierarchy in the source model and the
Eastern Region hierarchy in the target model represent the same hierarchy.
However, DB2 OLAP Server has no way of knowing this unless we identify it
explicitly. Thus, we need to map the East member from the source
to the Eastern Region member in the target cube. Select the cell under
Source => Edit => "East", followed by selecting the cell under Target =>
Edit => "Eastern Region" as shown in Figure 116.
Figure 116. Mapping the Source Member Name to the Target Member Name
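Conceptually, a member mapping is a rename applied to cell coordinates as
they pass from source to target. The following Python sketch shows the idea;
map_cell and the sample coordinates are hypothetical, and only the East to
Eastern Region pair comes from our scenario.

```python
def map_cell(cell, member_map):
    """Translate a source cell coordinate into target member names."""
    # Members without an explicit mapping keep their source name.
    return tuple(member_map.get(member, member) for member in cell)


member_map = {"East": "Eastern Region"}
target_cell = map_cell(("East", "Sales", "Jan"), member_map)
```

Members that carry the same name in both outlines pass through unchanged,
so only the names that differ need an entry.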
The Advanced... button is for very specific area and member mapping
circumstances. It is used in cases where there is a different number of
dimensions on the source and target cubes and is discussed in 7.2,
“Transparent Partitions” on page 181.
Click Next to move to the Validate tab (see Figure 117). Select Validate to
validate the partition definition. Some of the validations DB2 OLAP Server
performs include confirming that the cell counts for the areas are the same for
replicated and transparent partitions, the source and target member names
are valid, and the user IDs and passwords are correct.
Error messages are presented in the Validation Results area. When you
double-click on a specific error message, the Partition Manager takes you to
the relevant screen to correct the error.
When the partition definition is validated, you see Validated in the left part of
the window as shown in Figure 117.
Figure 117. Validating the Partition Definition
Click Next to move to the Summary tab (Figure 118). This tab provides
information such as the data source and data target, the type of partition
definition, and how many areas have been defined.
Figure 118. Summary of the Partition Definition
Click Close to close the partition definition. You now have two options of
where to save the partition definition: at each of the data source and data
target servers, or at a client. When you Save to Servers, DB2 OLAP Server
actually saves the partition definition at each server. Select Save to Servers
and click OK as shown in Figure 119.
Figure 119. Saving the Partition Definition to the Servers
We have now completed the replicated partition definition. As you can see
from Figure 120, the current database of Simprepl has only one associated
partition definition, which is a replicated partition to the Simpeast database.
Figure 120. Partition Manager Showing Existing Definitions for Database Simprepl
Now that we have created the partition definition, we actually want to
replicate the partition.
7.1.3 Replicating the Partition
Highlight the Simprepl database in the Application Manager window (Figure
121). This is our source data cube. Select Database => Replicate Data =>
To Targets....
Figure 121. Replicating the Partition
We now have the option of updating changed cells only or updating all cells
(see Figure 122). Because this is the first time we are replicating, all cells are
updated anyway. Select Update all cells and click Replicate.
In general, if we were replicating on a weekly basis, for example, we would
choose Update changed cells only, because replicating just the changed
cells is much quicker. DB2 OLAP Server tracks changes to each block and
so replicates only the changed blocks.
The synchronization status tells you whether the outlines are synchronized
with each other.
Source change time and target change time are also provided. A change time
for the source cube is to be expected; the target cube has a change time
because we allowed the target partition to be updated (see Figure 107).
Thus, the information provided allows you to see whether either cube has
been updated since the last replication occurred.
Figure 122. Selecting the Update Option for Replication
We have now replicated the cube, which we can confirm by viewing the data
through, for example, a spreadsheet (such as Excel). If we now update the
target cube, Simpeast, and want to perform the replication again (Figure
123), the source and target are no longer synchronized, and the source and
target change times are different.
Select Update changed cells only and click Replicate to replicate from the
source to the target, for the data to be in synchronization. This will overwrite
the changes we have just made in Simpeast.
Figure 123. Replication after Target Database Has Been Updated
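The two update options can be pictured with a small Python model. It is a
sketch under stated assumptions, not a description of the server's
internals: blocks are dictionary entries, and a block counts as changed if
it was updated on either side since the last replication, which is
consistent with target updates being overwritten as described above.

```python
def replicate(source, target, source_changed, target_changed,
              changed_only=True):
    """Copy blocks from source to target; return how many were copied."""
    if changed_only:
        # Only blocks touched on either side since the last replication.
        touched = source_changed | target_changed
        keys = {key for key in touched if key in source}
    else:
        keys = set(source)
    for key in keys:
        target[key] = source[key]
    # After replication both sides are considered synchronized again.
    source_changed.clear()
    target_changed.clear()
    return len(keys)
```

In this model the first replication copies every block, and a later
replication after a single target update copies only that one block back
from the source.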
We have completed setting up a replicated partition and replicating the data
from our data source of Simprepl to our data target of Simpeast for the East
Market area.
7.2 Transparent Partitions
Transparent partitions allow you to access data from remote data sources as
though they were part of the data target. The data is, however, stored at the
data source, which can be in another application, in another DB2 OLAP
database, or on another DB2 OLAP Server.
7.2.1 Rules
Transparent partitions must follow these rules:
• The shared areas of the data source and data target do not have to be
identical. However, mappings have to be defined to tell DB2 OLAP Server
how each dimension and member in the data source maps to each
dimension and member in the data target.
• The nonshared areas in the data source and target need not be mapped.
• Transparent partitions can be created on a replicated partition.
• A transparent partition cannot be created from multiple partitions or data
sources.
7.2.2 Advantages and Disadvantages
The major advantages of transparent partitions are:
• Less disk space is needed compared to replicated partitions, because the
data is stored in one database.
• Data accessed from the data target is always the latest version, because
the data updated from the source or target is immediately written back to
the database.
• Because the individual databases are small, the calculations can be done
quickly.
• Data can be loaded either at the source or the target.
• The data load and calculation can be performed in parallel on the data
source and target.
The disadvantages of transparent partitions should be considered as well:
• Transparent partitions increase network activity. Therefore data retrieval
time is longer, because the data is transferred from the source to the
target across the network. The data retrieval from source and target
databases is not performed in parallel.
• When more users access either the source or the target, the users at that
end have slower retrieval times.
• If the data source fails, users at both the source and target side are
affected.
• Administrative operations such as CLEAR DATA, EXPORT, VALIDATE,
BEGINARCHIVE, and ENDARCHIVE and restructure operations in
Application Manager cannot be performed on the target side of a
transparent partition.
Note
Implement transparent partitions on sparse dimensions only. This enables
DB2 OLAP Server to fetch entire (dense) data blocks as requested, instead
of having to fetch rows from blocks in different partitions on the fly.
7.2.3 Implementing Transparent Partitions for the TBC Sales Model
For our next example, consider that the western sales region of TBC wants to
compare its performance with the sales performance of the eastern region.
Thus, the eastern sales data should also be made available to the analysis
model of the western region.
We create two models, TBC_East and TBC_West, from the TBC sales model
and define a transparent partition so that the values on the East
multidimensional cube are accessible from the West cube. In other words,
TBC_East will be our data source for the transparent partition that we are
about to define, and TBC_West will be the data target. The Market dimension
will be the shared area between the source and target.
Create new applications called TBC_East and TBC_West. Next, create new
databases called TBC_East and TBC_West under the corresponding
applications. Open the outline for the expanded TBC sales model and save
the outline as TBC_East as shown in Figure 124.
Figure 124. Copying the Outline from TBC Expanded to TBC_East
Now, in the Market dimension, delete all regions except East as shown in
Figure 125.
Figure 125. Outline for the TBC_East Model
Let us assume that only actual values are available in the TBC_East cube;
hence the Scenario dimension is not needed.
Similarly, copy the expanded TBC sales model outline over to TBC_West and
delete all regions in the Market dimension except East and West.
Because TBC_West is going to be the data target for the transparent
partition, we have to make a few changes in the outline that will have a direct
impact on the load and calculation performance on the target side. In the
expanded TBC model, we defined Market as a dense dimension, and this
setting is retained when we copy the outline to TBC_East and TBC_West.
Because we do not want to define a transparent partition target along a
dense dimension, we will define Market as a sparse dimension.
Also, the upper-level blocks pertaining to the Market dimension should be
calculated only when required. So, we will make Market a Dynamic Calc
member so that it is calculated on the fly when its data is retrieved from the
data source. Note that when a calculation is performed on the target side,
DB2 OLAP Server uses the current values of the local data and transparent
dependents. Transparent dependents are not recalculated while a calculation
is performed on the target side.
Note
If Market is defined as a stored member on the data target, the calculated
values for Market stay in the data blocks, even when the transparent
partition is deleted, until the target cube is recalculated.
Figure 126 shows the changed outline for TBC_West.
Figure 126. Outline for the TBC_West Model
Now we load data into the cubes and calculate them. Create a new Load Rule
for TBC_East to pull in data for the East region, which is stored in a launch
table called IWH.LAUNCH_FOR_EAST. This is the target table for the
Launch_for_adv_east Business View, managed by Visual Warehouse. Define
the SQL as shown in Figure 127. Load data into TBC_East and perform a
default calculation.
Similarly, create a Load Rule for TBC_West to pull in data from the launch
table IWH.LAUNCH_FOR_WEST, populate the cube, and perform a default
calculation as well.
We now proceed to define a transparent partition between the data source
and the data target. Create a new partition by clicking File => New from the
Partition Manager window. The Partition Manager window can be reached by
selecting Database => Partition Manager... from the Application Manager
window.
Figure 127. Defining the Load Rules for Loading Data into TBC_East
On the Connect tab of the Partition Wizard, set the partition type to
Transparent. Set TBC_East as the data source and TBC_West as the target
as shown in Figure 128. Click the Connect... button and connect to the
databases.
On the Admin tab, enter the user IDs and passwords that will be used as the
default login by DB2 OLAP Server while connecting to the source and target.
Figure 128. Defining a New Transparent Partition
Now, we have to define the transparent area between the source and the
target. Because the data source does not have a Scenario dimension and
contains only actual values, all the source values in the transparent area
have to be mapped to a particular member in the additional dimension on the
data target, which is Actual, in our example.
On the Areas tab, type in "East" on the source side and "East" "Actual" on
the target side as shown in Figure 129.
Note
When you choose "East" and "Actual," using the member selection tool for
the target side, it does not work. This error message results: "Incorrect
number of dimensions in request from the remote site."
The cell counts for the source and target area definitions must be the same
for the partition definition to be validated successfully.
Figure 129. Defining the Area Mapping for the Transparent Partition
Now, when we go from the source to the target, the dimensionality is
incomplete, because we have an extra dimension, Scenario, in the target. On
the Mappings tab, click the Advanced... button to map the missing member,
Actual. For the area that we defined earlier, map Actual in the target to (void)
in the source as shown in Figure 130.
Figure 130. Area Specific Member Mapping
Note
The mapping of Actual in the target to (void) in the source can also be done
through the Mappings tab of Partition Wizard as shown in Figure 131.
Advanced area-specific member mapping can be used when you have
more than one area defined and need to control member mappings at the
area level. For example, say you have an All Years dimension in the data
source as well as the target, and the dimension has members 1997 and
1998. Assume that you want 1997 data on the source to be mapped to
1997 and Actual values on the target, and 1998 data on the source to be
mapped to 1998 and Budget on the target. You create two area definitions,
one mapping "East" "1997" to "East" "1997" "Actual," and one mapping
"East" "1998" to "East" "1998" "Budget." Mapping of Actual in the target to
(void) in the source is specific to the first area definition, and mapping of
Budget to (void) is specific to the second area definition. We can use
area-specific member mapping to do this.
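The two area definitions described in this note can be pictured as a small lookup structure. This is a sketch for illustration only; the dictionary layout and the function target_to_source are hypothetical and do not reflect how DB2 OLAP Server stores a partition definition.

```python
VOID = "(void)"

# One entry per area definition. Each area carries its own member mapping,
# so Actual is mapped to (void) only in the first area and Budget only in
# the second.
areas = [
    {"source": ("East", "1997"), "target": ("East", "1997", "Actual"),
     "member_map": {"Actual": VOID}},
    {"source": ("East", "1998"), "target": ("East", "1998", "Budget"),
     "member_map": {"Budget": VOID}},
]

def target_to_source(target_cell, areas):
    """Translate a target cell to its source cell using the area-level map."""
    for area in areas:
        if tuple(target_cell) == area["target"]:
            # Drop the members that this area maps to (void) in the source.
            return tuple(m for m in target_cell
                         if area["member_map"].get(m) != VOID)
    return None  # the cell is not inside any partitioned area

print(target_to_source(("East", "1998", "Budget"), areas))
```

A request for "East" "1997" "Budget" falls outside both areas, so the sketch returns None for it.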
Figure 131. Alternate Way of Area-Specific Member Mapping for Actual
Now that you have completed the partition definition, go to the Validate tab
and click the Validate button as shown in Figure 132.
Figure 132. Validating the Transparent Partition Definition
Click the Close button and save the partition definition to the servers as
shown in Figure 133. Note that the partition definition is stored on both
servers, which in our example are physically the same.
Figure 133. Saving the Transparent Partition Definition
Now, open the Spreadsheet (for example, Excel) with the Essbase
Spreadsheet Add-in. Click Essbase=>Connect... and type in the password
to connect to the data target, TBC_West. Double-click on cell A1 to retrieve
the values. Drill down on Market. The values from TBC_East are
transparently seen from the TBC_West model (Figure 134).
Figure 134. Testing the Transparent Partition
Note that we did not load data for Budget in TBC_West.
We can load East data on either the source side or the target side. When
East data is loaded on the target side, it will be transmitted across and then
stored in the data source.
We can create four different cubes for each of the regions, East, West,
Central, and South, and use them as data sources to create a target cube,
which will transparently access East, West, Central, and South data. The
target cube will have all the regions defined in the Market dimension.
Therefore the data can be loaded and calculated in the smaller cubes and
consolidated in the data target by doing a calculation. Note that on the target
side, each of the transparent areas has to come from a single source. For
example, the East area in the data target will come from a single data source.
The transparent partition target, TBC_West, can be used as the data source
for another transparent partition. For example, you can create a new
transparent partition, TBC_Central, which will then show values on cubes
TBC_Central, TBC_West, and TBC_East, although the data for each of the
regions is stored in physically separate locations. Thus, a large cube can be
partitioned to use multiple processors for load and calculation. The query
performance might be slower though, if you have a series of interconnected
transparent partitions. For example, when data is accessed from
TBC_Central, the data is accessed serially from TBC_East, TBC_West, and
then from TBC_Central.
The data source for a transparent partition can also be a replicated partition
(refer to Figure 103, part 5).
7.3 Linked Partitions
Defining two cubes as linked partitions in DB2 OLAP Server enables you to
link, or drill-across, from one cube to the other cube, even though they are
independent cubes.
7.3.1 Rules
Because all we are implementing with linked partitions is the ability to
drill-across from one cube to another cube, and each cube is totally
independent of the other, there are no constraints and thus no specific rules
for linked partitions.
7.3.2 Implementing Linked Partitions for the TBC Sales Model
Linked partitions are the preferred technique for modeling relationships
between different multidimensional models representing related business
subject areas or business processes.
Business analysts can navigate across these different models, using the
links, which define a starting point in one model and the associated entry
point in the other model. Ideally the context is carried over from one model to
the other, by means of standardized, common dimension definitions, shared
by the two models. This makes the drill-across operation more meaningful
and easier to understand for the users.
In our example, we want to enable analysis across the TBC sales model and
another model representing information about the inventory of TBC.
We will link the inventory cube to the TBC sales cube. Each link is a one-way
link, enabling you to go from the target cube to the source cube. However, if
we want to drill-across from either cube, we can define two links between any
two cubes, with the target cube being TBC inventory in one link partition
definition, and the target cube being TBC sales in the second link partition
definition.
DB2 OLAP Server’s identification of the source cube and the target cube is
not inherently the way we would expect it to work. Just remember that you
always start at the target cube to get to the source cube for linked partitions.
We first determine how we would link the cubes together by reviewing each of
the cube’s outlines at the dimension level. As you can see in Figure 135, the
dimensions that are common are Measures, Product, Market, and Year, with
the inventory cube having an additional dimension of Scenario. Thus, we
have a number of dimensions from which to choose.
Figure 135. Source and Target for the Linked Partitions
In our example, we would expect users of the inventory model to perform
product and inventory based analysis. Then, at some point in the analysis, the
user would want to review how the sales are going, based on Product, and so
drill across to the TBC sales model, based on the Product dimension.
Thus, we have chosen Product as our drill-across dimension.
We have decided that users should be able to drill-across at any level of the
Product dimension, and thus we have defined the mapping to be all members
and descendants of Product. We could have chosen to allow drill-across only
at a specific Product cell, such as 100-10, but this would be unnecessarily
restrictive. We also could have provided multiple areas for drill-across, such
as specific Product members for 100-10 and Diet Cola members.
The other key difference that we can see between the outlines is that the TBC
inventory cube has an extra dimension of Scenario. What this means for our
linked partitions is that, for the drill-across to work correctly, we have to map
the inventory Scenario dimension to (void), telling DB2 OLAP Server that the
drill-across mapping is between the common dimensions in each cube.
7.3.2.1 Defining a Link from TBC Inventory to TBC Sales
Select the Invdb cube, which is the target cube of this link. Open the Partition
Manager window by selecting Database => Partition Manager from the Application
Manager window. You should see Invdb as the current database.
Select Partition => New to create a new partition (see Figure 136).
Figure 136. Creating a New Partition Definition
Select Partition Type of Linked as shown in Figure 137. Enter the
corresponding information for the source cube and target cube (Thallium,
TBC, Simpperf and Thallium, TBC_INV, Invdb), then click Next to go to the Admin
tab.
Figure 137. Defining the Partition Type
Enter the user ID and password for the source and target cubes. Then select
Next to go to the Areas tab.
In the Areas tab (Figure 138), we have to identify the areas in the source and
target to be linked. As we have determined earlier, we want to be able to
drill-across based on the Product dimension, at all levels of the hierarchy.
However, as discussed above, we also need to map the Scenario dimension
of Invdb to (void) in Simpperf.
In Figure 138, under Source, select the cell (not the row). Check Enable the
Member Selection Tool to use the GUI, as discussed in 7.2, “Transparent
Partitions” on page 181, click Edit... and select Product => All Descendants
and Member. Alternatively, leave Enable the Member Selection Tool
unchecked, click Edit..., and enter: @IDESCENDANTS("Product").
Under Target, select the cell (not the row) and click Edit.... Enter:
@IDESCENDANTS("Product") @IDESCENDANTS("Scenario").
With this mapping, we have identified to DB2 OLAP Server to drill-across to
the same hierarchy level of Product in the Simpperf cube, whenever we
select any hierarchy level of Product, or any hierarchy level of Scenario.
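To make the "all descendants and member" selection concrete, the following minimal sketch expands a member and everything below it in a toy outline. The outline is illustrative only (a few Product members, not the full TBC hierarchy), and descendants_inclusive is a hypothetical helper, not a DB2 OLAP Server function.

```python
def descendants_inclusive(outline, member):
    """Return `member` followed by all of its descendants, depth-first."""
    result = [member]
    for child in outline.get(member, []):
        result.extend(descendants_inclusive(outline, child))
    return result

# Toy parent -> children outline; member names are illustrative.
product_outline = {
    "Product": ["100", "200"],
    "100": ["100-10", "100-20"],
    "200": ["200-10"],
}

print(descendants_inclusive(product_outline, "Product"))
```

Every member in the returned list is a valid starting point for the drill-across, which is why we did not have to list individual Product members in the area definition.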
Figure 138. Defining the Source and Target Area Mapping
Click Next to move to the Mappings tab.
The Mappings tab enables member mapping where the member name in the
target cube is different from the member name in the source cube. In this
case, we have to enter all Scenario members, as they do not exist in the
Simpperf cube.
Under Target Members, select the cell (not the row) and click Edit.... Enter
Actual and click OK (Figure 139).
We do not do anything in the Source cell, in order to let DB2 OLAP Server put
(void) in that cell.
Figure 139. Defining the Actual Member Mapping
Do the same for Budget, Variance, Variance %, and Scenario, to end up with
the member mapping shown in Figure 140. Note that Variance % is in quotes
because, whenever a member name has a blank in it, quotes are required so
that DB2 OLAP Server can identify it correctly.
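When member mappings are prepared outside the Partition Wizard, the quoting rule is easy to automate. The helper below is a hypothetical sketch of that rule only: it adds double quotes to any name that contains a blank and leaves other names unchanged.

```python
def quote_member(name):
    """Wrap a member name in double quotes if it contains a blank."""
    already_quoted = name.startswith('"') and name.endswith('"')
    if " " in name and not already_quoted:
        return f'"{name}"'
    return name

scenario_members = ["Actual", "Budget", "Variance", "Variance %", "Scenario"]
print([quote_member(m) for m in scenario_members])
```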
Figure 140. Final Member Mapping
You have now mapped all the Scenario dimension members in Invdb to (void)
in the Simpperf cube.
Click Close. From the Mappings tab, click Next and Validate to validate the
partition definition (Figure 141).
One word of caution. Even though a partition definition has been validated,
that does not mean that the drill-across will work smoothly.
Figure 141. Validating the Partition Definition
Click Next. Click Close and select Save to Servers (see Figure 142). Note
that with Save to Servers, the partition definition is written to each server.
Figure 142. Closing and Saving the Partition Definition
Next, we want to validate that the partition definition is working as required.
Open the spreadsheet (for example, Excel) and select Essbase =>
Connect... to connect to the server, application and database (Thallium,
TBC_Inv and Invdb respectively). Remember, we need to connect to target
cube Invdb, so we can drill-across to source cube Simpperf.
Drill down into the cube to select the cell East, Cola, Actual, Year, and
Measures. Select the numeric value, then select Essbase => Linked
Objects as shown in Figure 143.
Figure 143. Initiating the Link between Target and Source Cube in the Spreadsheet
As you can see in Figure 144, the cell we have initiated the link from is
displayed at the top of the window, as Year, Cola, East, Actual, and Measures,
which is in the order of the Invdb outline. Below that, the possible links to
which you can link are shown. Note that there is actually no detail available
about the cube to which you are about to link.
Click View/Launch to drill-across to the Simpperf cube.
Figure 144. Selecting the Partition Link to Be Performed
As you can see in Figure 145, we have an error indicating that the sheet
contains an unknown member of Opening Inventory. If we review the outlines
in Figure 135, in fact we have not identified the Measures differences, so DB2
OLAP Server does not recognize what to do about these members.
Click Yes to continue.
Figure 145. DB2 OLAP Server Error Showing Full Links Are Not in Place
As you can see in Figure 146, the spreadsheet for Simpperf is displayed,
showing Measures, Year, Cola, and East. Thus, the drill-across positions the
cube at the hierarchy and cell level required.
From here you can drill up and down within Simpperf as required. The original
drill-across starting point is merely an entry point to move from the Invdb
cube to the Simpperf cube.
Figure 146. Spreadsheet for the Linked-to Cube
We now want to fix our linked partition definition so we can drill-across from
any cell in Invdb to the Simpperf cube.
We have to map each member that exists in Invdb but does not exist in
Simpperf. We do not have to map those members that exist in Simpperf but
do not exist in Invdb, as we will not be linking to them in the drill-across.
Basically, any cell that we may be linking from must be mapped in the linked
partition definition.
If we review the outlines carefully, we find that as well as the Measures
members in Invdb needing to be mapped, we have some Year members that
have to be mapped. For example, in Invdb we have Qtr1, and in Simpperf we
have Qtr1 97. Thus these must be mapped.
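Finding the members that need a mapping amounts to a set difference between the two outlines. The sketch below assumes the member names of both cubes have already been extracted into plain lists; the lists shown are abbreviated and illustrative, and members_needing_mapping is a hypothetical helper.

```python
def members_needing_mapping(link_from_members, link_to_members):
    """Members of the cube we link from that have no same-named member
    in the cube we link to, in outline order."""
    other = set(link_to_members)
    return [m for m in link_from_members if m not in other]

# Abbreviated member lists for illustration.
invdb_members = ["Year", "Qtr1", "Measures", "Opening Inventory", "Additions"]
simpperf_members = ["Year", "Qtr1 97", "Measures", "Sales"]

print(members_needing_mapping(invdb_members, simpperf_members))
```

Members that exist only in Simpperf, such as Sales in this sketch, do not appear in the result, which matches the rule that only cells we link from must be mapped.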
As you can see in Figure 147, we map all of the Measures members in Invdb
to the Measures members in Simpperf. Therefore when we link from a cell for
Additions in the TBC inventory database, at the drill-across, DB2 OLAP
Server will map Additions to Measures and display Measures at the
drill-across to Simpperf.
Figure 147. Altering the Partition Definition Member Mapping
We now have to map the Year members. Rather than explicitly entering them,
we have created a text file, with the source to target member mapping.
Select Import... from the Partition Wizard and enter the text file name (see
Figure 148). Select whether source members or target members are first in
the text file, and click OK.
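A mapping file of this kind can be generated rather than typed. The sketch below assumes a layout of one member pair per line, with each name in double quotes and the two names separated by a blank; this layout is our assumption, so check it against the product documentation before importing. Only the Qtr1 and Qtr1 97 pair comes from the outlines discussed here; the remaining quarters are assumed to follow the same pattern. The function mapping_lines is hypothetical.

```python
def mapping_lines(pairs, source_first=True):
    """Format (source, target) member pairs as text, one pair per line.

    `source_first` mirrors the import option that states which side
    comes first in the file.
    """
    lines = []
    for source, target in pairs:
        first, second = (source, target) if source_first else (target, source)
        lines.append(f'"{first}" "{second}"')
    return "\n".join(lines) + "\n"

# Source cube Simpperf uses names such as "Qtr1 97";
# target cube Invdb uses "Qtr1".
year_pairs = [(f"Qtr{n} 97", f"Qtr{n}") for n in range(1, 5)]
print(mapping_lines(year_pairs), end="")
```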
Figure 148. Importing the Member Mapping from a Text File
As you can see in Figure 149, we now have all our mappings defined.
Figure 149. Result of Import of the Member Mapping File
Figure 150 shows that we can now drill-across from any cell in Invdb to the
Simpperf cube.
Figure 150. Link to Cube Working Correctly
Note that there can be only one linked partition definition between a specific
source and target cube. If we need to create a two-way link, we create a new
partition definition, with the source and target cubes now reversed.
7.3.2.2 Defining a Link from TBC Sales to TBC Inventory
So far, the two models are linked in one direction only. Users can navigate
from the inventory model across to the sales model, but they have no option
to return to the inventory model. Usually it makes sense to provide links in
both directions.
For the alternate link between Invdb and Simpperf, the new linked partition
definition would have Simpperf as the target cube and Invdb as the source
cube.
In creating this second partition definition, because the target cube is now
Simpperf, drilling across to Invdb, the area mapping and the member
mapping must be defined differently.
Figure 151 shows the area mapping, going from the Simpperf Product
hierarchy to the Invdb Product hierarchy for the Scenario dimension Actual
member. Thus, at the drill-across, the Actual member will always be shown.
Figure 151. Area Mapping for Link Partitions for Reverse Link
Figure 152 shows the member mapping. In this case, we have to map the
Simpperf members that do not exist in Invdb. Thus, the Simpperf Measures
members are mapped to Measures members in Invdb. We have to map the
Year member names in Simpperf to those in Invdb as before.
We also have to explicitly map the Actual member in Invdb to (void) in
Simpperf, as we are explicitly using it in our area mapping, and DB2 OLAP
Server needs to be able to map a member from Simpperf to it.
Figure 152. Member Mapping for Link Partitions for Reverse Link
Thus, with these two sets of linked partition definitions, we can drill-across
from Invdb to Simpperf, and from Simpperf to Invdb. Note that, each time we
drill-across, a new connection to DB2 OLAP Server and a new spreadsheet
are created. Also there is no requirement for outline synchronization, as there
is in replicated and transparent partitions.
In the example above, the two cubes are relatively independent. However,
another very common example is where we link two cubes that have almost
identical outlines, but one cube has more detail. For example, the source
cube may have daily data, and the target cube may have monthly aggregated
data. In this case, the amount of member mapping required would be
minimal.
In general, to minimize the amount of administration required in setting up
linked partition definitions, we recommend that you:
• Use the same member names in each cube as much as possible.
• Rather than allowing drill-across from every cell as we have shown here,
let users identify the key member combinations from which they are likely
to want to drill-across, and use these combinations as the member
mapping area.
• Provide drill-across from only higher levels in the hierarchy.
• Map members to the generation one member name in each dimension for
those members that do not have an equivalent in the source.
Using these recommendations will minimize the amount of administration
required, as there is no method of performing a dynamic build for partition
definitions.
Part 2. Managing an OLAP Data Mart
In part 2 we focus on topics that are relevant later in the lifecycle of the
Business Intelligence solution, during the production phase. We cover
ongoing maintenance and updates of the models, performance and tuning
considerations, and problem determination and security.
Chapter 8. Ongoing Maintenance and Updates of the Cube
Now that we have defined a multidimensional model, loaded it with an initial
set of data, and calculated the necessary aggregation hierarchies, we want to
investigate the processes involved when additional data has to be loaded into
the cube, for example, on a monthly basis, during the production phase in the
lifecycle of the Business Intelligence solution.
8.1 Cleaning Up before Loading the Model
If we want to perform a full refresh of the cube each time we populate the
cube, we may want to clear the data first before loading. We show in 4.4.1,
“Loading Data from a Table” on page 76, how the model can be cleared
manually. However, clearing the data in the model can also be set up in the
Load Rules.
After opening the Data Load Rules, select the Set the global data load
attributes icon and go to the Clear Data Combinations page of the Data
Load Settings window (see Figure 153). Within the Clear Data
Combinations page, the combination of members whose data values are to
be cleared before loading data can be specified. The data values then appear
as #MISSING.
Select Outline to display the dimensions, then select Product => 100-20 =>
Add. Note that you must highlight a member in the Members list box before
you can use the Add button. Note also that in a combination of members, you
must separate the members with a comma.
With the clearing of data, DB2 OLAP Server also considers dimensions as
normal members, so you can also clear dimension combinations.
Figure 153. Clearing Member Combination Data Values before Loading
Because the Measures dimension is just like any other dimension and
includes the numerical values, if all Measures bottom level members are
cleared, all of the data values will be cleared before reloading. As shown in
Figure 154, if you overwrite existing values, the explicit clearing of data is not
required. For example, if we have data for East, Kool Cola, Jan 97, and Sales
in the cube, and the input data being loaded is a later version, or a corrected
version, the value for East, Kool Cola, Jan 97, and Sales will be overwritten
by the new value.
If we already have in the cube data values for product 100-10 in Denver for
Jan 97, and we also have this same data combination in the new input data,
should the data values be overwritten, added to existing values, or subtracted
from existing values (Figure 154)?
Figure 154. Defining How to Add New Data to the Cube
Deciding on which option is appropriate is based on how granular the cube is,
how often the load is performed, and how the input data is defined.
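The effect of the three options can be sketched with a dictionary standing in for the cube. This is an illustration of the arithmetic only, not of how DB2 OLAP Server loads data; apply_load is a hypothetical function, and treating a missing cell as zero when adding or subtracting is our assumption.

```python
def apply_load(cube, rows, mode="overwrite"):
    """Apply (cell, value) input rows to a dict-based cube.

    `mode` is one of "overwrite", "add", or "subtract".
    """
    for cell, value in rows:
        current = cube.get(cell, 0.0)  # assumption: a missing cell counts as 0
        if mode == "overwrite":
            cube[cell] = value
        elif mode == "add":
            cube[cell] = current + value
        elif mode == "subtract":
            cube[cell] = current - value
        else:
            raise ValueError(f"unknown mode: {mode}")
    return cube

cell = ("100-10", "Denver", "Jan 97")
for mode in ("overwrite", "add", "subtract"):
    cube = {cell: 100.0}
    print(mode, apply_load(cube, [(cell, 20.0)], mode)[cell])
```

Overwriting suits input that carries complete, corrected values, whereas adding suits input that carries only the changes since the last load.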
What could we do, however, if we do not have all the data for region East for
example? In that case, we could load the data later when it comes in.
Alternatively, we could load or manually enter the data at the region level, if
required, with the states and cities underneath having #MISSING data. If this
is a frequent occurrence, then perhaps the cube could contain an Estimate
member. This would allow users to at least be able to do region-based
analysis. When the actual data arrives, the data is loaded at the member
level, and, after recalculation, the cube is up-to-date again.
If the cube contains Dynamic Calc and Store members, a load of data or a
spreadsheet update of the data or of its dependent children will not mark any
stored Dynamic Calc and Store data as requiring recalculation. DB2 OLAP
Server recognizes that a stored Dynamic Calc and Store member needs
recalculation only at the next regular batch calculation, at a database
restructure, or when the CLEARBLOCK DYNAMIC command is used; it then
marks the data block as requiring recalculation. At the next user-requested
retrieval, the requested data is dynamically calculated and stored again,
bringing the data back into synchronization.
8.2 Changing the Outline with Dynamic Dimension Build
During a data load, if the input data includes any new members, that input
data row is not loaded and is written to the data load error log. This can
be a valuable tool for determining how the dimensions have changed
in the actual data, without dynamically building the dimensions at
each load. Thus, outline changes can be more tightly controlled in a
production environment.
If dynamic dimension build is always performed before the regular data load,
the outline changes that have been made as a result of each dynamic
dimension build could be tracked with:
OUTLINECHANGELOG TRUE
in the ESSBASE.CFG file in the \Essbase\Bin directory. If ESSBASE.CFG
does not exist, then create it. This then enables the tracking of outline
changes across all databases in the DB2 OLAP installation and creates a file
called databasename.OLG in the application database directory (Figure 155).
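If the configuration file is maintained by a script, the setting can be added without duplicating it. The function below is a minimal sketch that works on the file contents as text; ensure_setting is a hypothetical helper, and reading and writing the actual ESSBASE.CFG file is left to the caller.

```python
def ensure_setting(cfg_text, key="OUTLINECHANGELOG", value="TRUE"):
    """Return the configuration text with `key value` present exactly once.

    Any existing line that starts with `key` is replaced.
    """
    kept = [line for line in cfg_text.splitlines()
            if line.split()[:1] != [key]]
    kept.append(f"{key} {value}")
    return "\n".join(kept) + "\n"

# An empty file gains the single setting line.
print(ensure_setting(""), end="")
```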
Figure 155. Log Showing Changes in the Outline
8.3 Considerations for Dynamic Dimension Build
Typically, a cube’s dimensions stay relatively static. The addition of a new
dimension is effectively a new design, and potentially, a new application. If
there is the requirement to add a new dimension, there are a number of
issues to consider if you would like to keep the existing data in the cube.
Take care when adding new dimensions or facts to an outline to ensure that
the Load Rules are correct. The input data to dimension and member name
mapping can become invalid, and even when it looks correct on the screen it
may not be. Be careful to check the data file attributes for the field edits that
have occurred and which DB2 OLAP Server will use in moving around the
input data fields. Our advice is to keep Load Rules as simple as possible,
and to do most of the transformation and cleansing work before the data is
passed on to the multidimensional system (for example, using Visual
Warehouse Business Views to build launch tables). This increases the
maintainability of the solution.
If you add a new dimension with members in it, DB2 OLAP Server needs to
do a restructure. As part of the restructure, if the data in the cube is not being
discarded, DB2 OLAP Server prompts for the new member name with which
the existing data in the cube should be associated. If, after this, a calculation
is performed, and the cube is viewed through, for example, the spreadsheet
Add-in, it may initially appear as if all the data has disappeared, with
#MISSING occurring. However, drilling down into the new dimension and the
new member with which the cube’s data has been associated reveals that the
data is there. It is just that all the data is now associated with only one
member in the newly added dimension.
If an existing member in an existing dimension is changed to have lower-level
members in it, the data stays with the existing member name and the lower-
level members have #MISSING data in them. During the next load of the
data, the lower-level members are updated, and at that point the existing
member is also changed to reflect the lower-level member’s total data.
The impacts of a change in the outline are well documented in the Essbase
Database Administrator’s Guide, SC26-9238. The changes to the DB2 fact
table and the key table are well documented in Using DB2 OLAP Server,
SC26-9235. Review these documents before making any major changes to a
cube, as the changes can have major implications for a production cube.
In general, changes performed to sparse dimension definitions have less
impact than changes to dense dimension definitions. For example, adding a
member to a sparse dimension requires a restructure of the index file,
whereas adding a member to a dense dimension requires a restructure of
both the index and the data blocks.
The addition of Dynamic Calc and Store members in a dense dimension
requires block size changes and thus restructuring. Storing the Dynamic Calc
and Store data when it is calculated has no impact, as DB2 OLAP Server
allocates space in the data block for a Dynamic Calc and Store member, as if
it is a normally stored member.
For Dynamic Calc and Store members of sparse dimensions, a new index
entry is created, and the data blocks associated with it are created at the time
the data is requested.
One method of controlling the outlines in cubes is to implement a centrally
administered dimension cube that has all of the corporate dimension
hierarchies and which is used as the basis for all outlines in the corporation.
This method can be considered the equivalent of a data administration
process and will thus minimize ongoing administration of different names in
different outlines (see also “Common Data” on page 30).
8.4 Backup of the Cube
Ideally, before any load and calculation process, a DB2 OLAP Server archive
and a DB2 database backup should be performed, to ensure the ability to
restore the multidimensional model. In both the loading and calculation of the
cube, logging is performed by the underlying DB2 database, with commits
being taken based on the Database => Settings => Transaction => Commit
Block/Commit Row settings. If, during a load or calculate, the operation fails
in any way, the database changes are rolled back to the last commit point.
Commit Blocks and Commit Rows specify either how many blocks or how
many rows can be modified before a commit is performed, depending on
which of the two thresholds is reached first. The settings are valid only if
Uncommitted Access is checked.
If Committed Access is checked, each operation is considered a single
transaction. Thus, a full data load has only one commit, at the end of
the load, and a calculation has only one commit, at the end of the calculation.
With Uncommitted Access, the Commit Rows and Commit Blocks settings
determine the number and size of the DB2 logs that are required and, in the
case of a failure, how long DB2 takes to roll back to the last commit point.
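The interaction of the two thresholds can be explored with a small simulation. This sketch assumes that a commit is taken as soon as either running total reaches its threshold and that both counters then restart; it models only the counting, not DB2 logging itself, and commit_points is a hypothetical function.

```python
def commit_points(changes, commit_blocks, commit_rows):
    """Return the 1-based positions in `changes` at which a commit occurs.

    `changes` is a sequence of (blocks_modified, rows_modified) pairs,
    one pair per unit of work.
    """
    points, blocks, rows = [], 0, 0
    for position, (b, r) in enumerate(changes, start=1):
        blocks += b
        rows += r
        if blocks >= commit_blocks or rows >= commit_rows:
            points.append(position)
            blocks = rows = 0  # assumption: both counters restart
    return points

work = [(1, 400)] * 10  # ten units of work: 1 block and 400 rows each
print(commit_points(work, commit_blocks=3, commit_rows=1000))
print(commit_points(work, commit_blocks=3, commit_rows=500))
```

Lower thresholds give more frequent commits, so less work has to be rolled back after a failure, at the cost of more commit processing during the load.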
For the DB2 OLAP Server archive, the BEGINARCHIVE command commits
any modified data blocks, switches the database to read-only mode, and
creates a list of files that need to be backed up. This command is issued
through ESSCMD, for example:
BEGINARCHIVE TBC Expanded Expanded.lst
Copy the files specified, using standard backup utilities. The files to be
backed up ideally should be the \Essbase\app\database directory, which
includes the outline, Load Rules, calculation scripts, and reports. This then
provides a backup of the cube control information that is not stored in DB2
tables.
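Where no standard backup utility is at hand, the copy step can be scripted. The sketch below assumes that the list file written by BEGINARCHIVE holds one file path per line; we have not verified that layout here, so inspect the file first. The function copy_archive_files and the backup directory are hypothetical.

```python
import shutil
from pathlib import Path

def copy_archive_files(list_file, backup_dir):
    """Copy every file named in `list_file` into `backup_dir`.

    Assumes one file path per line; blank lines are skipped. A missing
    file raises an error so that an incomplete backup is not overlooked.
    """
    backup_dir = Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for line in Path(list_file).read_text().splitlines():
        entry = line.strip()
        if not entry:
            continue
        source = Path(entry)
        copied.append(Path(shutil.copy2(source, backup_dir / source.name)))
    return copied
```

For the example above, the call would take the form copy_archive_files("Expanded.lst", backup_dir), where backup_dir is a location of your choice. Files with the same name in different directories would overwrite each other in this flat layout.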
At the completion of this process for all cubes, stop DB2 OLAP Server and
perform an offline backup of the DB2 database or of the complete set of
relevant tablespaces. An offline backup provides an easy point-in-time
recovery model: you can restore to the previous backup without needing the
archive logs.
If the requirement is only to copy specific cubes, ensure that there are no
connections to that application’s database, and perform an offline backup.
Following this, when DB2 OLAP Server is started again, issue the following
command to bring the cubes back online in read/write mode for users:
ENDARCHIVE
Read-only mode from the BEGINARCHIVE command persists until an
ENDARCHIVE is performed, which puts the database into read/write mode
and reopens database files in exclusive mode.
In addition to using archive, you can export the cube, or a portion of it
(Level 0 blocks only, or Input blocks only), to an ASCII file, which can serve
as an alternative backup method. This ASCII file can also be used to load the
data into another cube.
8.5 Calculation
Ensure that a calculation is always performed after the load has completed.
For a regular incremental load, a specific calculation script may be more
appropriate than the default calculation script. From a design perspective, as
discussed in 9.8.5, “Time As a Sparse Dimension” on page 241, if the Time
dimension is defined as sparse, only a portion of the total cube is required to
be calculated. If the Time dimension is defined as dense, all the data blocks
need to be calculated.
Check with DB2’s REORGCHK utility to determine whether a DB2 REORG is
required. Ideally, after the calculation, REORG and RUNSTATS of the key
and fact table should be run. Rather than explicitly running RUNSTATS each
time, you can set up a script that updates the statistics information in DB2
for specific tables, provided that the statistics were captured from a cube
that is in steady-state production status.
8.6 Partitions
If partitioned models are used for cubes, a key factor in ensuring consistency
across the cubes is to perform the calculations in the required order.
For example, for transparent partitions, where we have TBC_East as the
source cube and TBC_West as the target cube, we should always perform a
calculation on the TBC_East source cube first, followed immediately by the
calculation on the TBC_West target cube. This approach ensures that parent
members in the target cube accurately reflect the aggregation of the data
values from the source cube’s children.
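With more than two cubes, the required order can be derived from the partition definitions. The sketch below treats each transparent partition as a (source, target) pair and sorts the cubes so that every source precedes its targets. It uses graphlib from the Python standard library (Python 3.9 or later); calculation_order is a hypothetical helper, and the TBC_Central pair in the usage line is the chained example from 7.2.

```python
from graphlib import TopologicalSorter

def calculation_order(partitions):
    """Order cubes so that every source is calculated before its targets.

    `partitions` is an iterable of (source_cube, target_cube) pairs.
    """
    sorter = TopologicalSorter()
    for source, target in partitions:
        sorter.add(target, source)  # the target depends on the source
    return list(sorter.static_order())

print(calculation_order([("TBC_East", "TBC_West"),
                         ("TBC_West", "TBC_Central")]))
```

A circular set of partition definitions makes static_order raise graphlib.CycleError, which is a useful early warning that no consistent calculation order exists.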
Synchronization of outlines also becomes a key factor if outlines change
often. Ideally, we would have a centrally administered dimension cube, which
includes all the corporate dimension hierarchies and is used as a common
repository for all outlines in the corporation. This is especially required if
partitioned cubes are implemented.
Chapter 9. Performance
The very first lesson to learn in tuning a DB2 OLAP Server cube is that
performance is highly dependent on the input data and the hierarchy
structure. The more we know about the characteristics of the data being
loaded, the better our cube design and performance will be. Therefore, what
may work in tuning one cube will provide minimal enhancements to another.
Thus questions that would be used in optimizing a relational database design
are also used to optimize a cube design.
Note
Performance of the cube is highly dependent on the model, on the dense
and sparse dimension settings, and the characteristics of the data loaded.
There are many different tuning areas for a DB2 OLAP Server
implementation, depending on where the performance bottleneck is. The
areas we can tune include the operating system configuration, DB2
OLAP Server configuration, DB2 configuration, hardware configuration, cube
design, and DB2 physical table placement.
For a DB2 OLAP Server implementation, there are always trade-offs in
performance tuning. For example, we may find that placing a particular
dimension as the relational anchor is very good for performance if this is a
total refresh of the cube each month.
However, if we regularly add data to the cube and the members of the
relational anchor dimension change, adding data each month would be very
slow, because a complete restructure would be required each month.
Thus the requirement of how the cube will be managed over time plays a very
important role in the cube design.
It is important to tune the design of both the cube and the database. Two
simple examples show how tuning can significantly affect the performance of
the load and calculation processes of the cube.
Starting with the initial TBC sales model introduced in 4.1, “Introduction to the
TBC Sales Model” on page 41, we performed a load and calculate of the
database. We defined all the dimensions as dense. During the calculation,
DB2 OLAP Server used a large amount of virtual memory. We then changed
the dense/sparse settings and added an order-by clause to the input SQL
statement. The load of the data took approximately the same time.
However, the time to calculate the cube was reduced by 60%, a very
significant difference in time. The other major improvement was the amount
of virtual memory required to calculate, which was reduced by 90%.
Another test we performed was the load of the expanded TBC sales model
with 4,000 records with no order-by clause in the SQL statement. After
altering the SQL statement to include order by Customer, Year, Scenario,
Sales, Quantity, and Cost, we cleared the data and reloaded. The elapsed
time for the load was reduced by 80%.
So, as you can see, even with very simple tuning steps in the cube design
and in the SQL statement or launch tables, a significant improvement can be
made.
Before we proceed, let us note that we do not exhaustively cover
performance, which would take a redbook in its own right. Rather, we point
out some areas that may be of benefit. Review the DB2 OLAP Server
manuals if you plan to do extensive tuning.
9.1 Tuning Process
For DB2 OLAP Server implementations, the best method of tuning is to use a
subset of the real data, which is representative of the total, and go through
the process of loading and calculating the data, monitoring throughout the
process. Then review the figures, change the outline, change settings, and
load and calculate again.
Although this may seem cumbersome, it is necessary because, as mentioned
earlier, cube performance is highly dependent on the model, on the dense and
sparse dimension settings, and on the characteristics of the data loaded.
9.2 Block Sizes
So how can you do some planning up front? Based on estimates and
expected data, you should be able to estimate some of the block size and
quantity information.
DB2 OLAP Server uses two types of internal structures to access data: data
blocks and indexes. DB2 OLAP Server creates a data block for each unique
combination of sparse dimension members, provided that at least one data
value exists for the sparse dimension member combination. The data block
represents all the dense dimension members for that specific combination of
sparse dimension members.
DB2 OLAP Server creates an index entry for each data block. The index
represents the combinations of sparse dimension members. It contains an
entry for each unique combination of sparse dimension members for which at
least one data value exists.
Label members, shared members, and Dynamic Calc members that are not
stored do not use storage space and thus are not considered in the following
calculations:
Potential Data Block Size = (product of the number of members in each dense
dimension) * (8 bytes per cell)
Potential Number of Data Blocks = (product of the number of members in each
sparse dimension)
Index Size = (Number of Data Blocks) * 56 bytes
We recommend for performance that data blocks be between 8 KB and 64
KB; block sizes closer to 64 KB are preferable.
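The three formulas above can be sketched in a few lines of Python. This is a
minimal illustration, not part of DB2 OLAP Server: the function names are
hypothetical, 1 KB is assumed to mean 1,024 bytes, and the member counts are
the stored member counts reported for the modified TBC sales model in 9.3.

```python
from math import prod

BYTES_PER_CELL = 8            # from the text: 8 bytes per cell
INDEX_ENTRY_BYTES = 56        # from the text: 56 bytes per data block in the index
MIN_BLOCK_BYTES = 8 * 1024    # recommended lower bound (assumes 1 KB = 1,024 bytes)
MAX_BLOCK_BYTES = 64 * 1024   # recommended upper bound


def potential_block_size_bytes(dense_member_counts):
    """Product of the member counts of the dense dimensions, times 8 bytes."""
    return prod(dense_member_counts) * BYTES_PER_CELL


def potential_block_count(sparse_member_counts):
    """Product of the member counts of the sparse dimensions (1 if there are none)."""
    return prod(sparse_member_counts)


def index_size_bytes(block_count):
    """One 56-byte index entry per data block."""
    return block_count * INDEX_ENTRY_BYTES


def in_recommended_range(block_bytes):
    """True if the block size falls within the 8 KB to 64 KB recommendation."""
    return MIN_BLOCK_BYTES <= block_bytes <= MAX_BLOCK_BYTES


# Modified TBC sales model: Measures (3 stored), Market (138), Year (17) dense;
# Product (19 stored) sparse.
block_bytes = potential_block_size_bytes([3, 138, 17])
blocks = potential_block_count([19])
print(block_bytes, blocks, index_size_bytes(blocks), in_recommended_range(block_bytes))
```

Running the same functions with estimated member counts for a planned cube
gives a first indication of whether a dense/sparse choice is worth testing.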
9.3 Review of Block Sizes for the TBC Sales Model
We now review the block sizes for both the initial TBC sales model and a
model modified for performance. Understanding block sizes can assist in
capacity planning and performance tuning.
To take a simple example, we tune the original TBC sales model as shown in
Figure 156.
Figure 156. Original TBC Sales Outline
All dimensions have been, by default, defined as dense dimensions, and DB2
OLAP Server has chosen Product as the relational anchor dimension. The
number of members in each dimension are: Measures (6), Product (22),
Market (142), and Time (17). This can be determined from viewing the
application event log after a restructure, in the Declared Dimension Sizes
message as shown in Figure 159.
Alternatively, this information can be retrieved from the Statistics panel,
which can be reached by selecting Database => Information => Statistics
from the Application Manager window (Figure 157 and Figure 158).
Note that there are two distinct values, members in dimension and members
stored, with Product having 22 members including 19 stored members. The
other values are Measures with 6 members including 3 stored, Market with
142 members including 138 stored, and Year with 17 members including 17
stored.
Figure 157. Reviewing the Database Information
Figure 158. Reviewing the Database Information (continued)
Given the above, you can then review the DB2 OLAP Server event viewer to
provide block sizes. The block size is also provided in Figure 158, as
1,069,776 bytes.
Figure 159. Event Log Showing Block Sizes for Original Outline
As you can see in Figure 159, a number of different values are provided for
block sizes:
• The logical block size is 178,296. The logical block size is based on the
actual number of members present in the data, so it is
4 * 19 * 138 * 17 = 178,296. Note that the actual number of members is the
maximum number of members minus the number of shared members and
Dynamic Calc members.
• The maximum declared block size is 318,648. Because all dimensions
were defined as dense, the maximum declared block size is 6 * 22 * 142 *
17 = 318,648.
• The maximum number of actual possible blocks is one, with a block size of
133,722. In our example, Measures is defined as Label Only, so no data is
stored for it. Thus, the calculation becomes 3 * 19 * 138 * 17 = 133,722.
• The maximum declared blocks and actual blocks is one, so the whole
cube will be one block in size.
Thus, to load and calculate the simple TBC sales cube, DB2 OLAP Server
must bring the whole block, approximately 1 MB, into memory.
This whole topic on block sizes has implications for the amount of virtual
memory that is required to calculate the cube. With this initial setup, when we
calculated the cube, it took a large amount of virtual memory at the server.
Now let us change some settings within the definition of the TBC sales model
and look at the block size values again. Note that we changed only the DB2
OLAP Server design and parameters, without tuning DB2 itself!
We made the following changes to the model:
• Market defined as the relational anchor dimension
• Product defined as a sparse dimension
• Ordered the input data by the sparse dimension, Product
Figure 160 shows the new outline.
Figure 160. Final TBC Sales Outline
The event log shows the values after the changes (see Figure 161):
Figure 161. Event Log Showing Block Sizes for Final Outline
As you can see in Figure 161, the block sizes now have different values:
• The logical block size is 9,384. The logical block size, which is based on
the actual number of members present in the data, is 4 * 138 * 17 = 9,384.
• The maximum declared block size is 14,484; that is, the dense dimensions
multiplied (6 * 142 * 17 = 14,484).
• The maximum number of actual possible blocks is 19, with a block size of
7,038. This is again due to the Label Only setting on the Measures member.
Thus, the calculation becomes 3 * 138 * 17 = 7,038.
• Maximum declared blocks is 22.
The load of the data took approximately the same time as in the first example.
However, the time to calculate the cube was reduced by 60%, a significant
difference in time. The other major improvement was the amount of virtual
memory required to calculate, which was reduced by 90%.
So, as you can see, even with very simple tuning steps, a significant
improvement can be made. Although a number of changes were made at the
same time, those that had the most impact were the dense and sparse
settings and the ordering of the input data.
In summary, then, in the initial TBC sales model, all dimensions were defined
as dense, so there was only one data block of actual size 133,722, and no
index. In the modified model, with Product defined as a sparse dimension,
there were 19 data blocks of actual size 7,038. The index size in this case
was 19 * 56 = 1,064.
Thus, to load and calculate the modified cube, DB2 OLAP Server has to
bring, at a minimum, one block of 56 KB into memory, as compared to the
initial cube with, at a minimum, one block of approximately 1 MB, a factor of
19 times more memory.
Although we have shown a simple cube here, you can easily see that the
same principles apply, no matter what the size of the cube. So, the correct
setting of dense and sparse dimensions is of utmost importance in tuning the
cube.
Within the event viewer, the data block size is provided in number of cells,
whereas in the Database Information Statistics, the data block size is
provided in bytes. Thus, multiply event viewer information by 8 to get the
bytes in Statistics.
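As a worked check of the figures in this section, the following sketch
recomputes the declared and actual block sizes (in cells, as in the event
viewer) and the actual block size in bytes (as in the Statistics panel) for
both outlines. The member counts come from the text; the helper name
block_cells is hypothetical.

```python
from math import prod

BYTES_PER_CELL = 8  # event viewer reports cells, Statistics reports bytes

# Member counts quoted in the text for the TBC sales model.
declared = {"Measures": 6, "Product": 22, "Market": 142, "Year": 17}
stored = {"Measures": 3, "Product": 19, "Market": 138, "Year": 17}

original_dense = ["Measures", "Product", "Market", "Year"]  # all dimensions dense
final_dense = ["Measures", "Market", "Year"]                 # Product made sparse


def block_cells(member_counts, dense_dimensions):
    """Number of cells in one data block: product over the dense dimensions."""
    return prod(member_counts[name] for name in dense_dimensions)


for label, dense in (("original", original_dense), ("final", final_dense)):
    declared_cells = block_cells(declared, dense)
    actual_cells = block_cells(stored, dense)
    print(label, declared_cells, actual_cells, actual_cells * BYTES_PER_CELL)

# Ratio of the actual block sizes of the two outlines.
ratio = block_cells(stored, original_dense) // block_cells(stored, final_dense)
print(ratio)
```

The ratio equals the number of stored Product members, because moving Product
from dense to sparse removes exactly that factor from the block size.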
9.4 Review of the Number of Stored Dimension Members
An interesting point emerges when, for the expanded TBC sales cube
(Figure 162), we compare the number of stored dimension members that DB2
OLAP Server reports with the number of distinct member combinations in the
input data.
Figure 162. Expanded TBC Sales Outline
If we want to review the number of stored dimension members, we can use
the DB2 OLAP Server command line interface, by entering:
EssCmdW.
The process is to log in to the server, enter the user ID and password, select
the application, and select the database (Figure 163). From there, we can
perform a variety of commands, which are available by entering a question
mark (?) at the command line.
Figure 163. Reviewing the Stored Members Using the Command Line
Some of the main commands are GETDBSTATE, GETDBINFO, and
GETDBSTATS (Figure 164).
Figure 164. Reviewing the Stored Members Using the Command Line (continued)
Figure 165 shows the GETDBSTATS information.
Figure 165. Showing the Database Statistics for the Expanded Cube
If we review the distinct combinations of Market in the input data, we find
that we have loaded 514 rows of member combinations for the Market
dimension.
However, DB2 OLAP Server (Figure 165) says there are 656 members in the
dimension and 598 members stored, a difference of 58. At first glance, this
seems very odd. However, when we review the data, 55 cities have only 1
zipcode, and 3 states have only 1 city associated with them, which is a total
of 58.
If there is only one member in a hierarchy, it is considered an implicit shared
member, and thus DB2 OLAP Server does not need to store the data again.
This provides for less storage, as can be explicitly seen in the Market
dimension in this example.
Additionally, with Market having 598 stored members in total, this is 84 stored
members more than in our input file. When we review the hierarchies in our
input file, we can see that there are 4 regions, 34 states (37 states - 3
shared), 46 cities (100 cities - 54 shared cities), for a total of 84 stored
members more than our input data.
What this implies is that, in estimating the size of the cube for capacity
planning purposes, you should ensure that the hierarchy members are also
included. As you can see from this example, about 15% extra members are
stored in just the one dimension, with a potential of 27% extra if there
were no implicit shared members.
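The effect of implicit shared members on the stored member count can be
sketched as follows. This is only an illustration of the counting rule
described above (a parent with exactly one child is not stored again); the
function name and the miniature Market hierarchy are hypothetical and are not
taken from the TBC data.

```python
def count_members(children_by_parent):
    """Return (members in dimension, members stored) for one hierarchy.

    A parent with exactly one child is treated as an implicit shared
    member, so it is counted in the dimension but not as stored.
    """
    members = set(children_by_parent)
    for children in children_by_parent.values():
        members.update(children)
    implicit_shared = sum(
        1 for children in children_by_parent.values() if len(children) == 1
    )
    return len(members), len(members) - implicit_shared


# Hypothetical miniature hierarchy: dimension -> state -> city -> zipcode.
market = {
    "Market": ["Colorado", "Arizona"],
    "Colorado": ["Denver"],             # one city: Colorado is implicitly shared
    "Arizona": ["Phoenix", "Tucson"],
    "Denver": ["80202", "80203"],
    "Phoenix": ["85001"],               # one zipcode: Phoenix is implicitly shared
    "Tucson": ["85701", "85702"],
}

total, stored = count_members(market)
print(total, stored)

# The same arithmetic with the figures from the text:
# 656 members, of which 55 cities and 3 states are implicit shared members.
print(656 - (55 + 3))
```

For capacity planning, the "members in dimension" figure is the safer upper
bound, because a single-child parent becomes a stored member as soon as a
second child is added.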
9.5 DB2 OLAP Server Parameters
The design of the outline as seen by the two examples above can affect
performance greatly in terms of both load times and the amount of memory
required. However, as in any implementation, care should be taken not to
overcommit memory, as it will result in paging and detrimental performance.
Note
The information provided in this section is specific to IBM DB2 OLAP
Server. It is not applicable to Hyperion Essbase.
9.5.1 Memory
Each database or cube has its own specific cache and memory areas, such
as the index cache, the data cache, and the calculator cache. The DB2
memory requirements are dependent on how many cubes are stored and
used in any specific DB2 database.
A general rule of thumb that can be used as a starting point for estimating
memory requirements is as follows: allocate the index cache so that the
whole index is kept in memory, allocate two to three times the index cache
size to the data cache, and allocate three times the index cache size to the
DB2 bufferpool.
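The rule of thumb can be written down as a small helper. This is a sketch of
the starting-point arithmetic only; the function name and dictionary keys are
hypothetical, and the example index size (5,715 index entries of 56 bytes) is
the one derived for the expanded TBC sales model in 9.5.3.1.

```python
def starting_cache_sizes(index_size_bytes, data_cache_factor=3):
    """Starting-point sizes derived from the index size (rule of thumb only)."""
    if not 2 <= data_cache_factor <= 3:
        raise ValueError("the rule of thumb uses a data cache factor of 2 to 3")
    return {
        "index_cache_bytes": index_size_bytes,                      # whole index in memory
        "data_cache_bytes": data_cache_factor * index_size_bytes,   # 2x to 3x the index cache
        "db2_bufferpool_bytes": 3 * index_size_bytes,               # 3x the index cache
    }


sizes = starting_cache_sizes(5715 * 56)
print(sizes)
```

These values are only a first estimate; the workload profiling described
below should drive the final settings.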
Overall, the load data workload, the calculation workload, and the user query
workload should be profiled for each active database in determining memory
and virtual storage requirements.
9.5.2 DB2-Specific Parameters
We first review DB2-specific tuning parameters. Because the data is stored in
DB2 tables, DB2 itself has to be tuned, using standard DB2 tuning
techniques: for example, ensure that the tablespaces are defined as
database managed tablespaces (DMS), ideally use raw I/O, configure disks
so that logs are on a separate device, and ensure that DB2NTNOCACHE is
specified.
For DB2 tuning that is specific to DB2 OLAP Server, the two main DB2 tables
used during the main processes of loading data and calculating data are the
fact table and the key table.
As part of the RSM.CFG file, we can specify that, for each database in an
application, the fact table and index are to occupy their own tablespace and
indexspace. The other specification is that, for the application, the key table
must be placed, along with the other DB2 OLAP Server tables, in a specific
tablespace and indexspace.
From a performance and backup perspective, for the fact table, ensure that it
is in a separate DMS tablespace and that the index has its own tablespace.
Ideally, allocate the fact tablespace to its own bufferpool, so tuning can be
specific to the cube with the use of BUFFPAGE.
For the fact table, an index is created based on the sparse dimensions. If we
ensure that the outline is ordered so that dense dimensions are defined first,
followed by sparse dimensions in increasing order of member numbers, the
index will be created based on the sparse dimensions from the bottom up as
it was defined at the initial saving of the outline.
Thus, with our outline of Product/Market/Scenario/Measures/Year/Customer,
the index is created for Customer/Year/Measures/Scenario. Thus the index is
optimal in terms of largest cardinality in the first column of the index, down to
the least cardinality in the last column of the index.
If we also ensure that the data is loaded in ordered sequence by
Customer/Year/Scenario, the index is also optimally loaded.
Other DB2 parameters that should be reviewed, though not necessarily from
a performance perspective, are the LOGBUFSZ, LOGFILSIZ, and
LOGSECOND for setting the log file sizes and number of files appropriately
for the cube sizes and commit block frequencies; LOCKTIMEOUT, to ensure
that users are actually timed out rather than waiting for another user to
commit; and APPLHEAPSZ, for the statement sizes.
9.5.3 Cache Size Tuning
DB2 OLAP Server has internal caches for the data and for the index. In
addition, for calculation, it has a calculator cache. Whenever a data block is
requested, the index is read to obtain the required block number, then the
data block is read. If the data block is not in the data cache, disk I/O occurs to
bring it in. Similarly, if the index key is not in memory, the index page needs
to be brought in with disk I/O, both of which will mean longer elapsed times.
Thus, the tuning of the index and data caches is important for tuning of I/O
times.
9.5.3.1 Index Cache Size Tuning
We first review the index cache size, which is the most important cache to
tune for optimal performance. The ideal situation is to have the entire
index in memory. The default index cache size is 1,024 KB.
In the case of the expanded TBC sales cube shown in Figure 168, maximum
index size = ( ( multiply the number of members in each sparse dimension) *
56 ) = 4 * 6 * 382 * 1,004 * 56 = 9,204,672 * 56 = 515 MB. This number is very
large. However, the actual index size is based on the number of actual data
blocks * 56.
Thus, for our data for the expanded TBC sales model, we have 1,578 distinct
combinations being loaded for the sparse dimensions and 4,137 upper-level
blocks, so 5,715 * 56 = 320 KB. Thus the default index cache size of 1,024
KB holds all our index in memory. This can be reviewed by looking at
Database => Information => Runtime => Key Cache Hit Rate, which in our
case is 1.
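The index size estimate above can be reproduced with a short calculation.
This is a sketch using the member and block counts quoted in the text; the
variable names are hypothetical, and 1,024 KB is assumed to mean 1,024 * 1,024
bytes.

```python
from math import prod

INDEX_ENTRY_BYTES = 56                    # bytes per index entry, from the text
DEFAULT_INDEX_CACHE_BYTES = 1024 * 1024   # default of 1,024 KB (assumes 1 KB = 1,024 bytes)

sparse_member_counts = [4, 6, 382, 1004]  # sparse dimensions of the expanded model

# Worst case: every sparse member combination has a data block.
max_blocks = prod(sparse_member_counts)
max_index_bytes = max_blocks * INDEX_ENTRY_BYTES

# Actual case: level 0 combinations loaded plus upper-level blocks.
actual_blocks = 1578 + 4137
actual_index_bytes = actual_blocks * INDEX_ENTRY_BYTES

print(max_blocks, max_index_bytes)
print(actual_blocks, actual_index_bytes)
print(actual_index_bytes <= DEFAULT_INDEX_CACHE_BYTES)
```

The gap between the two figures shows why the estimate should be based on the
expected number of existing blocks rather than on the potential number.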
If we want to perform estimates on setting the index cache size before
actually creating the cube, we could calculate the actual index size based on
the number of combinations we have in our cube or input data, plus the
number of upper-level sparse member combinations.
9.5.3.2 Data Cache Size Tuning
The data cache size has a default of 3,072 KB and is the area used to read
and process data blocks. Increasing this data cache size will allow more data
blocks to be kept in memory. In general, start with three times the index
cache size and review how many data blocks can be kept in memory, using
Database Information Statistics.
The calculation process is discussed in detail in 9.2, “Block Sizes” on page
220 and 9.3, “Review of Block Sizes for the TBC Sales Model” on page 221.
From a performance perspective, for calculation of a Level 0 block, one data
block is brought into memory for a Level 0 sparse member combination, and
the dense dimensions are calculated within the one data block.
For upper-level blocks, where the sparse member combination includes one
parent member, each of the associated level 0 blocks has to be read into
memory and aggregated into the upper-level block. For example, to calculate
the upper-level block for Qtr1 97 and Customer 0000001380 would require
reading three data blocks, for Jan 97/Customer 0000001380, Feb
97/Customer 0000001380, and Mar 97/Customer 0000001380.
For upper-level blocks where the sparse member combination includes more
than one parent member, multiple aggregation paths are possible. For
example, to calculate Qtr1 97/Institution/Margin, we could either read in
Jan,Feb,Mar/Institution/Margin, Qtr1 97/all Institution Customers/Margin, or
Qtr1 97/Institution/Sales,COGS.
If we review these instances, in the first case, we would need to read 3 data
blocks; in the second, over 300 data blocks; and in the third, 2 data blocks.
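The comparison of the three aggregation paths can be expressed as a tiny
sketch. It only compares the number of blocks each path would read; it does
not model how DB2 OLAP Server actually selects a path. The function name is
hypothetical, and the value 301 is a placeholder for "over 300", which is all
the text states.

```python
def cheapest_path(blocks_to_read):
    """Return the aggregation path that needs the fewest data block reads."""
    return min(blocks_to_read, key=blocks_to_read.get)


# Blocks that must be read to build Qtr1 97/Institution/Margin along each dimension.
paths = {
    "Year (Jan, Feb, Mar 97)": 3,
    "Customer (all Institution customers)": 301,  # placeholder for "over 300"
    "Measures (Sales, COGS)": 2,
}

print(cheapest_path(paths))
```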
DB2 OLAP Server, when using the default calculation or the CALC ALL
option, chooses the last calculation dimension to be aggregated. With the
CALC ALL, this is the last dimension calculated and so the data blocks are
likely to be in memory. In general, this seems to perform well. However, there
may be instances when it is better to set a customized calculation order by
using a calculation script.
As you can see in this process, the data blocks are read into the data cache
area, and the upper-level block is also read into and aggregated in the data
cache area.
Note that all the aggregations are done such that we have an aggregated
upper-level block, and then as a second step, any calculations required at
that upper level are performed. For example, it makes no sense to aggregate
percentage values. The percentage can only be performed when the data
from the lower-level blocks has been aggregated, so that the percentage is
correct. (See also 6.6, “Two-Pass Calculations” on page 149.)
If the data blocks requested are already in memory in the data cache, disk I/O
is not required, thus reducing the calculation time. Thus, the data cache
should be sized and tuned based on the model, the number of data blocks
required to be read in, and the amount of memory available.
Always remember though that we want to maximize the amount of active
index in memory before tuning for data cache size and that we do not want to
overcommit memory.
9.5.3.3 Calculator Cache Size Tuning
So far, we have not mentioned the calculator cache. The calculator cache is
actually a bit map for all possible data blocks, with each bit indicating
whether the corresponding data block actually exists; in other words,
whether the sparse member combination has dense data associated with it. It
can be viewed as an index to the index cache.
For example, for our expanded TBC sales model, we have a maximum
number of blocks of over nine million, as calculated earlier (see Figure 165 on
page 229). The calculator cache has a bit set on or off specifying the
existence of each of these blocks. As seen earlier, only 5,715 data blocks
actually exist, so the default setting of 200 KB is a lot more than required
for our specific example. However, a review should be performed.
Note that the calculator cache can be set to three discrete levels in the
ESSBASE.CFG file, with CALCCACHEHIGH, CALCCACHEDEFAULT, and
CALCCACHELOW. In the calculation script, we can then specify which
setting to use:
SET CACHE HIGH;
Note that CALCCACHEDEFAULT is the default value, as expected.
9.6 The Relational Anchor Dimension and Performance
Part of the process of building a cube in DB2 OLAP Server is the
identification of a relational anchor dimension, which is expanded in the fact
table. The anchor dimension must be defined as a dense dimension.
Because the anchor dimension is the base for the fact table, the total number
of members in a relational anchor dimension must be less than the maximum
number of columns able to be defined per table in the database, which in DB2
UDB V5 is 500 columns.
In our case, we chose Product as the relational anchor dimension. Product
has 61 members. As discussed earlier, we need to choose a dimension that
will have minimal changes to the members, while at the same time optimizing
the addition of data to the cube. If we defined Year as a dense dimension and
allocated it as the anchor dimension, then at the beginning of the year only
very few columns would contain data.
Because a data block contains combinations of the dense dimension
members, a block can consist of one or more rows. In our case, which is quite
specific, a customer is only ever in one market, and thus we have one row per
data block.
To explain this, Product and Market were defined as dense, and Measures,
Scenario, Year, and Customer were defined as sparse. A data block exists for
every sparse member combination, for example COGS/Actual/Jan 97/Customer
0000001380. Because, based on our data, a customer is in only one market,
that block consists of a single row (see Figure 166).
Sparse dimensions: Measures, Scenario, Year, Customer.
Columns 100-10 to 400-30: Product (rel. anchor).
Data Block    Measures  Scenario  Year    Customer  Market  100-10  100-20  ...  400-30
Data Block 1  COGS      Actual    Jan 97  00001380  Denver  2340    1760    ...  1890
Data Block 2  COGS      Actual    Feb 97  00001380  Denver  3100    1850    ...  1720
Figure 166. Fact Table with Product As Relational Anchor Dimension
If we had chosen Market as the anchor dimension (if it had fewer members),
we would have multiple rows per block. For this case, Customer 0000001380
has purchased products from product groups of Cola, Root Beer, Cream
Soda, Fruit Drinks, and Diet Drinks, all of which would be on a separate row,
and each of the underlying products purchased on a separate row, with only
one cell filled in the Market anchor dimension per row (see Figure 167).
Sparse dimensions: Measures, Scenario, Year, Customer.
Columns Phoenix to SanDiego: Market (rel. anchor).
Data Block    Measures  Scenario  Year    Customer  Product  Phoenix  Denver  ...  SanDiego
Data Block 1  COGS      Actual    Jan 97  00001380  100-10   -        2340    ...  -
Data Block 1  COGS      Actual    Jan 97  00001380  100-20   -        1760    ...  -
Data Block 1  COGS      Actual    Jan 97  00001380  ...      -        ...     ...  -
Data Block 1  COGS      Actual    Jan 97  00001380  400-30   -        1890    ...  -
Figure 167. Fact Table with Market As Relational Anchor Dimension
Because DB2 OLAP Server processes one block at a time, the fewer rows
per block, the less effort is required to retrieve them from the database, and
thus the better the performance. Thus, in general, choose the dense
dimension that has the most members that can fit into a row, then confirm
that the dimension will rarely change, if ever, as a change in the anchor
dimension members may require a restructure.
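The selection rule in the preceding paragraph can be sketched as a small
helper. This is an illustration only: the function name is hypothetical, the
500-column limit is the DB2 UDB V5 figure quoted earlier, and the check on how
often the dimension changes still has to be made by the designer. The member
counts are those quoted in this chapter for the expanded model (Product 61,
Market 656).

```python
MAX_COLUMNS = 500  # maximum columns per table in DB2 UDB V5, as stated in the text


def choose_anchor(dense_member_counts, max_columns=MAX_COLUMNS):
    """Pick the dense dimension with the most members that still fits in a row.

    dense_member_counts maps each dense dimension name to its member count.
    Returns None if no dense dimension has fewer members than max_columns.
    """
    candidates = {
        name: count
        for name, count in dense_member_counts.items()
        if count < max_columns
    }
    if not candidates:
        return None
    return max(candidates, key=candidates.get)


print(choose_anchor({"Product": 61, "Market": 656}))
```

With these counts Market is excluded by the column limit, which matches the
choice of Product as the relational anchor dimension in our case.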
Note that if we had defined Measures as dense and chosen it as the anchor
dimension, it would be the best dimension for loading of data and likely the
dimension that changes the least. However, the number of members in
Measures is typically small, so we would have many more rows per block to
be read during user queries.
Note
End of the section specific to IBM DB2 OLAP Server.
9.7 Tuning the Data Load
Loading of data can be significantly improved by a number of steps. We
review the expanded TBC sales model, which is shown in Figure 168.
Figure 168. The Expanded TBC Sales Model
1. Check the dense and sparse dimensions as discussed above.
2. Determine the relational anchor dimension as discussed above.
3. Ensure the outline is ordered so that dense dimensions are at the top,
followed by the sparse dimensions (Figure 169).
4. Within the sparse dimensions area, ensure that the sparse dimensions are
sequenced from smaller dimensions to larger dimensions.
5. Ensure the selected column names follow the outline dimensions.
6. Add an order-by clause to the SQL statement, ordering by sparse
dimensions, from the largest sparse dimension to the smallest sparse
dimension (Figure 170).
Figure 169. Final Expanded TBC Sales Model
Figure 170. Final SQL for Data Load
With the outline and SQL changes of steps 3, 4, 5, and 6, we were able to
reduce the load times for the expanded TBC sales model by a factor of 8!
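Steps 3, 4, and 6 can be sketched as two small functions that derive the
outline order and the ORDER BY clause from the member counts. This is a
hypothetical helper, not a DB2 OLAP Server facility. The assignment of the
counts 4, 6, 382, and 1,004 to Scenario, Measures, Year, and Customer is an
assumption inferred from the outline order given earlier (only the Customer
count is stated explicitly), and the column names are assumed to match the
dimension names.

```python
def outline_order(dense_dimensions, sparse_member_counts):
    """Dense dimensions first, then sparse dimensions from smallest to largest."""
    sparse_sorted = sorted(sparse_member_counts, key=sparse_member_counts.get)
    return list(dense_dimensions) + sparse_sorted


def order_by_clause(sparse_member_counts):
    """ORDER BY the sparse dimensions from largest to smallest."""
    columns = sorted(
        sparse_member_counts, key=sparse_member_counts.get, reverse=True
    )
    return "ORDER BY " + ", ".join(columns)


dense = ["Product", "Market"]
# Assumed mapping of member counts to the sparse dimensions.
sparse = {"Scenario": 4, "Measures": 6, "Year": 382, "Customer": 1004}

print(outline_order(dense, sparse))
print(order_by_clause(sparse))
```

In practice a sparse dimension whose members arrive as data columns rather
than as key columns would be left out of the clause.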
9.8 Tuning the Calculation
Tuning of the calculation process is a very different matter. Calculation is, in
essence, a CPU-constrained process, as it is a single-threaded process. The
calculation process is covered in great detail in Essbase Database
Administration Guide, SC26-9238.
Note
From a performance perspective, there are two distinct areas: performance
of aggregations and, as a quite separate topic, performance of formulas.
Anyone who has significant numbers of formulas in their cubes, as
compared to straight aggregations, should review the material in Essbase
Database Administration Guide, SC26-9238.
Note
In calculation performance, minimize the calculation area as much as
possible. Use specific calculation scripts to specify the cube area required
to be calculated, rather than using the default CALC ALL.
Calculation scripts are discussed in more detail in Chapter 6, “A Closer Look
at Calculating the OLAP Database” on page 135.
Specifying the cube area to be calculated minimizes the number of data
blocks and data cells to be read, aggregated, and have formulas applied to.
Also, always remember that the basic premise to start with for reviewing
calculations is that the DB2 OLAP Server calculation engine calculates every
block everywhere unless explicitly limited; that is, if Intelligent Calculation is
on, the calculation is limited to those blocks that are dirty. If you specify in a
calculation script to calculate an area, the calculation is limited to that area.
9.8.1 Reviewing the Status of a Running Calculation
You can review the calculation process, using the SET MSG SUMMARY,
SET MSG DETAIL, and SET NOTICE commands within a calculation script to
write calculation progress to the application event log. These commands can
be set only before the calculation script starts, so if there is a running
calculation, there is little that can be done to review how far it has processed
without these commands.
You can also change the formulas in the default calculation script. For our
expanded TBC sales model, we changed the default calculation script to be:
SET MSG SUMMARY;
CALC ALL;
SET MSG DETAIL would also show you the calculation order of the data
blocks. This command is very useful for reviewing calculations. Note,
however, that it has a high overhead, so it should be used only during testing
and development.
Although there is a default calculator cache size, to tune for a specific cube,
set the calculator cache size specifically in the calculation script, using the
SET CACHE command. The calculator cache is discussed in 9.5, “DB2 OLAP
Server Parameters” on page 230.
9.8.2 Defining Members as Dynamic Calc and Store
Consider having a number of members defined as Dynamic Calc, or Dynamic
Calc and Store. With Dynamic Calc, the calculation is not performed as part
of the calculation process, but rather when the user accesses the data. If the
member is defined as Dynamic Calc and Store, the first user who accesses
that data will be penalized for the dynamic calculation and store of the data,
but future users of the data will have the data immediately accessible.
An interesting twist is that only those Dynamic Calc and Store members
requested are actually stored, with any intermediate Dynamic Calc and Store
members used for the calculation being discarded. While this may sound odd,
if the intention of Dynamic Calc and Store is to reduce the amount of space
for the cube, this is quite valid. However, it does mean additional overhead
when those intermediate members are actually requested.
In our expanded TBC sales example, Customer has 1,004 possible members,
which are in three hierarchies of Institution, Retail, and Wholesale. We
changed the definition of Customer, Institution, Retail, and Wholesale
members to be Dynamic Calc. The calculation time was reduced by 40%.
If there are many Dynamic Calc members in the cube, you should use the
spreadsheet with Navigate without data, until the spreadsheet design is in the
format required. Without using this option, each time the spreadsheet design
is changed, if the displayed member has Dynamic Calc activated, the data
will be calculated at that time, even though you may not be interested in that
member specifically.
Using Dynamic Calc can be a significant performance option for calculation
performance, after reviewing the user requirements and usual use of the
cube.
In general, for dense dimension members, the use of Dynamic Calc is
beneficial. Due to the fact that the block is already in memory, Dynamic Calc
should be relatively quick. If Dynamic Calc and Store is defined, space is
allocated in the data block for the member whether or not it is used, thus
increasing the block size and thus the overall database size.
For sparse dimension members, in general, Dynamic Calc is less frequently
used.
If upper-level sparse dimension members that have a large number of
children are tagged as Dynamic Calc, a large number of blocks potentially
need to be read on the fly to perform the calculation. Obviously, you would
not want this to happen in a production environment.
When a restructure occurs, or you start the application, DB2 OLAP Server
writes a message to the event log, indicating how many possible data blocks
will need to be read for the top dynamically calculated member. (See also
6.8.1, “Dynamic Calculation Considerations” on page 152.)
9.8.3 Time and Accounts Dimension Tags
If specific time or accounts features are not used, we recommend not using
the corresponding tags either. As soon as you have dimensions tagged as
Time and Accounts, there is a specific calculation process where multiple
calculation passes may be required, depending on sparse and dense settings
and formulas.
9.8.4 Formulas
From a formula perspective, simple formulas require less time to calculate
than complex calculations. For example, a complex formula references a
member from a different dimension, uses mathematical range functions, or
uses index or financial functions. If this complex formula is on a sparse
dimension, consider using a calculation script to calculate only the required
data blocks. Otherwise, DB2 OLAP Server checks each possible, not actual,
sparse member combination to see whether it needs calculating.
Ideally, place formulas on the outline, rather than in a calculation script, to
improve performance.
9.8.5 Time As a Sparse Dimension
When loading data in every month, consider making the Time dimension a
sparse dimension. Thus, new data being loaded is put into new data blocks,
so the previous month’s blocks are not changed. Because they are not
changed, DB2 OLAP Server does not mark them dirty and thus does not
need to recalculate them.
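The effect can be illustrated with a small Python sketch. It is a toy model, not the storage engine of DB2 OLAP Server: blocks are represented as dictionary keys built from the sparse member combination, and the function name load_month and the sample markets are assumptions made for the illustration.

```python
def load_month(blocks, month, markets, time_is_sparse):
    """Load one month of data and return the set of block keys that were touched.

    blocks maps a sparse member combination (a tuple) to a dict of cell values.
    """
    touched = set()
    for market in markets:
        # With Time sparse, the month is part of the block key, so each month
        # gets its own blocks. With Time dense, the month is a cell inside the
        # single block that exists per market.
        key = (month, market) if time_is_sparse else (market,)
        blocks.setdefault(key, {})[month] = 1.0  # placeholder data value
        touched.add(key)
    return touched


markets = ["Aspen", "Denver"]

# Time as a sparse dimension: loading February creates new blocks only.
sparse_blocks = {}
jan_sparse = load_month(sparse_blocks, "Jan 97", markets, time_is_sparse=True)
feb_sparse = load_month(sparse_blocks, "Feb 97", markets, time_is_sparse=True)
rewritten_sparse = jan_sparse & feb_sparse  # January blocks touched again

# Time as a dense dimension: loading February rewrites the existing blocks.
dense_blocks = {}
jan_dense = load_month(dense_blocks, "Jan 97", markets, time_is_sparse=False)
feb_dense = load_month(dense_blocks, "Feb 97", markets, time_is_sparse=False)
rewritten_dense = jan_dense & feb_dense

print(len(rewritten_sparse), len(rewritten_dense))
```

In this model no January block is touched by the February load when Time is sparse, whereas every existing block is touched when Time is dense.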
9.8.6 Large Database Outlines
When dealing with a dimension such as Customer, where there are likely to
be thousands of members and very few parents, Customer would then be
considered a flat dimension. When this occurs, see whether intermediate
levels can be introduced to create more parents in the dimension.
Additionally, use a calculation script with the SET CALCHASHTBL command,
to set a calculator hash table to be used.
Another setting to consider for optimizing formulas on sparse dimensions in
large database outlines is to turn on the bottom-up sparse formula calculation
method, using the SET FRMLBOTTOMUP calculation command with a
calculation script.
9.8.7 Intelligent Calculation
Intelligent Calculation is active by default with DB2 OLAP Server. When a
calculation is performed, the calculation engine reviews the blocks to see
which blocks are dirty and then calculates the dirty blocks. There may be
benefit in turning Intelligent Calculation checking off, for example, if you can
identify which cube area you need to calculate, or when you have updated
more than 80% of the cube’s data. Turning Intelligent Calculation off can be
done through a calculation script with the SET UPDATECALC command.
Note that this has other implications in terms of marking the blocks as clean
after the calculation has completed (see 6.7, “Intelligent Calculation” on page
150).
If you want to calculate the whole cube, irrespective of whether blocks are
marked clean or dirty, you must turn off Intelligent Calculation.
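As a rough illustration of the clean/dirty selection described above, the following Python sketch models blocks as a dictionary of flags. It is a simplified model under our own assumptions (the function name blocks_to_calculate and the sample blocks are hypothetical) and does not reproduce the calculation engine itself.

```python
def blocks_to_calculate(blocks, intelligent_calc=True):
    """Return the names of the blocks a calculation pass would visit.

    blocks maps a block name to True (dirty) or False (clean).
    """
    if intelligent_calc:
        return [name for name, dirty in blocks.items() if dirty]
    return list(blocks)  # every block, regardless of its clean/dirty flag


blocks = {
    "Jan 97/Aspen": False,
    "Feb 97/Aspen": False,
    "Mar 97/Aspen": True,   # changed since the last calculation
}

print(blocks_to_calculate(blocks))
print(blocks_to_calculate(blocks, intelligent_calc=False))
```

When most blocks are dirty anyway, the per-block check buys little, which is the situation in which turning the checking off may pay.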
9.8.8 Cross-Dimensional Operators
From a performance perspective, cross-dimensional operators (->), which are
often used in calculation scripts to identify a specific area, should be used
with caution under some circumstances, such as on the left-hand side of an
equation and in equations on a dense dimension.
9.8.9 Running REORG and RUNSTATS
Ideally, after the data load and calculation have been completed, DB2 REORG
followed by RUNSTATS is performed on the fact and key tables. If all the
steps described in 9.7, “Tuning the Data Load” on page 236 have been
followed, the actual clustering should be reasonably good.
In our specific example, after loading three separate months of data in three
independent load and calculation processes, without performing any REORG
and RUNSTATS, the cluster ratios were 91% and 79% for the fact and key
tables, respectively. Thus, performance would definitely benefit from REORG
and RUNSTATS.
9.9 Block and Cell Calculation Order
If data is never loaded at parent levels, consider setting DB2 OLAP Server to
aggregate #MISSING values. By default, it does not aggregate #MISSING
values. This can be set in a calculation script with the SET AGGMISSG
command, or in the Database => Settings panel.
The scenario below shows how the SET AGGMISSG setting affects calculation
performance.
We review at this point how DB2 OLAP Server calculates the data blocks and
cells. As discussed in 6.5, “Outline Calculations” on page 146, the Accounts
and Time dimensions are followed by other dense dimensions in the order in
which they appear in the outline, and then they are followed by sparse
dimensions in the order in which they appear in the outline.
For Dynamic Calc members, this calculation order is different. It is the sparse
dimensions first, followed by the dense dimensions. This is optimal. If we
consider Qtr1 97 and Product 100 as Dynamic Calc, with sparse dimensions
first, the data blocks for Jan, Feb, and Mar are read first, then a single
dynamic calculation is performed on Product 100. If dense dimensions were
first, there would have to be three independent dynamic calculations on Jan,
Feb, and Mar, before Qtr1 97 could be calculated. This would be a lot slower.
If we review the outline shown in Figure 171, where Year and Market are
sparse dimensions, and Product and Measures are dense dimensions, DB2
OLAP Server will calculate the Product dimension first, followed by the
Measures dimension.
Figure 171. Calculation Order for an Outline
Each possible combination of sparse members is given a block number,
based on the calculation order of members in the outline. If there is no actual
data loaded for the sparse combination, the block number is reserved for later
use, in case the data is loaded later. Note that blocks are therefore
renumbered in several circumstances, such as when you move, add, or
delete a member in a sparse dimension.
Thus, each member of the Year dimension is calculated, while keeping the
Market member fixed, then the next cycle of the Year dimension is performed
for the next member in Market; that is, Jan 97/Aspen, Feb 97/Aspen, Mar 97/
Aspen, Qtr1 97/Aspen, Apr 97/Aspen ... Year 97/Aspen, Jan 97/Denver, Feb
97/Denver... and so on.
Within the dense data block itself, the cells are then calculated again
according to the outline order. Thus, Product dimension members are
calculated first followed by the Measures dimension members. So for our
example the calculation sequence would be 100/Quantity, 200/Quantity,
300/Quantity, 400/Quantity, Diet/Quantity, Product/Quantity...100/Sales,
200/Sales... and so on.
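Both sequences follow the same pattern: the dimension that appears first in the outline cycles fastest. The Python sketch below reproduces that ordering for abbreviated member lists; the lists are shortened for the illustration and the helper name calc_order is an assumption, so this is a picture of the ordering rule rather than of the actual outline in Figure 171.

```python
from itertools import product


def calc_order(first_dim, second_dim):
    """Return member combinations with the first outline dimension varying fastest."""
    # itertools.product varies its last argument fastest, so the second
    # dimension is passed first and each pair is flipped back afterwards.
    return [f"{a}/{b}" for b, a in product(second_dim, first_dim)]


year = ["Jan 97", "Feb 97", "Mar 97", "Qtr1 97"]            # abbreviated Year members
market = ["Aspen", "Denver"]                                 # abbreviated Market members
products = ["100", "200", "300", "400", "Diet", "Product"]   # Product members
measures = ["Quantity", "Sales"]                             # abbreviated Measures members

block_order = calc_order(year, market)       # sparse combinations (blocks)
cell_order = calc_order(products, measures)  # dense combinations (cells in a block)

print(block_order[:5])
print(cell_order[:7])
```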
Note that when we get to cells in the data block where both members are
upper levels such as 100/Measures, calculation is possible either from the
Product or from the Measures calculation. In this case, if SET AGGMISSG is
off, which is the default, this cell value will be calculated twice, once for the
Product calculation and once for the Measures calculation. To avoid this double
calculation, we can set SET AGGMISSG to on; DB2 OLAP Server then
calculates the cell only once, at the last calculation, which is Measures.
So far we have discussed the calculation order of Level 0 blocks. However,
another point of interest is how the upper level blocks are created. An
upper-level block is a dense data block that has at least one parent member
in the sparse member combination for the block. For example, Qtr1 97/Aspen
is an upper-level block, with Qtr1 97 being a parent member.
In this case, the upper-level block is calculated along the parent member's
dimension, so that Jan 97/Aspen, Feb 97/Aspen, and Mar
97/Aspen are used to calculate the Qtr1 97/Aspen upper-level block.
However, if the sparse member combination has more than one parent, DB2
OLAP Server chooses the parent member’s dimension that is last in the
calculation order. For example, Qtr1 97/Colorado has both Qtr1 97 and
Colorado as parent members. Because Year occurs before Market in the
outline in sparse dimensions, the Market dimension will be used to
aggregate.
The choice of the calculation path may affect the calculation time for
multiparent sparse combinations. For example, Qtr1 97 has three children;
thus if Year is the chosen dimension, three data blocks of Jan 97/Colorado,
Feb 97/Colorado, and Mar 97/Colorado will need to be read. However, if
Market is the dimension chosen, three data blocks for Qtr1 97/Aspen, Qtr1
97/Denver, and Qtr1 97/Grand Junction will be read. So the number of
blocks is the same in this specific case.
However, if we review Qtr1 97/New York, and Market is the chosen
dimension, eight data blocks rather than the Year’s three data blocks would
have to be read. Although this is a small case, there may be cases where, for
performance reasons, a calculation script may be more appropriate, in order
to specify the explicit calculation path.
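The comparison of calculation paths amounts to counting the children of each parent member in the sparse combination. A minimal Python sketch of that count follows; the function name blocks_to_read is hypothetical, and because the text gives only the number of New York children, their names are placeholders.

```python
def blocks_to_read(combination, children):
    """For each parent member in a sparse combination, count the child blocks
    that would be read if that member's dimension were used to aggregate."""
    counts = {}
    for member in combination:
        kids = children.get(member, [])
        if kids:  # only parent members offer a calculation path
            counts[member] = len(kids)
    return counts


children = {
    "Qtr1 97": ["Jan 97", "Feb 97", "Mar 97"],
    "Colorado": ["Aspen", "Denver", "Grand Junction"],
    # The text states only that New York has eight children; names are placeholders.
    "New York": [f"NY market {i}" for i in range(1, 9)],
}

print(blocks_to_read(("Qtr1 97", "Colorado"), children))
print(blocks_to_read(("Qtr1 97", "New York"), children))
```

For Qtr1 97/Colorado both paths read three blocks; for Qtr1 97/New York the Market path reads eight blocks against three for the Year path.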
9.10 Review of the Database Information for a Cube
DB2 OLAP Server provides both statistics and run-time information for the
cube (Database => Information => Statistics and Database =>
Information => Run-time), which can be useful in reviewing block size and
cache size performance.
As you can see in Figure 172, the anchor dimension is Product, which is
dimension 1. The data cache size allows for 17 blocks, and we had a
maximum of 17 blocks in the last load or calculation of data, with a block
cache hit rate of 18%. Thus, if we had more storage available, an option
would be to increase the data cache size.
Figure 172. Reviewing Run-Time Information for the Expanded TBC Cube
From Figure 173, you can see that the key cache hit rate is 1, the high water
number of keys cached is 7,389, and the key cache size will allow for 17,476
keys to be cached. Thus, the key cache size could actually be reduced if we
have a storage constraint.
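The reasoning behind both observations is a comparison of the high-water mark with the cache capacity. The following Python lines work through the numbers quoted above; the helper name utilization is our own, and the figures are those read from the run-time information panels.

```python
def utilization(high_water, capacity):
    """Fraction of a cache's capacity that was actually used at its peak."""
    return high_water / capacity


key_cache = utilization(7_389, 17_476)   # keys cached vs. keys the cache can hold
data_cache = utilization(17, 17)         # blocks cached vs. blocks the cache can hold

print(f"key cache peak utilization:  {key_cache:.0%}")
print(f"data cache peak utilization: {data_cache:.0%}")
```

A cache that is full and has a low hit rate (the data cache here) is a candidate for enlargement, while a cache that is less than half used at its peak and still has a hit rate of 1 (the key cache here) leaves room to be reduced.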
Figure 173. Reviewing Run-Time Information for Expanded TBC Cube (continued)
In reviewing Figure 174, you can see that the number of values per row in the
fact table, in other words, for our anchor dimension of Product, is 58 columns.
The maximum number of rows per block in the fact table is 382. This number
is the count of stored members in the dense Market dimension.
Figure 174. Reviewing Run-Time Information for Expanded TBC Cube (continued)
9.11 Using Partitions to Improve Performance
One method of reducing the elapsed time for calculation is to design the cube
with transparent partitions. With transparent partitions, one partition can be
on the local server, and the other partition can be on another server.
Alternatively, with an SMP server, both partitions could be on the same
server, enabling use of at least two processors for the calculation. With this
environment, each cube can be loaded separately and calculated separately
and in parallel, thus reducing the elapsed time for load and calculation.
Another method, in cases where the main cube is very large, is to create an
additional cube where the new data for the current period (for example, month)
is loaded and calculated. When this smaller cube is up-to-date, all cells are
exported from the cube and loaded into the master cube. After the load, a
calculation script can be used to calculate only the Time dimension of the
master cube. This operation results in a completely updated master cube.
When a user queries the target cube, however, the queries will take longer as
the data needs to be retrieved not only from the target cube but also from the
source cubes that are a source for the partitions. If this is the case, there may
be reason to replicate the partition as discussed in Chapter 7, “Partitioning
Multidimensional Databases” on page 161.
From a user perspective, for performance, replicated partitions can be used
to move the cubes closer to the users to avoid network traffic. Alternatively, if
each department would like just its own departmental data, then a master
cube for the corporation can be replicated to multiple departmental cubes.
When you have decided on what type of partitioning design you will use,
there are a number of performance considerations to take into account,
specifically for transparent and replicated partitions.
Linked Partitions
Linked partitions are totally independent, so there are no specific partition
design performance considerations for linked partitions.
Transparent Partitions
Consider the following performance-related topics for transparent partitions:
• Partitioning along a dense dimension can produce very slow performance,
especially for calculation. Thus, ensure that transparent partitions are on a
sparse dimension.
• Consider using Dynamic Calc or Dynamic Calc and Store as parents of the
transparent data. Although this will reduce the batch calculation time, any
time a cell is accessed it will be dynamically calculated.
• For optimizing the calculation on the data target, consider a replicated
layer between the low-level transparent data and the high-level local data.
• Load data into the data source, rather than through the data target.
For the first point explicitly, in our example using a TBC_East partition and a
TBC_West partition, where TBC_West is our target partition and includes
East, with Market as a dense dimension, the calculation for TBC_West for a
relatively small cube appeared to be extremely slow.
We turned on SET NOTICE HIGH in the calculation script, so that when every
1% of possible blocks was calculated, a message was written. With this, we
could then see that there was actually work going on.
When we changed Market to be a sparse dimension, the calculation of the
TBC_West partition was reduced by 90%, a very dramatic difference.
Replicated Partitions
Consider the following performance-related topics for replicated partitions:
• Do not replicate members that are dynamically calculated.
• If there is derived data in the target partition, consider replicating the
lowest level of each dimension and performing a calculation at the target.
• Partition along a sparse dimension, not a dense dimension.
• Ensure that any target members that have Dynamic Calc or Dynamic Calc
and Store are not included in the partition area.
• Replicate only changed cells, not all cells, to ensure minimal traffic across
the network.
Chapter 10. Problem Determination
Because of the complexity of a Business Intelligence environment, in terms of
the many different components that work together, identifying problems
requires a well-structured process. In this chapter we give an overview of the
most important information sources for problem determination.
In general, if an error occurs when using DB2 UDB, Visual Warehouse,
and/or DB2 OLAP Server we have to find out exactly where the problem
originated. If we get an error message, we have to check all of the dependent
components, unless the message points directly to the source of the error.
10.1 Windows NT
In the Windows NT operating system environment, check that:
• Windows NT starts correctly, that is, there were no error or warning
messages during Windows NT startup.
• The user ID is known by Windows NT.
• The TCP/IP configuration parameters are correct.
• ODBC works correctly, that is:
• The ODBC version used is supported by DB2 UDB, Visual Warehouse,
and DB2 OLAP Server.
• Only one version of ODBC is currently available on the system.
The checks can be done by reviewing the following information:
Services
Select Start =>Settings =>Control Panel =>Services.
Check whether all services needed are actually started (for example, DB2,
DB2 administration server, DB2 security server, Visual Warehouse logger,
Visual Warehouse server, Visual Warehouse agent daemon).
Event Viewer
Select Start =>Programs =>Administrative Tools =>Event Viewer.
Select Log =>System, Log =>Security, and Log =>Application to check
whether any error occurred.
If an error or warning occurred, click the icon to show the detailed
information. The Windows NT event viewer is a good starting point for
figuring out which component caused the error.
User ID
On the server select Start =>Programs =>Administrative Tools =>User
Manager.
Check that the user ID to be used is defined and has the proper access
permissions.
Test Connection with Ping
Select Start =>Programs =>Command Prompt.
Type ping followed by the server name or its TCP/IP address to:
• Ping the database server.
• Ping the Visual Warehouse server (if it is different).
• Ping DB2 OLAP Server (if it is different).
• Ping the host system(s).
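If several servers have to be checked regularly, the ping test can be scripted. The following Python sketch is one possible way to do so; the function names are our own, and any host name passed to them is a placeholder for your environment. It relies only on the standard ping command, whose packet-count flag is -n on Windows and -c on most other systems.

```python
import platform
import subprocess


def ping_command(host, system=None):
    """Build a one-packet ping command for the given operating system."""
    system = system or platform.system()
    count_flag = "-n" if system == "Windows" else "-c"
    return ["ping", count_flag, "1", host]


def is_reachable(host):
    """Return True if a single ping to the host succeeds."""
    result = subprocess.run(ping_command(host), capture_output=True)
    return result.returncode == 0


# "dbserver" is a hypothetical host name used only to show the command built.
print(ping_command("dbserver", "Windows"))
```

Calling is_reachable for the database server, the Visual Warehouse server, the DB2 OLAP Server machine, and the host systems in turn gives a quick overview of which connections fail.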
Network Configuration
Select Start =>Settings =>Control Panel =>Network.
Check that all entries are correct.
Note
You can find the corresponding names and/or addresses of these servers
in the network definition on the specific server, or ask the network
administrator.
ODBC Version
To check the ODBC version, select Start =>Settings =>Control Panel
=>ODBC =>ODBC Drivers.
10.2 DB2 Universal Database
For DB2 UDB check for:
• Successful start of DB2 services
• DB2 environment variable DB2COMM
• Database status and access privileges
• DB2 log for common database problems
• DB2 trace for special problems and detailed information
• DB2 client configuration (TCP/IP)
Check the host name and/or IP address of the remote server(s).
The connection port number for the client must be the same as the port
number that the SVCNAME parameter maps to in the services file at the
server. (The SVCNAME parameter is located in the database manager
configuration file on the DB2 server.) This value must not be in use by any
other application, and it must be unique within the services file.
• DB2 database catalog definitions (especially for remote nodes and
databases)
The checks can be done by reviewing the following information:
DB2 Services
Select Start =>Settings =>Control Panel =>Services.
Check whether all services started correctly:
• DB2
• DB2DAS00
• Governor (only if used)
• JDBC applet server (if you use Java applets)
• DB2 security server (if you use an authentication server)
Event Viewer
Select Start =>Programs =>Administrative Tools =>Event Viewer.
Select Log =>System, Log =>Security, and Log =>Application to check
whether any error occurred.
If an error or warning occurred, click the icon to show the detailed
information. If an SQL return code is provided, details can be found in the
DB2 Information Center.
Select Start =>Programs =>DB2 for Windows NT =>Information Center.
Then select Troubleshooting to retrieve detailed information about DB2
error messages or warnings.
DB2 Environment Variables
Select Start =>Settings =>Control Panel =>System, and select
Environment.
Check whether the required variables are present, for example,
DB2COMM=TCP/IP.
Database Status and Access Privileges
The access privileges of the user ID used for the connection can be
examined and/or changed in the following way:
Select Start =>Programs =>DB2 for Windows NT =>Administration Tools
=>Control Center.
Select the database in the tree view.
Click the right mouse button and select Authorities... to review or change
users and their database privileges.
To check whether a database is in a consistent state, the following steps are
necessary:
Select Start =>Programs =>DB2 for Windows NT =>Administration Tools
=>Control Center.
Select the database in the tree view.
Click the right mouse button and select Configure....
Select the Status panel.
Select the Database is consistent parameter.
The value of this parameter indicates whether the database is consistent or
not.
If the database is inconsistent, close the Configure Database window and
disconnect all users and applications.
Then select the database in the tree view.
Click the right mouse button and select Restart.
DB2 now creates a recovery job and runs it automatically. After the job ends,
the database should be in a consistent state.
DB2 Log
The DB2DIAG.LOG is, by default, located in directory x:\SQLLIB\DB2 (where
x is the installation drive letter). The file is a plain-text file that can be
reviewed with any text editor (for example, the Notepad application).
DB2 Trace
To start the trace, select Start =>Programs =>DB2 for Windows NT
=>Problem Determination =>Trace. Before attempting to use the trace
utility, read the Help information by clicking the Help button.
DB2 Client Configuration (TCP/IP)
To check the connection port, open the Services file stored under
winnt\system32\drivers\etc, using a text editor (such as the Notepad
application). The Services file should contain the following entries:
db2cDB2 50000/tcp # Connection port for DB2 instance DB2
db2iDB2 50001/tcp # Interrupt port for DB2 instance DB2
where db2cDB2 is the connection service name, 50000 is the port number for
the connection port, tcp is the communication protocol, and the string starting
with # is a comment describing the entry.
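The entry format just described (service name, port/protocol, optional comment) is simple enough to check with a script. The Python sketch below parses entries of that form from a string; the function name parse_services is an assumption, and the sample text repeats the two entries shown above rather than reading a real Services file.

```python
def parse_services(text):
    """Map each service name in a Services file to a (port, protocol) pair."""
    entries = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()   # drop the trailing comment
        if not line:
            continue
        fields = line.split()
        if len(fields) < 2 or "/" not in fields[1]:
            continue                            # not a "name port/protocol" entry
        port, protocol = fields[1].split("/", 1)
        entries[fields[0]] = (int(port), protocol)
    return entries


sample = """\
db2cDB2 50000/tcp # Connection port for DB2 instance DB2
db2iDB2 50001/tcp # Interrupt port for DB2 instance DB2
"""

services = parse_services(sample)
print(services["db2cDB2"])
```

Comparing the port found for the connection service name on the client with the one found on the server shows quickly whether the two definitions agree.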
To review the host names, open the Hosts file stored under
winnt\system32\drivers\etc, using a text editor. In the Hosts file the entries
have the following format:
100.100.100.100 serverhost # comment
where 100.100.100.100 is the IP address, serverhost is the host name, and the
string starting with # is a comment describing the entry.
DB2 Directory Definitions
The DB2 directory contains information about the remote nodes and
databases cataloged. To review this information, use the following
commands:
DB2 LIST NODE DIRECTORY
DB2 LIST DATABASE DIRECTORY
DB2 LIST DCS DIRECTORY
You can also review the information from the DB2 Control Center.
10.3 Visual Warehouse
For the diagnosis of problems in the Visual Warehouse environment, the
following information is relevant and should be checked:
• Correct start of Visual Warehouse services
• Visual Warehouse control, source, and target databases are registered as
ODBC system DSN and the appropriate database access privileges for
the corresponding user ID exist.
• Visual Warehouse logs (basic logging, component trace, VWP and
Transformer logs, startup error logs)
• Correct user ID and password (case sensitive)
• Correct Visual Warehouse groups and privileges
• ODBC: The SQL statements you want to use work with the current ODBC
driver, and the current ODBC version is supported by Visual Warehouse.
The checks can be done by reviewing the following information:
Visual Warehouse Service
Select Start =>Settings =>Control Panel =>Services.
Check whether all services you need are started:
• Visual Warehouse server
• Visual Warehouse logger
• Visual Warehouse agent daemon(s)
If an error occurred, check the Event Viewer.
Event Viewer
Select Start =>Programs =>Administrative Tools =>Event Viewer.
Select Log =>System, Log =>Security, and Log =>Application to check
whether any error occurred.
Select Start =>Programs =>Visual Warehouse =>Messages and Reason
Codes to get detailed information about Visual Warehouse error codes
mentioned in the log files.
Visual Warehouse Control, Source, and Target Databases
Select Start =>Settings =>Control Panel =>ODBC, select System DSN,
and check whether the database is registered.
Visual Warehouse Basic Logging
Visual Warehouse basic logging is always active. The information is stored in
the Visual Warehouse control database and can be reviewed with the Visual
Warehouse log viewer.
Select Start =>Programs =>Visual Warehouse =>Visual Warehouse
=>Visual Warehouse Desktop.
The Visual Warehouse Logon screen shows up. Enter the administration user
ID and password and click OK.
Select Operations =>Log for build time information, or select Operations =>
Work in Progress, select a Business View, and click Log for run-time
information.
Note
The Visual Warehouse log and the Event Viewer are similar. The Visual
Warehouse log is more detailed, but you can find all important messages in
both.
Visual Warehouse Component Trace
Visual Warehouse component traces provide information about the Visual
Warehouse server (in IWH2SERV.LOG), Visual Warehouse logger (in
IWH2LOG.LOG), the Visual Warehouse control component (in
IWH2EOLE.LOG), and the Visual Warehouse agents (in AGNTnnnn.LOG).
The files are written to the directory specified in the VWS_LOGGING
environment variable.
Note
The Visual Warehouse component trace should be activated only when
requested by IBM Support to obtain additional information for Visual
Warehouse problem determination and problem source identification.
The information provided by this trace is of limited use in assisting in
user error identification.
Visual Warehouse Program and Transformer Error Logs
The supplied VWPs and Transformers also write error logs to the directory
specified in the VWS_LOGGING environment variable.
A log table can be specified for VWPs on the program page of the Business
View window.
Startup Error Trace
Problems during the startup of Visual Warehouse components are written to
IWH2LOGC.LOG, IWH2LOG.LOG, or IWH2SERV.LOG in the directory
specified in the VWS_LOGGING environment variable.
User ID and Password
Select Start =>Programs =>Visual Warehouse =>Visual Warehouse
=>Visual Warehouse Desktop.
The Visual Warehouse Logon screen shows up. Enter the administration user
ID and password and click OK.
Select Security =>Users to check the available user IDs and their
corresponding security groups.
Visual Warehouse Groups and Privileges
Select Start =>Programs =>Visual Warehouse =>Visual Warehouse
=>Visual Warehouse Desktop.
The Visual Warehouse Logon screen shows up. Enter the administration user
ID and password and click OK.
Select Security =>Groups to check the security groups, the users mapped to
the groups, and their privileges for information sources, targets, and subjects.
ODBC
To activate an ODBC trace, select Start =>Settings =>Control Panel
=>ODBC =>Tracing.
Select the log file path, for example, C:\TEMP\sql.log.
Click Start Tracing Now.
To trace the ODBC calls to the control database, the trace must be started at
the Visual Warehouse Server site.
Note
All SQL commands to all databases using ODBC are traced. Therefore,
turn off the trace as soon as possible!
To stop ODBC tracing, select Start =>Settings =>Control Panel =>ODBC
=>Tracing.
Click Stop Tracing Now.
10.4 DB2 OLAP Server
In the DB2 OLAP Server environment, check that:
• DB2 OLAP Server is started, that is, the DB2 OLAP Server task is running.
• The individual tasks for the active applications are running.
• DB2 OLAP Server has a connection to the database.
Several log files document the current state and history of DB2 OLAP Server:
• Server log file
The server log file can be found in \ESSBASE\ESSBASE.LOG. It records all
server-related events common to all applications of that server. It can be
viewed externally from a text editor or from within the Application Manager
window (select Server=>View Event Log).
• Application log file
All application-related activities, including all data access events, are
recorded in an application log file dedicated to each application. The
application log file can be found at
\ESSBASE\APP\application_name\application_name.LOG within the DB2
OLAP Server directory structure. These log files show, for example,
problems related to accessing the DB2 database. SQL codes, SQL states,
and SQL error messages are indicated there.
The log files also contain pointers to additional exception error log files
that are created by the exception handler. The location of these files
depends on the component in error (for example, \ESSBASE\APP\
application_name\LOG00001.XCP).
• Trace file
On rare occasions, application log files may contain a pointer to a trace file
( .TRC). Trace files are written to a directory specified using the
ARBORDUMPPATH environment variable. If this environment variable is not set,
DB2 OLAP Server will not produce trace files.
• Outline change log file
To record changes in the outline of DB2 OLAP Server, set the
OUTLINECHANGELOG parameter to TRUE in the ESSBASE.CFG file. For a discussion
of the contents of this log file, refer to 8.2, “Changing the Outline with
Dynamic Dimension Build” on page 214.
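When collecting diagnostic material, it can help to derive the server and application log locations from the directory layout given above. The Python sketch below does this with Windows-style paths; the function name log_paths, the drive letter, and the application name TBC are example values chosen for the illustration, not defaults of the product.

```python
from pathlib import PureWindowsPath


def log_paths(essbase_root, application_name):
    """Return the server and application log locations described in the text."""
    root = PureWindowsPath(essbase_root)
    return {
        "server": root / "ESSBASE.LOG",
        "application": root / "APP" / application_name / f"{application_name}.LOG",
    }


# "C:\\ESSBASE" and "TBC" are example values, not defaults.
paths = log_paths("C:\\ESSBASE", "TBC")
for name, path in paths.items():
    print(f"{name}: {path}")
```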
For additional details about the diagnostic information available for DB2
OLAP Server or Essbase, see Chapter 45, "Monitoring Performance Using
Diagnostics," in the Essbase Database Administrator’s Guide Volume II,
SC26-9286.
10.5 Other Components
A problem could also originate in one of the other components involved in the
Business Intelligence solution, such as:
• The intranet environment (including the IP network, the Web server, and
the firewall or IBM Net.Data)
• The front-end tool used for analysis (for example, Hyperion Wired for
OLAP or Cognos PowerPlay)
A description of all of the diagnostic information that these components
provide is beyond the scope of this book. Refer to the troubleshooting
sections of the corresponding product documentation to find out more about
what diagnostic information is provided and how you can access it.
Chapter 11. Security
Security is a critical and essential topic in a Business Intelligence
environment. It is often given short shrift, however, during the design and
build phase of the Business Intelligence solution, primarily because the
solution is intended to provide easy access to enterprisewide information and
the ability to freely combine and compare information from different lines of
business. Nevertheless, the information derived from the Business
Intelligence solution is a valuable asset that makes deficits in performance
and business processes visible and delivers competitive advantage. Thus
who has access to which information must be carefully considered.
Designing a comprehensive security solution can become quite complex,
especially if the Business Intelligence solution has to be integrated with the
company’s intranet or extranet environment.
Figure 175 provides an overview of the typical security layers along the
access path of a business user accessing the OLAP data mart environment
from a Web browser.
[Figure 175 shows the access path Web Browser, Firewall, Web Server, Analysis Application Server, OLAP Server, Database Server, with the security layers Internet Security, Analysis Application Security, OLAP Server Security, and Database Security.]
Figure 175. A Typical Business Intelligence Security Architecture
Considering the many components involved in the solution, it is very unlikely
that security can be managed from a single point of control. A security
strategy is needed to keep the administration and maintenance effort
reasonably low, without risking unauthorized access to, or accidental loss
of, the information assets of the company.
The security strategy has to consider the following topics:
• Geography of the company
• Organizational structure of the company
• Number of business users
• Number of administrative users
• Number of external users (for example, suppliers, customers)
• Centralized or decentralized security management
• Information ownership
• Need to know
• Different access paths to the information
• Security capabilities of the software components used
11.1 Security Layers of the OLAP Data Mart
According to Figure 175, the following security layers can be identified, when
using, for example, DB2 UDB, Visual Warehouse, DB2 OLAP Server, and
Wired for OLAP Web clients:
The first security layer is the Internet security layer. In general the following
security options are available, depending on the Internet server, the firewall,
and the Web browser capabilities:
1. No security
2. User ID / password
3. Certificates
4. Combination of 2 and 3
Firewalls are used to isolate the external (insecure) network from the internal
network. They come in two primary flavors:
• Packet filters
• Application-layer gateways (or proxies)
Packet filters use an access control list to allow or deny packet delivery to
inside hosts based on source and destination addresses in the packet.
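As a rough illustration of the access control list idea, the following Python sketch checks a packet's source and destination addresses against an ordered list of rules. The rule layout, the first-match-wins ordering, the default of denying unmatched packets, and all addresses are assumptions made for this example; real packet filters differ in detail.

```python
from ipaddress import ip_address, ip_network

# Hypothetical ACL entries: (action, source network, destination network).
# Rules are checked top to bottom; the first matching rule decides.
ACL = [
    ("allow", ip_network("0.0.0.0/0"), ip_network("192.0.2.10/32")),  # one reachable server
    ("deny", ip_network("0.0.0.0/0"), ip_network("192.0.2.0/24")),    # rest of the inside network
]


def filter_packet(src: str, dst: str) -> str:
    """Return 'allow' or 'deny' for a packet; unmatched packets are denied."""
    source, destination = ip_address(src), ip_address(dst)
    for action, src_net, dst_net in ACL:
        if source in src_net and destination in dst_net:
            return action
    return "deny"
```

With these sample rules, an outside host can reach 192.0.2.10 but no other inside host.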
Application-layer gateways are also known as proxies. A proxy is a server
that is interposed between an internal client and an outside service when the
client attempts to make an outside connection. For example, an HTTP proxy
server accepts connections from internal clients that are directed toward the
outside host's port 80. It then makes the connection to the outside network
itself and relays the response to the client. This insulates the client and hides
the details about the internal network from the outside service.
A detailed description of all of the options and issues in the area of Internet
Security is beyond the scope of this book. To learn more about this topic,
refer to the Redbooks Internet Security in the Network Computing
Framework, SG24-5220, and OS/390 Security Server Enhancements,
SG24-5158.
The next security layer is the analysis application security layer. As an
example, we describe the security layer of Wired for OLAP. After the user has
linked to the Wired for OLAP Java application, he or she is required to log on.
The user identification for Wired is defined in the Wired Administrator - User
and Connection Management. There, users are mapped to the corresponding
DB2 OLAP Server databases to which they have access. Three policies for
connecting the user to a database are available:
1. Use Wired logon information for the database logon
2. Prompt the user
3. Define a fixed user ID and password to log on to the DB2 OLAP Server
database
We recommend using policy 3. With this option the number of user IDs that
have to be administered in the database environment can be kept to the
necessary minimum, and individual users are prevented from accessing the
database directly.
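The effect of the three policies can be pictured with a small Python sketch. It is only a model of the decision, not Wired for OLAP code; the function name, the parameters, and the sample IDs are hypothetical.

```python
from typing import Callable, Tuple

Credentials = Tuple[str, str]  # (user ID, password)


def database_credentials(policy: int,
                         wired_login: Credentials,
                         fixed_login: Credentials,
                         prompt: Callable[[], Credentials]) -> Credentials:
    """Pick the user ID and password used for the OLAP database logon."""
    if policy == 1:   # reuse the Wired logon information
        return wired_login
    if policy == 2:   # ask the user at connection time
        return prompt()
    if policy == 3:   # one fixed ID defined by the administrator
        return fixed_login
    raise ValueError(f"unknown policy: {policy}")
```

Under policy 3 every Wired user reaches the database through the same fixed ID, which is why only that ID has to be defined in the database environment.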
Additionally it is possible to predefine an entry point into the OLAP application
for each user within Wired for OLAP. Therefore the user can be connected
directly to a specific view or corporate report group after logging on.
The user ID specified within Wired to access the DB2 OLAP Server database
has to be defined to the next security layer, which is the DB2 OLAP Server or
Hyperion Essbase security layer. The security definitions are administered
from the Application Manager, where users can be defined and associated
with groups, DB2 OLAP Server applications, and databases. For applications
the type of access can be:
• None
• Access to databases
• Access to Application Designer
For databases the following access categories can be specified:
• No access
• Filter access (to control access on the cell level)
• Read-only access
• Read/write access
• Access to calculation or calculation scripts
• Database Designer access
For a more detailed description of the security capabilities of DB2 OLAP
Server, refer to Chapters 16, 17, and 18 in the Essbase Database
Administrator’s Guide, SC26-9238.
The next security layer, the database security layer of DB2, is reached when
DB2 OLAP Server accesses the relational data store. DB2 OLAP Server
accesses the relational database with a single, predefined user ID.
For users who access the DB2 OLAP Server star-schema with SQL, access
is controlled by the full DB2 database security mechanism (SQL GRANT /
REVOKE privileges).
Remember that DB2 UDB only supports user IDs with a maximum length of 8
characters; the password can be up to 10 characters. Both are case
sensitive.
Note that most of the components rely on the user IDs also being defined to
the operating system.
For DB2 the user or group must be a member of any Windows NT local
domain (global domains are not supported by DB2).
For a detailed discussion of DB2 and Windows NT related security, refer to
the Redbook DB2 Meets Windows NT, SG24-4893.
11.2 Visual Warehouse Security
From a security perspective, three key Visual Warehouse concepts must be
understood:
• Access to Visual Warehouse functionality requires Visual Warehouse user
IDs and groups, which can be defined in Visual Warehouse only. Visual
Warehouse privileges are then granted to Visual Warehouse groups.
There are no Visual Warehouse privileges granted explicitly to users.
• Access to information sources, target databases, and flat files is through
user IDs and passwords as defined in Visual Warehouse for the specific
databases and flat files. These user IDs and passwords must be defined
inside Visual Warehouse and outside as operating system user IDs and
passwords at their respective sites.
• Access by users to perform analysis and reporting on the Visual
Warehouse target warehouse is defined outside Visual Warehouse
security, by DB2 security. These user IDs and passwords must be defined
as operating system user IDs and passwords at the target warehouse site
and be granted DB2 connect and select access to the specific target
database.
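The connect and select access mentioned in the last point is granted with DB2 GRANT statements. The following Python sketch only builds the statement text; the user ID ANALYST1 and the table name are placeholders, and how the statements are submitted to DB2 is left open.

```python
from typing import List


def grant_statements(user_id: str, tables: List[str]) -> List[str]:
    """Build DB2 GRANT statements giving a user connect and select access."""
    statements = [f"GRANT CONNECT ON DATABASE TO USER {user_id}"]
    for table in tables:
        statements.append(f"GRANT SELECT ON TABLE {table} TO USER {user_id}")
    return statements


# ANALYST1 and the table name are placeholders for this example.
for statement in grant_statements("ANALYST1", ["IWH.ALL_CUSTOMERS"]):
    print(statement)
```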
Some additional security considerations for Visual Warehouse:
• There is no concept of an overall administrator privilege allowing direct
access to all Visual Warehouse functions and objects. For easy access to
all Visual Warehouse functions, create a Visual Warehouse administration
group, ADMIN_GROUP, with all Visual Warehouse privileges, and ensure
that this group has access to each information resource and target
warehouse. This will enable the monitoring and access of most Visual
Warehouse functions. However, with this access, the group still cannot
change Business Views whose Update Security Group is not
ADMIN_GROUP. Therefore, create a super-administrator, SUPERADM,
who should be a member of all groups.
• Visual Warehouse privileges are systemwide. The following privileges are
available:
• Administration privilege - Can add and delete users from the system
and assign users to security groups
• Resource definition privilege - Can create information resources and
make them available to others
• Business View definition privilege - Can create Business Views
• Business View maintenance privilege - Can maintain Business Views
• Operations privilege - Can perform any function in the Operations
menu, such as viewing the log or work in progress.
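Because privileges are granted to groups and never directly to users, the privileges of a user are the union of the privileges of the groups to which the user belongs. A minimal Python model of that rule follows; the OPERATORS group and the user JSMITH are made up for the example, and the data structure is not how Visual Warehouse stores its definitions.

```python
# Privileges are attached to groups only (hypothetical sample data).
GROUP_PRIVILEGES = {
    "ADMIN_GROUP": {
        "administration",
        "resource definition",
        "business view definition",
        "business view maintenance",
        "operations",
    },
    "OPERATORS": {"operations"},
}

# Users obtain privileges solely through group membership.
USER_GROUPS = {
    "SUPERADM": {"ADMIN_GROUP", "OPERATORS"},
    "JSMITH": {"OPERATORS"},
}


def privileges_of(user: str) -> set:
    """Return the union of the privileges of all groups the user belongs to."""
    privileges = set()
    for group in USER_GROUPS.get(user, set()):
        privileges |= GROUP_PRIVILEGES.get(group, set())
    return privileges
```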
Part 3. Accessing an OLAP Data Mart
In Part 3 of this book, we consider various options for accessing information
in an OLAP data mart. It is essential to enable navigation and analysis, as
well as visualization, presentation, and publication of the analysis results in
an intuitive and easy to understand way. Often a single access tool cannot
provide a solution that fits all of the access requirements of a diverse
end-user community, which can range from customers, executives, management,
and business analysts to database administrators, and other IT specialists.
In the chapters in this part we examine OLAP analysis options, ranging from
familiar spreadsheets, to SQL access to the DB2 OLAP Server relational
star-schema, and to Web-based OLAP analysis using Java technology with
Hyperion Wired for OLAP.
Chapter 12. OLAP Analysis Using the Spreadsheet Add-in
One of the key methods of accessing data in DB2 OLAP Server is through a
spreadsheet interface, such as Lotus 1-2-3 or Microsoft Excel. In this chapter,
we show how to use the DB2 OLAP Server Spreadsheet Add-in for Excel. We
assume that the Spreadsheet Add-in has been installed.
As soon as the cube is loaded and calculated, you can access it by following
these steps:
1. Connect to the DB2 OLAP Server.
2. Select the DB2 OLAP Server application and database required.
3. Double-click on any cell in the spreadsheet, to retrieve the top level of the
cube.
4. From there, drill up and drill down in any of the dimensions.
5. If required, update cells in the spreadsheet and send the updated data to
the cube, then calculate the cube and retrieve the updated data again.
6. Disconnect from the DB2 OLAP Server.
As you can see from the steps above, you can get up and running very
quickly.
From the Microsoft Excel menu bar, select Essbase => Connect (Figure
176).
Figure 176. Opening the DB2 OLAP Server Add-in
First you have to connect to the DB2 OLAP Server. Select the Server name,
enter a corresponding Username and Password, and click OK (Figure 177).
Figure 177. Connecting to the DB2 OLAP Server
You now have to select the explicit application and database that you want to
analyze. Select TBC Expanded => OK (Figure 178).
Figure 178. Selecting the Required Application and Spreadsheet
We now have a blank spreadsheet in front of us. DB2 OLAP Server does not
automatically display the cube.
Select any cell in the spreadsheet and double-click with the left mouse button
to bring up the top level of the cube, with Scenario, Measures, Product,
Market, Year, and Customer (Figure 179).
Figure 179. Result of Accessing the Cube
To drill into the cube, specifically the Market dimension, select the Market cell
and double-click on it with the left mouse button. As you can see in Figure
180, this action pivots the Market dimension and provides the next level of
detail.
Figure 180. Drilling Down into the Cube
You can use the same technique for any of the dimensions, including the
Measures dimension. Each time you drill down on a new dimension, the
dimension is pivoted to the vertical bar, unless there is only one dimension
left at the top of the spreadsheet. Any dimension you are not interested in
analyzing therefore stays at the top level, or generation one, of the hierarchy.
If you want to drill back up, select a lower-level cell and double-click on it with
the right mouse button. Thus, with just two simple mouse operations, you can
analyze a cube.
If you now highlight the Measures cell in Figure 180, hold down the Alt key,
and double-click with the left mouse button (Figure 181), instead of pivoting
the dimension, this action puts the Measures hierarchy across the top of the
spreadsheet.
Figure 181. Drilling Down into the Cube, Using the Alt Key
You can always cancel a DB2 OLAP Server retrieval, by using the Escape
key. This becomes important when analyzing large cubes or dimensions with
many members.
The DB2 OLAP Server Spreadsheet Add-in has many features and options
associated with it (Figure 176). For example, you can use the Retrieval
Wizard to design how you want the dimensions to be placed on the
spreadsheet.
One useful feature is the ability to navigate without data. You can drill up and
down and set the dimensions and members exactly as you want them in the
spreadsheet, before actually retrieving any data. This feature can be
especially useful when you have dynamically calculated members in a cube
or when you have a large cube. Without this feature you would actually
retrieve data each time you drilled up and down, thus, potentially, dynamically
calculating a large number of members in which you are not interested.
Another useful feature of the Spreadsheet Add-in is Keep Only, which is of
great benefit again in a large cube. If you have drilled down but are now
interested in only analyzing a subset of the cube, rather than drilling back up,
you can highlight the members you want to retain in the spreadsheet, then
select Keep Only, and the other members will be removed from the
spreadsheet. Remove Only is the counterpart of this feature.
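In effect, Keep Only and Remove Only are complementary filters over the members currently shown. The following Python sketch models that behavior on a plain list of member names; it is an illustration only and does not call the Spreadsheet Add-in.

```python
from typing import List, Set


def keep_only(displayed: List[str], highlighted: Set[str]) -> List[str]:
    """Retain only the highlighted members, preserving their order."""
    return [member for member in displayed if member in highlighted]


def remove_only(displayed: List[str], highlighted: Set[str]) -> List[str]:
    """Drop the highlighted members and retain all others."""
    return [member for member in displayed if member not in highlighted]
```

For a sheet showing East, West, Central, and South, highlighting East and West and choosing Keep Only leaves East and West; Remove Only with the same selection leaves Central and South.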
The Cascade feature allows automatic creation or cascading of multiple
spreadsheets from a single spreadsheet, based on a chosen member, with
each spreadsheet based on a child of that member. For example, if you have
a Market spreadsheet designed to show critical corporate information, you
can highlight Market and cascade the spreadsheet to produce the same
spreadsheet for each of East, West, Central, and South. These new
spreadsheets can then be analyzed by the individual market region areas.
Select Essbase => Options to modify your settings (Figure 182).
Figure 182. Setting the DB2 OLAP Server Spreadsheet Add-in Options
Some of the more useful settings in the Options area include the ability to
suppress #MISSING rows and Zero rows. Thus, you will only see data from
the cube where it exists. Alternatively, rather than removing #MISSING rows,
you can replace the #MISSING with another string such as N/A for not
applicable.
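The two choices can be sketched in Python as follows. The sketch assumes that a row is suppressed only when every data cell in it is #MISSING (or zero); this is our reading of the option, and the sample rows are invented.

```python
MISSING = "#MISSING"


def suppress_rows(rows, suppress_missing=True, suppress_zero=True):
    """Drop rows whose data cells are all #MISSING or all zero.

    rows is a list of (label, values) pairs.
    """
    kept = []
    for label, values in rows:
        if suppress_missing and all(v == MISSING for v in values):
            continue
        if suppress_zero and all(v == 0 for v in values):
            continue
        kept.append((label, values))
    return kept


def replace_missing(rows, text="N/A"):
    """Keep every row but show a replacement string for #MISSING cells."""
    return [(label, [text if v == MISSING else v for v in values])
            for label, values in rows]
```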
The Essbase => Options => Style panel enables you to set up different
highlighting for member attributes, such as parent members; dimensions; and
for data cells. One interesting style is the ability to identify any cells that have
a linked object such as a linked partition associated with them. You can also
highlight members that are defined as Dynamic Calc, or those members
containing a formula.
In DB2 OLAP Server, you can actually update data in the cube. Typically, you
would use this feature when entering or updating, for example, budget
figures, where each department potentially enters and adjusts its own sales
forecasts and budgets on a regular basis.
In order for the update feature to be enabled, Database Setting =>
Database Access => Write or Calculate must be set. With Write, end users
can write to the database, as long as they have security access, and with
Calculate, end users can write and calculate the database as long as they
have security access.
Typically, you do not want end users to calculate the database in the middle
of the day, when the calculation can affect other users. End-user access to
calculate the database should therefore be kept to a minimum.
For example, say you want to enter expected sales information for Kool Cola
for January 97 for the East market.
Drill down to the Measures members of Sales, COGS, and Margin for the cell
for Year/Jan 97, Scenario/Budget, Market/East, Product/Kool Cola, and
Customer/Customer. At this stage, you have no data at the Budget level, so
when you retrieve data, there are blanks as expected.
You have budgeted for Sales of 1000 and COGS of 300 for this specific
combination. Thus, enter for the Sales cell 1000 and for the COGS cell 300
(Figure 183). Do not specify any other Measures cells.
Now you have to lock the database (select Essbase => Lock), so that no
other user can change the values before you send the new values (see Figure
183).
Figure 183. Locking the Displayed Data before Updating the Cube
If you were not the only user able to change those data cells, then Retrieve &
Lock would be preferred, as you are then locking the data cells at the time of
retrieval. You then update the values, send the values to DB2 OLAP Server,
and then unlock those data cells. With Retrieve & Lock, other users can view
the data but cannot change it until you unlock the data cells.
Select Essbase => Send to send the new data values to DB2 OLAP Server
(Figure 184) to update the cube.
Figure 184. Sending the Updates to the Cube
After you have updated the cube, the locks are automatically dropped if you
did not check Update Mode in Essbase Options. The default for Update
Mode is off.
We have now updated the cube, so we now need to calculate the cube. In this
case, we have security access to perform calculations. Select Essbase =>
Calculation... (see Figure 185).
Figure 185. Calculating the Cube
As mentioned earlier, use of this feature should be kept to a minimum during
operational hours, unless the impact will be minor.
If the cube contains Dynamic Calc and Store members, a spreadsheet update
of the data or of its dependent children will not mark any stored Dynamic Calc
and Store data as requiring recalculation. DB2 OLAP Server will recognize
that a stored Dynamic Calc and Store member needs recalculating at the next
regular batch calculation, at a database restructure, or at the use of the
CLEARBLOCK DYNAMIC command and will mark the data block as requiring
recalculation. Then, at the next user-requested retrieval of data, the
requested data is dynamically calculated and stored, thus bringing the data
back into synchronization.
Select the calculation script required. In this case, you have only the default
script (see Figure 186). Click Calculate and OK.
Figure 186. Selecting the Calculation Script for Calculating the Updated Cube
When the calculation has completed, a message is displayed (Figure 187).
Figure 187. Calculation of Updated Cube Completed
If you now select Essbase => Retrieve, you can see from Figure 188 that the
data values entered through Excel have been saved, and Margin has been
calculated based on these new values.
Figure 188. Retrieving the Updated and Calculated Data from the Updated Cube
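As a quick check of the calculated value, the short Python sketch below recomputes Margin from the two budget figures entered earlier. It assumes that the outline defines Margin as Sales minus COGS; the text does not state the formula, so treat this as an assumption.

```python
def margin(sales: float, cogs: float) -> float:
    """Assumed outline formula: Margin = Sales - COGS."""
    return sales - cogs


# Budget values entered in the spreadsheet for this member combination.
budget_margin = margin(1000, 300)
print(budget_margin)
```

Under that assumption, the budgeted Sales of 1000 and COGS of 300 give a Margin of 700.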
Thus, as you can see, the process of using a spreadsheet to access and
analyze information in your DB2 OLAP Server cube is straightforward.
Chapter 13. User-Defined Attributes
A user-defined attribute (UDA) is a word or phrase about a member in a
dimension hierarchy. UDAs can be used in the following situations:
• Calculation scripts: UDAs of a member can be queried within a calculation
script. Hence calculations can be performed on selective members
depending on the member’s UDAs, such as multiply all members with a
UDA of Debit by -1.
• Reporting objects: UDAs can also be used in reporting objects to format
certain columns of the report in a certain way. For example, you can list all
members with an attribute of Debit as negative values and all members
with an attribute of Credit as positive values.
• Data loading: On the basis of the UDA of a particular member, you can
modify the data values that are to be loaded into the multidimensional
cube. For example, if the UDA is Debit, you can multiply the input values
by -1 while loading.
13.1 Rules for User-Defined Attributes
The following rules apply for UDAs:
• Multiple UDAs can be defined for a member.
• UDAs assigned to a particular member must be unique.
• The same UDA can be assigned to several members.
• A UDA can be the same as a member name, alias name, level number, or
generation number.
• UDAs cannot be created for shared members.
• UDAs apply to a specific member only. Descendants of the member do not
inherit the UDAs.
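Several of these rules can be modeled with a small Python class. This is a sketch for illustration only, with hypothetical class and method names; it is not how DB2 OLAP Server stores the outline.

```python
class Member:
    """A dimension member that carries its own set of UDAs."""

    def __init__(self, name, parent=None, shared=False):
        self.name = name
        self.parent = parent
        self.shared = shared
        self.udas = set()  # a set keeps each UDA unique per member

    def add_uda(self, uda):
        if self.shared:
            raise ValueError("UDAs cannot be created for shared members")
        self.udas.add(uda)  # adding the same UDA twice has no effect

    def has_uda(self, uda):
        # Only the member's own UDAs count; nothing is inherited
        # from the parent.
        return uda in self.udas


east = Member("East")
east.add_uda("Major Market")
east.add_uda("Major Market")

connecticut = Member("Connecticut", parent=east)
connecticut.add_uda("Small Market")
```

Here East ends up with a single UDA, and Connecticut does not inherit Major Market from its parent.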
13.2 Creating User-Defined Attributes
Figure 189 shows an outline for the TBC sales model. Here we define UDAs
for the members in the Market dimension.
Figure 189. The TBC Model
Highlight the East member in the Market hierarchy and click the Member
Specification button in the tool bar. Type Major Market in the Attribute box
and click the Add button (Figure 190). Click OK to close and exit. We will use
UDAs Major Market, Small Market, and New Market to identify the market
types for each of the members in the Market dimension.
Figure 190. Creating a UDA for Member East in the Market Hierarchy
Figure 191 shows the modified outline for the TBC sales model. Note that
Connecticut, which is a child of East, is defined as a Small Market although
parent member East is defined as a Major Market. So, the UDAs defined for
the parents and children can be totally different.
Figure 191. Modified TBC Model with UDAs
You have seen how to create UDAs manually. You can also create UDAs
while building the dimensions dynamically using Load Rules.
Create a new Load Rule for the TBC sales model. Specify the data source.
When the data is displayed, click the Viewing the dimension fields button in
the Data Prep Editor window. Assign Parent/Child relations for the Region
and State values (Figure 192). Highlight the column that contains the UDA for
the members.
Figure 192. Load Rule for Building the Market Dimension Dynamically
Click the Define Attributes for the selected column button in the Data Prep
Editor window. Select User-Defined Attribute in the Field Type list box.
Choose Market as the Dimension and 0 as the Number (Figure 193).
Note that while building the dimension dynamically, you cannot delete the
UDAs created earlier.
Figure 193. Defining the Field in the Input Data As a UDA
The Load Rule shown in Figure 194 will create UDAs for CHILD0 members,
not for PARENT0 members.
Figure 194. Load Rule for Creating a UDA
13.3 Using UDAs for Member Selection
Now, we will see how to use the UDAs to select only the required members
while accessing data from the Spreadsheet Add-in.
From the Spreadsheet Add-in, connect to the database. Click on Market and
then select Essbase=>Member Selection... (Figure 195).
Figure 195. How to Reach the Member Selection Panel
On the Member Selection panel, highlight Central and click the Member
Information... button (Figure 196).
Figure 196. Viewing Member Information from the Member Selection Panel
On the Member Information panel, the UDA for the selected member is
displayed (Figure 197).
Figure 197. Viewing UDAs from the Member Information Panel
Click OK to close and exit the Member Information panel. On the Member
Selection panel, highlight Central and click the Add-> button to add Central
to the Rules box. Select Central in the Rules box, click the right mouse
button and choose Subset... to select the required members.
We will now choose all members from the Central region which are not
marked as Small Market. On the Subset Dialog window, choose
User-defined Attribute from the list box. Check the NOT box. Choose Small
Market from the UDA list box and click the Add as AND Condition button as
shown in Figure 198. Click OK to close and exit.
Figure 198. Selecting a Subset of Members, Using the Subset Dialog
You can view the members that have been selected, using the condition
specified in the Subset Dialog window. Click the Preview... button on the
Member Selection panel and the list of members under Central satisfying the
condition specified in the Subset Dialog window are displayed (Figure 199).
Click OK to exit and close the Member Selection panel.
Figure 199. Preview of the Members Selected
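The subset condition (NOT Small Market) amounts to a simple filter over the children of Central. The Python sketch below illustrates the logic; the UDA assignments in the sample data are hypothetical and do not reproduce the actual TBC outline.

```python
# Hypothetical UDA assignments for some children of Central.
CENTRAL_MEMBERS = {
    "Illinois": {"Major Market"},
    "Ohio": {"Major Market"},
    "Iowa": {"Small Market"},
    "Colorado": {"New Market"},
}


def subset_without_uda(members, uda):
    """Return the members that do NOT carry the given UDA, in outline order."""
    return [name for name, udas in members.items() if uda not in udas]


print(subset_without_uda(CENTRAL_MEMBERS, "Small Market"))
```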
Now, on the spreadsheet, you see only the members that were selected in the
Market dimension. Type Year next to each of the states that were selected so
that every state has a matching Year dimension associated with it in the
report (Figure 200). Select Essbase=>Retrieve to retrieve the data.
Figure 200. Associate Selected Members with the Year Dimension
13.4 Using UDAs during Data Load to Flip the Sign
During data load it is possible to flip the sign of the input data based on UDAs
before the data is loaded into the multidimensional cube.
Figure 201 shows the Data Load Settings panel. When loading data into the
Accounts dimension, for example, you can specify that when the UDA of an
Accounts member is Expense, flip the sign of the data from positive to
negative. Check the Sign Flip on UDA box and enter the UDA name, which
is Expense, and choose Measures for the Dimension.
Figure 201. Flip the Sign Based on UDAs
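The effect of the Sign Flip on UDA setting can be sketched in Python as follows. The member names and their UDA assignments are invented for the example; the sketch models the rule itself, not the load process of DB2 OLAP Server.

```python
def load_value(member, value, udas_by_member, flip_uda="Expense"):
    """Return the value to store: negated if the member carries flip_uda."""
    if flip_uda in udas_by_member.get(member, set()):
        return -value
    return value


# Hypothetical members of the Measures dimension and their UDAs.
MEASURE_UDAS = {"Marketing": {"Expense"}, "Sales": set()}
```

With this sample data, an input value of 250 for Marketing is stored as -250, whereas an input value for Sales is stored unchanged.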
13.5 Using UDAs in Calculation Scripts
You can check whether a particular UDA exists for a member within a
calculation script, using the Boolean function @ISUDA. The arguments for
this function are the dimension name and the UDA. This function can be used
in an IF statement, for example, to determine whether the specified UDA
exists for the current member. You can use this method to limit the calculation
area for a particular formula or calculation.
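The following Python sketch emulates this kind of test outside DB2 OLAP Server, using the earlier example of multiplying all members that have a UDA of Debit by -1. It is not calculation script syntax; the helper names and the sample members are hypothetical.

```python
def is_uda(udas_by_member, member, uda):
    """Emulate the Boolean test: does the member carry the given UDA?"""
    return uda in udas_by_member.get(member, set())


def apply_if_uda(values, udas_by_member, uda, factor):
    """Multiply only the values of members that carry the UDA."""
    return {
        member: value * factor if is_uda(udas_by_member, member, uda) else value
        for member, value in values.items()
    }


# Hypothetical members and values.
ACCOUNT_UDAS = {"Account A": {"Debit"}, "Account B": {"Credit"}}
ACCOUNT_VALUES = {"Account A": 120, "Account B": 80}
```

Calling apply_if_uda(ACCOUNT_VALUES, ACCOUNT_UDAS, "Debit", -1) changes only Account A, which shows how the test limits the calculation area.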
Chapter 14. SQL Drill-Through
SQL Drill-Through provides intuitive data navigation from the
multidimensional cube into a relational database. Although the analysis that
is designed into OLAP applications does not apply to the transaction level
details, occasionally you might want to know about the details behind a set of
analytical data.
The transaction level details can be queried from either the data warehouse
or data mart environment, if it exists, or from a production system. Although
the detailed transaction level data is not part of the analytical data, SQL
Drill-Through enables you to query the detailed data as and when needed.
SQL Drill-Through in DB2 OLAP Server is designed to access the relational
data sources through an ODBC driver from the Spreadsheet Add-in and maps
data from a DB2 OLAP Server multidimensional database to a relational
database. SQL Drill-Through provides an interface to define the mappings
between the dimensional attributes of the multidimensional database and the
columns of relational tables. The complexity of the mapping is hidden from
the end user. As with the DB2 OLAP Server’s spreadsheet interface, the end
user does not need to know how to construct SQL queries to view relational
data. The DB2 OLAP database administrators have to predefine the data
mapping. When the end user navigates through data in the spreadsheet, the
dimensional attributes of the current data cell at the time of invoking SQL
Drill-Through determine the SQL statement that will be sent to the relational
database.
14.1 Installation Tips
The SQL Drill-Through Add-in of DB2 OLAP Server is available in a 16-bit
version as well as a 32-bit version. You have to install the version that
matches the Spreadsheet Add-in version you are using:
• If you are using the Spreadsheet Add-in for Excel 95 or Excel 97, you have
to install the 32-bit version of SQL Drill-Through because the Spreadsheet
Add-ins for Excel 95 and Excel 97 are 32-bit versions.
• If you are using the Spreadsheet Add-in for Excel 5.0 or Lotus 1-2-3, you
have to install the 16-bit version of SQL Drill-Through because the
Spreadsheet Add-ins for Excel 5.0 and Lotus 1-2-3 are 16-bit versions.
Note
The 16-bit and 32-bit versions of SQL Drill-Through do not necessarily
correspond to the versions of the operating system in use. For example, if
you are using Excel 5.0 under Windows 95, which is a 32-bit operating
system, you have to install the 16-bit version of SQL Drill-Through because
Excel 5.0 is a 16-bit product.
14.2 A Brief Description of the Architecture
SQL Drill-Through requests for accessing remote databases can be routed
through the DB2 OLAP Server or sent directly from the client machine to the
remote databases through the ODBC driver.
14.2.1 Server SQL Drill-Through
When a user requests data from a remote database, using SQL
Drill-Through, the request is sent from the Windows client to DB2 OLAP
Server. DB2 OLAP Server then routes the request through the SQL interface
to the remote database. The relational database must be accessible to DB2
OLAP Server. Server SQL Drill-Through is the default.
14.2.2 Client SQL Drill-Through
The client machine must be connected to DB2 OLAP Server when SQL
Drill-Through is invoked. The drill-through request from the spreadsheet user
is directly sent to the remote database through the ODBC driver. The
relational database must be accessible to the Spreadsheet Add-in user.
Client SQL Drill-Through is faster than Server Drill-Through, but the ODBC
driver configuration has to be maintained on every client machine.
14.2.3 The Initialization File (SQLDRILL.INI)
The SQLDrillServer keyword in SQLDRILL.INI determines whether Server
Drill-Through or Client Drill-Through is in use. If the value of this keyword is
set to 1, the SQL Drill-Through requests are routed through DB2 OLAP
Server. The mappings between DB2 OLAP Server and the relational
databases, which are defined through the SQL Drill-Through user interface,
are also stored in this file.
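To check which mode a client is configured for, the keyword can be read from the file with a few lines of Python. The sketch below makes several assumptions that the text does not confirm: that the keyword appears as a plain keyword=value line, that any value other than 1 selects Client Drill-Through, and that a missing keyword means the server default applies.

```python
def uses_server_drill_through(ini_text: str) -> bool:
    """Return True if the INI text selects Server SQL Drill-Through."""
    for line in ini_text.splitlines():
        key, separator, value = line.partition("=")
        if separator and key.strip().lower() == "sqldrillserver":
            # Assumption: 1 means server routing, anything else client.
            return value.strip() == "1"
    # Assumption: keyword absent, so the server default applies.
    return True
```

In practice the function would be given the contents of SQLDRILL.INI as read from the client machine.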
14.3 Enabling SQL Drill-Through for the TBC Sales Model
We now go through the process of enabling the SQL Drill-Through option for
our TBC sales model (Figure 202). The detailed data for the TBC sales model
is stored in the target warehouse database, TBC_TGT, which is maintained by
Visual Warehouse Business Views. Using the SQL Drill-Through facility, we
now access the customer order details that correspond to a particular data
cell in the multidimensional cube.
Figure 202. The TBC Sales Model
To set up SQL Drill-Through, follow these steps:
1. Create a new SQL Drill-Through profile from the Spreadsheet Add-in.
2. Edit the profile and add SQL generation rules for the required dimensions.
3. Select the required tables from the relational database.
4. Select the required columns.
5. Define joins, if any, between tables.
6. Use the profile to drill-through.
From Excel, click Essbase=>Connect... and log on to the database,
supplying the corresponding Username and Password (Figure 203).
Figure 203. Connecting to TBC Application from Excel
Double-click on cell A1 in the spreadsheet to retrieve the data from the
multidimensional cube. Drill down on Market, Product, and Year dimensions
as shown in Figure 204.
Figure 204. Data Retrieved from the TBC Sales Model
Click on a data cell and then select Essbase=>SQL Drill-Through... to open
the SQL Database Login window (Figure 205).
Figure 205. SQL Database Login Window
Choose TBC_TGT as the Data Source from the list box and enter the DB2
OLAP Server name, Application name, Database name, Username, and
Password to connect to TBC_TGT. Click the Query Options... button to set
up the SQL Drill-Through profile.
On the Query Options panel, click the Add Profile button and add a new
profile with the name TBC_SIMPLE (Figure 206). Uncheck the Use current
profile values as default box. A profile is basically a definition of an SQL
statement to access data from the relational database. The profile editor
allows you to visually build the SQL with conditional where clauses.
Figure 206. Creating a New SQL Drill-Through Profile
Click the Edit Profile... button. On the SQL Generator tab of the Profile
Editor panel, you are going to set up SQL generation rules for each of the
dimensions in the TBC sales model. SQL generation rules are the mappings
between the dimensional attributes of a multidimensional cube and the
columns of a relational database.
Now, we set up SQL generation rules for the Market dimension. Choose the
Market dimension from the SQL Rules for list box. For each dimension, you
can specify a minimum generation number that the end user must view before
SQL Drill-Through is allowed. As shown in Figure 202, the Market dimension
has four generations, namely, Market, Region, State, and City. Let us assume
that we want the end users to drill-through only when they are viewing State
or City level data. So, check the User must attain at least generation box
and enter generation number 3 as shown in Figure 207. If the end user
attempts to drill-through before attaining generation 3, you can choose one of
the following options:
• Don’t Generate Where Clause: This option prevents an SQL where
clause from being generated for the given dimension, unless the end user
attains the generation number specified. Although a where clause is not
generated for the current dimension, an SQL statement can be executed
based on other dimensions in the database.
• Don’t allow SQL Drill-Through: This option prevents the SQL query from
being sent to the specified data source, unless the end user attains the
specified generation.
Choose the Don’t allow SQL Drill-Through option for the Market dimension.
If the end user attempts a drill-through before attaining generation 3, an error
message will be displayed, and the drill-through will not occur.
Because we are allowing drill-through only for State and City level data in the
Market dimension, we need to define a rule to map these values to the
database columns.
Figure 207. Profile Editor Window
Click the New... button on the Profile Editor window. On the SQL Rule Editor
panel, you have three options to parse the Market attribute of the selected
data cell:
• The Matches option matches the dimension data to a specific value.
• The Matches Pattern option matches the dimension data with a pattern
string. The pattern string can contain any specific characters as well as
the wild-card characters * and ? (* matches any string, whereas ? matches
any single character).
• The Does not Match Pattern option excludes dimension data that
matches the pattern string specified.
The Insert matched pattern option inserts a numbered string (\1) in the
where clause text box. This numbered string is replaced with the member
name of the current dimension when the query runs.
Because we do not have any specific string or pattern for State and City
values, we choose Matches Pattern and an * as the pattern string. Now, the
string could be a State or City value, which is available in the
ALL_CUSTOMERS table (the target table for Business View All Customers)
in the TBC_TGT database. So, type in the following in the where clause box
as shown in Figure 210:
(IWH.ALL_CUSTOMERS.STATE = '\1' OR IWH.ALL_CUSTOMERS.CITY = '\1')
Figure 208 and Figure 209 show sample data from the relational tables used
in this example.
Figure 208. Sample Data from ALL_CUSTOMERS Table
Figure 209. Sample Data from HISTORY_OF_ORDERS Table
When the SQL statement is triggered, the specified where clause will be
added to the SQL for the Market dimension (see Figure 210).
Figure 210. Creating SQL Generation Rule for Market Dimension
For the Product dimension, we should allow SQL Drill-Through only when
generation 2 is attained. So, set the User must attain at least generation
value to 2. This means that the dimension value at the time of drill-through
can be product group code, product class code, or Diet. We will set up three
different rules to handle these values (Figure 211).
Figure 211. SQL Generation Rules for the Product Dimension
The Product attributes of the data cell in Excel are to be matched with the
PROD_CODE column in the HISTORY_OF_ORDERS table, which is the
target table for the Business View History of Orders in our Visual Warehouse
data mart.
The first rule is used when the active data cell has a product group code
attribute, such as 200. So, when the product attribute matches pattern string
???, the following where clause should be generated:
IWH.HISTORY_OF_ORDERS.PROD_CODE LIKE '\1%'
The second rule is used when the data cell has a product class code attribute
such as 100-10. So, when the dimensional attribute matches pattern string
???-??, the following where clause should be generated:
IWH.HISTORY_OF_ORDERS.PROD_CODE LIKE '\1-\2%'
The third rule is used when the product attribute of the data cell matches the
word Diet. Only products with a class code of 20 are part of the Diet family.
So, the following where clause should be used:
IWH.HISTORY_OF_ORDERS.PROD_CODE LIKE '%-20-%'
Note that when multiple rules are defined for a single dimension, the rules are
validated from the top to the bottom, and the first rule to be successfully
matched is used to generate the where clause.
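The rule evaluation described above can be sketched in a few lines of Python. This is only an illustration, not the product's implementation: pattern_to_regex and generate_where are hypothetical helper names, and the grouping rule (each * and each contiguous run of ? yields one numbered string) is an assumption inferred from the '\1-\2%' example for the pattern ???-??.

```python
import re


def pattern_to_regex(pattern):
    """Translate a rule pattern that uses * and ? into a compiled regex.

    Assumption: each * and each contiguous run of ? becomes one numbered
    group, so the pattern ???-?? yields two groups.
    """
    parts = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "*":
            parts.append("(.*)")
            i += 1
        elif ch == "?":
            j = i
            while j < len(pattern) and pattern[j] == "?":
                j += 1
            parts.append("(.{%d})" % (j - i))
            i = j
        else:
            parts.append(re.escape(ch))
            i += 1
    return re.compile("".join(parts))


def generate_where(rules, member):
    """Return the where clause of the first rule whose pattern matches."""
    for pattern, template in rules:
        match = pattern_to_regex(pattern).fullmatch(member)
        if match:
            # Replace each numbered string with the text captured by its group.
            return re.sub(r"\\(\d)", lambda m: match.group(int(m.group(1))), template)
    return None  # no rule matched, so no where clause for this dimension


# The three Product rules from the text, in top-to-bottom order.
PRODUCT_RULES = [
    ("???", r"IWH.HISTORY_OF_ORDERS.PROD_CODE LIKE '\1%'"),
    ("???-??", r"IWH.HISTORY_OF_ORDERS.PROD_CODE LIKE '\1-\2%'"),
    ("Diet", r"IWH.HISTORY_OF_ORDERS.PROD_CODE LIKE '%-20-%'"),
]

print(generate_where(PRODUCT_RULES, "100-10"))
```

Because the rules are tried from top to bottom, a more general pattern placed above a more specific one would shadow it. Here the three patterns cannot match the same member, so their order does not matter.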
For the Year dimension, SQL Drill-Through should be allowed for generations
2 and higher. We have to match the Quarter and Month values with the
ORDER_DATE column in the HISTORY_OF_ORDERS table. So, we define
the following where clauses (Figure 212):
When the dimension value matches string Qtr1 97:
MONTH(IWH.HISTORY_OF_ORDERS.ORDER_DATE) BETWEEN 1 and 3
When it matches Qtr2 97:
MONTH(IWH.HISTORY_OF_ORDERS.ORDER_DATE) BETWEEN 4 and 6
When it matches Qtr3 97:
MONTH(IWH.HISTORY_OF_ORDERS.ORDER_DATE) BETWEEN 7 and 9
When it matches Qtr4 97:
MONTH(IWH.HISTORY_OF_ORDERS.ORDER_DATE) BETWEEN 10 and 12
When it matches pattern string ???:
SUBSTR(MONTHNAME(IWH.HISTORY_OF_ORDERS.ORDER_DATE),1,3) = '\1'
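The five Year rules follow a regular scheme: quarter q covers months 3q-2 through 3q, and any three-character member is treated as a month abbreviation. As a minimal sketch, the same pattern and where clause pairs can be generated programmatically. The function names and the year_suffix parameter are hypothetical and chosen only for this illustration.

```python
COLUMN = "IWH.HISTORY_OF_ORDERS.ORDER_DATE"


def quarter_clause(quarter):
    """Where clause for quarter 1..4, covering months 3q-2 through 3q."""
    first = 3 * quarter - 2
    return "MONTH(%s) BETWEEN %d and %d" % (COLUMN, first, first + 2)


def month_clause(abbrev):
    """Where clause for a three-letter month member such as Jan."""
    return "SUBSTR(MONTHNAME(%s),1,3) = '%s'" % (COLUMN, abbrev)


def year_rules(year_suffix="97"):
    """Return (pattern, where clause) pairs in evaluation order.

    Assumption: quarter members are named 'QtrN <suffix>' as in the text.
    """
    rules = [("Qtr%d %s" % (q, year_suffix), quarter_clause(q)) for q in range(1, 5)]
    # The ??? pattern comes last; its numbered string is filled in at query time.
    rules.append(("???", month_clause(r"\1")))
    return rules


for pattern, clause in year_rules():
    print(pattern, "->", clause)
```

Note that these clauses test only the month. If HISTORY_OF_ORDERS held orders from more than one year, an additional predicate on the year would presumably be needed.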
We have now defined the SQL generation rules for all three dimensions in the
TBC sales model. Next, we have to define the columns that are to be
displayed when a drill-through is requested.
Click the Defined Columns tab of the Profile Editor window as shown in
Figure 213. All tables with a qualifier that matches the active user ID are
automatically displayed in the Tables in Database box. To add tables other
than those shown, double-click on [Define Table] and specify a fully qualified
name of the table to be added. Add IWH.HISTORY_OF_ORDERS and
IWH.ALL_CUSTOMERS to the Tables in Profile list.
Figure 212. SQL Generation Rules for the Year Dimension
Figure 213. Defined Columns Page of the Profile Editor
Click on IWH.HISTORY_OF_ORDERS in the Tables in Profile box (Figure
214). The columns in the tables are automatically displayed in the Columns in
selected Table box. Select all columns from the HISTORY_OF_ORDERS
table and add them to the Defined Column List box by clicking the Copy->
button for each column. Similarly, click on the ALL_CUSTOMERS table and
add CITY to the defined columns.
Figure 214. Defining the Column List for SQL Drill-Through
Now, CUSTOMER_CODE in the HISTORY_OF_ORDERS table should be
joined with CUSTOMER_CODE in the ALL_CUSTOMERS table to get the
city and state of the customers. Click the Table Link... button (see Figure
214) to define the join.
Select HISTORY_OF_ORDERS from the Table 1 list box (Figure 215). The
columns in the table are displayed in the Columns box. Choose
ALL_CUSTOMERS from the Table 2 list box. Click on the
CUSTOMER_CODE column in both tables and click the Add Link button.
Click OK to exit and close.
Figure 215. Defining Table Links for SQL Drill-Through
We have now completed setting up a profile enabling SQL Drill-Through from
our TBC sales model to the relational detail data in TBC_TGT. This profile will
be stored in the SQLDRILL.INI file.
Now switch back to the spreadsheet environment. Click the Edit SQL...
button on the SQL Database Login window to see the SQL query that has
been generated for the active data cell (Figure 216).
Figure 216. Viewing the Generated SQL
The generated SQL statement is shown with the numbered strings replaced
by actual values from the active data cell (Figure 217).
Figure 217. Generated SQL Query for SQL Drill-Through
The generated SQL can be modified, if required. Click OK to continue. Now,
click the Output Options... button and enter 100 in the Limit Output rows to
first box and click OK as shown in Figure 218.
Figure 218. Setting the Output Options
Now, click the Drill... button in the SQL Database Login window. The SQL
query is sent to the data source, and the query output is displayed in a
spreadsheet called SQLDATA.XLS (Figure 219).
Figure 219. SQL Drill-Through Output for the TBC Database
Close this sheet to go back to TBC sales multidimensional analysis.
Note
1. Whenever SQL Drill-Through is invoked, enter the Username and
Password on the SQL Database Login window, even if the previous
values are shown.
2. When SQL Drill-Through results in an error message such as "SQL
driver for data source is in use already and does not allow multiple
connections. Please try later," go to the DB2 Command Line Processor
and do a LIST APPLICATIONS to find the application handle of the
appropriate connection. Then do a FORCE APPLICATION (<application
handle>) to terminate the active connection. Disconnect from the DB2
OLAP Server application and start over.
14.4 Using a Hierarchy Table for SQL Drill-Through
The problem with the drill-through setup explained above is that a separate
where clause has to be coded for each level in the dimension hierarchy. One
way to overcome this problem is to create a table that has parent/child
relationships for the members so that a single where clause can be used for
all levels in the dimension hierarchy.
For example, the Product dimension in the TBC sales model has two levels,
namely, Product Group (for example, 100) and Product Class (for example,
100-10). The HISTORY_OF_ORDERS table where the detailed data is stored
has Product Size Codes (like 100-10-01). So, if you have a table that has
parent/child relationships for all products in the entire data warehouse, you
could then join the Product attribute of the current data cell to the parent/child
table to get all of the Product Size Codes associated with the product and
then join those Product Codes to the Product Codes in the
HISTORY_OF_ORDERS table. This would then work for any level in the
Product hierarchy.
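The idea can be sketched with a small in-memory example. Everything here is an assumption made for illustration: the PRODUCT_HIERARCHY table and its PARENT and CHILD columns are hypothetical, SQLite (through the sqlite3 module of Python) stands in for DB2, and a recursive query is only one possible way to resolve the parent/child links. A table that stores every member together with all of its Product Size Codes would allow a plain join instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE HISTORY_OF_ORDERS (ORDER_ID INTEGER, PROD_CODE TEXT);
INSERT INTO HISTORY_OF_ORDERS VALUES
    (1, '100-10-01'), (2, '100-20-01'), (3, '200-10-01');

-- Hypothetical parent/child table: one row per direct parent/child link.
CREATE TABLE PRODUCT_HIERARCHY (PARENT TEXT, CHILD TEXT);
INSERT INTO PRODUCT_HIERARCHY VALUES
    ('100', '100-10'), ('100', '100-20'), ('200', '200-10'),
    ('100-10', '100-10-01'), ('100-20', '100-20-01'), ('200-10', '200-10-01');
""")

# One statement serves every level: start at the member of the active data
# cell, collect all of its descendants, and join them to the order detail.
DRILL_SQL = """
WITH RECURSIVE DESCENDANTS (CODE) AS (
    SELECT ?
    UNION ALL
    SELECT H.CHILD
    FROM PRODUCT_HIERARCHY H JOIN DESCENDANTS D ON H.PARENT = D.CODE
)
SELECT O.ORDER_ID, O.PROD_CODE
FROM HISTORY_OF_ORDERS O JOIN DESCENDANTS D ON O.PROD_CODE = D.CODE
ORDER BY O.ORDER_ID
"""


def drill(member):
    """Return the detail rows for a product member at any hierarchy level."""
    return conn.execute(DRILL_SQL, (member,)).fetchall()


print(drill("100"))     # product group level
print(drill("100-10"))  # product class level
```

The same statement handles a product group, a product class, or a Product Size Code, which is the point of the hierarchy table: no level-specific where clause is required.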
Chapter 15. Using SQL to Access the DB2 OLAP Server Data Store
Note
The information provided in this chapter is specific to IBM DB2 OLAP
Server. It is not applicable to Hyperion Essbase.
15.1 DB2 OLAP Server Storage
When we create applications and databases in DB2 OLAP Server, a
relational cube or a star-schema is created in the relational database that
contains a shadow of the database outline and the actual data for the cube.
All applications and databases are also cataloged in DB2 OLAP Server
system catalog tables. In addition, DB2 OLAP Server creates a number of
views, which provide access to the multidimensional data as well as the
database outline details from standard SQL query tools or custom
applications.
DB2 OLAP Server uses the parameters specified in the RSM section, the
Application section, and the Database section of the RSM.CFG file to
determine how and where the components of the OLAP application should be
stored. These parameters control, for example, which database the
relational cube is created in, whether the fact table uses a separate
tablespace and indexspace, how the fact table is partitioned, and the
isolation level for the database. For a detailed description of these
parameters, refer to Using DB2 OLAP Server, SC26-9235.
15.2 DB2 OLAP Server Tables
The first time an application is created in DB2 OLAP Server, a catalog table
called CUBECATALOG is created in the relational database. The catalog
table has an entry for each of the multidimensional databases created. It
stores the application name, cube name, a unique identifier for the cube, and
the view names that can be used to query the multidimensional cube. Table
16 lists the tables that are created in the relational database when a
multidimensional database is created with DB2 OLAP Server. These tables
are created in the username schema.
Table 16. Tables Created When a DB2 OLAP Database Is Created
Table                   Description
Cube                    Contains a list of dimensions in a relational cube and
                        information about each dimension.
Alias ID                Contains a mapping of DB2 OLAP alias table names to ID
                        numbers allocated by DB2 OLAP Server.
Key                     Created when a first successful restructuring is done
                        on the outline. This table is the equivalent of the
                        Essbase Index.
Fact                    The data for the relational cube is stored in fact
                        tables. There will be one fact table for each
                        relational cube.
LRO                     Contains one row for each linked reporting object
                        associated with data cells in the relational cube.
Dimension               Contains detailed information about members in each of
                        the dimensions in an outline. There is one dimension
                        table per dimension.
User-defined attribute  Contains member IDs and user-defined attribute names
                        for each of the members with user-defined attributes
                        specified in the outline. If multiple user-defined
                        attributes are specified for a member, multiple rows
                        are created in this table. There is one user-defined
                        attribute table per dimension.
Generation              Contains generation numbers and names for each of the
                        generations specified in the outline. There is one row
                        per generation, and there is one table for each
                        dimension in an outline.
Level                   Contains level numbers and names for each of the
                        levels specified in the outline. There is one row per
                        level, and there is one table for each dimension in an
                        outline.
Figure 220 on page 315 depicts the relationships among the tables that
constitute a relational cube.
Elements shown in the figure: the star view, and the following tables with
their row granularity:
• Cube catalog table (CUBECATALOG): 1 row per application/cube
• Cube table (CUBEn): 1 row per dimension
• Fact table of CUBEn (CUBEnFACT): 1 row per unique combination of sparse
members
• Key table (CUBEnKEYA): 1 row per existing data block
• Dimension table (CUBEnDIMm): 1 row per member
• Generation table (CUBEnGENm): 1 row per unique generation
• Level table (CUBEnLEVm): 1 row per unique level
• Alias ID table (CUBEnALIASID): 1 row per alias table
• UDA table (CUBEnUDAm): 1 row per member/UDA combination
• LRO table (CUBEnLRO): 1 row per linked reporting object
Figure 220. DB2 OLAP Server Table Schema
15.3 Views Created by DB2 OLAP Server for SQL Access
DB2 OLAP Server views provide easy access to the multidimensional data
stored in the form of a star-schema in the relational database from standard
SQL query tools or other SQL applications. The views are created in the
username schema and are managed by DB2 OLAP Server.
DB2 OLAP Server creates a cube catalog view, which corresponds to the
cube catalog table and contains one row per relational cube. For every
relational cube, the following views are created:
• Cube view, which contains dimension information for all dimensions in the
outline
• Dimension views for every dimension in the outline
• User-defined attribute view for every dimension in the outline
• Fact view, which contains the actual data for the relational cube with
corresponding member IDs
• Star view, which contains the actual data for the relational cube with
corresponding member names
• Alias ID view, which contains the alias table names for the DB2 OLAP
Server database
• LRO view, which contains linked reporting object information for the
relational cube.
15.3.1 Querying the Cube Catalog
The CUBECATALOGVIEW contains the DB2 OLAP Server application name,
cube name, and fully qualified names of all views, except the dimension
views and the UDA views, that are created for a relational cube. Table 17
shows the structure of the cube catalog view.
Table 17. Structure of the Cube Catalog View
Name             Type     Max Size  Contents
APPNAME          VARCHAR  8         Name of the DB2 OLAP Server application
CUBENAME         VARCHAR  8         Name of the DB2 OLAP Server database
LROVIEWNAME      VARCHAR  27        Fully qualified name of the LRO view
CUBEVIEWNAME     VARCHAR  27        Fully qualified name of the cube view
FACTVIEWNAME     VARCHAR  27        Fully qualified name of the fact view
STARVIEWNAME     VARCHAR  27        Fully qualified name of the star view
ALIASIDVIEWNAME  VARCHAR  27        Fully qualified name of the alias ID view
Figure 221 on page 317 shows some examples of useful SQL statements that
can be executed against the cube catalog view:
• Get a list of DB2 OLAP Server applications
• Get a list of DB2 OLAP Server cubes for a specific application
• Get the corresponding view names for a specific cube
Note that DB2 OLAP Server follows certain naming conventions for the views.
Each view name consists of a unique string that DB2 OLAP Server chooses for
the application name and database name (here, TBC_SIMP), followed by @ and
the view type.
Figure 221. SQL Queries on the Cube Catalog View
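The three kinds of catalog queries listed above can be sketched as follows. This is an illustration under stated assumptions: an in-memory SQLite table stands in for CUBECATALOGVIEW (only the column names come from Table 17), and all application, cube, and view names in the sample rows are invented placeholders. On a real system the view would be qualified with the username schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Stand-in for the catalog view, using the column names from Table 17.
CREATE TABLE CUBECATALOGVIEW (
    APPNAME TEXT, CUBENAME TEXT, LROVIEWNAME TEXT, CUBEVIEWNAME TEXT,
    FACTVIEWNAME TEXT, STARVIEWNAME TEXT, ALIASIDVIEWNAME TEXT);
-- Invented rows: application, cube, and view names are placeholders.
INSERT INTO CUBECATALOGVIEW VALUES
    ('APP1', 'CUBE_A', 'LRO_A', 'CUBEVIEW_A', 'FACT_A', 'STAR_A', 'ALIAS_A'),
    ('APP1', 'CUBE_B', 'LRO_B', 'CUBEVIEW_B', 'FACT_B', 'STAR_B', 'ALIAS_B'),
    ('APP2', 'CUBE_C', 'LRO_C', 'CUBEVIEW_C', 'FACT_C', 'STAR_C', 'ALIAS_C');
""")

# 1. List the applications.
LIST_APPLICATIONS = "SELECT DISTINCT APPNAME FROM CUBECATALOGVIEW ORDER BY APPNAME"
# 2. List the cubes of one application.
LIST_CUBES = "SELECT CUBENAME FROM CUBECATALOGVIEW WHERE APPNAME = ? ORDER BY CUBENAME"
# 3. Look up the view names for one cube.
VIEW_NAMES = """
SELECT CUBEVIEWNAME, FACTVIEWNAME, STARVIEWNAME, ALIASIDVIEWNAME, LROVIEWNAME
FROM CUBECATALOGVIEW WHERE APPNAME = ? AND CUBENAME = ?
"""

applications = [row[0] for row in conn.execute(LIST_APPLICATIONS)]
cubes = [row[0] for row in conn.execute(LIST_CUBES, ("APP1",))]
views = conn.execute(VIEW_NAMES, ("APP1", "CUBE_B")).fetchone()
print(applications, cubes, views)
```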
15.3.2 Querying the Cube View
There is one cube view for every relational cube managed by DB2 OLAP
Server. The cube view contains one row for each dimension defined in the
outline. It contains the dimension view names and the UDA view names for
each dimension.
Table 18 shows the structure of the cube view.
Table 18. Structure of the Cube View
Name               Type      Max Size  Contents
DIMENSIONNAME      VARCHAR   80        Dimension name as defined in the outline
RELDIMENSIONNAME   VARCHAR   18        A unique, short name for the dimension
                                       that is assigned by DB2 OLAP Server.
                                       This name is used by DB2 OLAP Server as
                                       the column name for nonanchor
                                       dimensions while creating the star view
                                       and the fact view. DB2 OLAP Server
                                       generates names that satisfy the rules
                                       for valid column names for the
                                       relational database.
DIMENSIONTYPE      SMALLINT            Dimension type flag
                                       0 - Dense Dimension
                                       1 - Sparse Dimension
                                       2 - Anchor Dimension
DIMENSIONTAG       SMALLINT            Dimension tag flag
                                       0x00 - No tag
                                       0x01 - Accounts
                                       0x02 - Time
                                       0x04 - Currency
                                       0x08 - Currency Partition
DIMENSIONID        INTEGER             A unique ID for the dimension
DIMENSIONVIEWNAME  VARCHAR   27        Fully qualified name of the dimension
                                       view
UDAVIEWNAME        VARCHAR   27        Fully qualified name of the UDA view
Figure 222 on page 319 shows some examples of queries that can be
executed against this view:
• Get the dimension names, the dimension view names, and the UDA view
names for a specific cube
• Get the dimension name of the anchor dimension
• Get the column names that DB2 OLAP Server will use for the nonanchor
dimensions in the star and fact views
Note that the dimension view names and the UDA view names follow certain
naming conventions. The dimension view names have a unique string for the
DB2 OLAP Server application name and database name: TBC_SIMP,
followed by @, followed by the RELDIMENSIONNAME of the dimension. The
UDA view names also have similar names except that the @ symbol is
replaced with @@.
Figure 222. SQL Queries on the Cube View
15.3.3 Querying the Dimension Views
There is one dimension view for every dimension in the relational cube. The
dimension view contains one row for each member in the dimension defined
in the outline. It contains the member names and IDs, parent information,
level and generation information within the hierarchy, storage status of the
member, calculation information, and so on. Table 19 shows the structure of
the dimension view.
Table 19. Structure of the Dimension View
Name                Type          Size   Contents
MEMBERNAME          VARCHAR       80     Member name as defined in the outline
RELMEMBERNAME       VARCHAR       18     A unique, short name for the member
                                         that is assigned by DB2 OLAP Server.
                                         This name is used by DB2 OLAP Server
                                         as column name for the anchor
                                         dimension while creating the star
                                         view and the fact view, if the
                                         dimension that the member belongs to
                                         is defined as an anchor dimension.
                                         DB2 OLAP Server generates names that
                                         satisfy the rules for valid column
                                         names for the relational database.
RELMEMBERID         INTEGER              A unique ID assigned by DB2 OLAP
                                         Server for the member. This ID is
                                         used to join the dimension tables
                                         with the fact table to create the
                                         star view.
PARENTRELID         INTEGER              The RELMEMBERID of the member’s
                                         parent in the outline. This value is
                                         NULL for the top-level member.
LEFTSIBLINGRELID    INTEGER              The RELMEMBERID of the member’s left
                                         sibling in the outline. This value is
                                         null if the member does not have a
                                         left sibling.
STATUS              INTEGER              Status contains a combination of the
                                         following values:
                                         0x0000 - Reserved
                                         0x0001 - Never Share member
                                         0x0002 - Label only member
                                         0x0004 - Shared member
                                         0x0008 - Reserved
                                         0x0010 - Implicit Shared member
                                         (An Implicit shared member is a
                                         parent member with only one child or
                                         a parent member with only one child
                                         that has an aggregation operator
                                         among other children.)
                                         0x0020 - Dynamic Calc And Store
                                         member
                                         0x0040 - Dynamic Calc member
                                         0x0080 - Reserved
                                         0x0100 - Reserved
                                         0x02000 - Parent member where one of
                                         its children is shared
                                         0x040000 - A regular member
CALCEQUATION        LONG VARCHAR  32700  Default calculation equation for the
                                         member. Note that this equation can
                                         be overridden by calculations
                                         specified in the calculation scripts.
                                         So, this formula is not necessarily
                                         the one used to calculate the member.
UNARYSYMBOL         SMALLINT             The values are:
                                         0 - Add
                                         1 - Subtract
                                         2 - Multiply
                                         3 - Divide
                                         4 - Percent
                                         5 - No op
ACCOUNTSTYPE        INTEGER              Used only when the member belongs to
                                         an Accounts dimension.
                                         The values are:
                                         0x0000 - No masking on zero and
                                         missing values
                                         0x4000 - Mask on missing values
                                         0x8000 - Mask on zero values
                                         0x0001 - Balance First
                                         0x0002 - Balance Last
                                         0x0004 - Percent
                                         0x0008 - Average
                                         0x0010 - Unit
                                         0x0020 - Details only
                                         0x0040 - Expense
NOCURRENCYCONV      SMALLINT             Currency conversion flag.
                                         0x0000 - Use currency conversion
                                         0x0001 - No currency conversion
CURRENCYMEMBERNAME  VARCHAR       80     Member name in the currency cube,
                                         which is associated with this member
GENERATIONNUMBER    INTEGER              Generation number of the member
GENERATIONNAME      VARCHAR       80     Generation name of the member
LEVELNUMBER         INTEGER              Level number of the member
LEVELNAME           VARCHAR       80     Level name of the member
ALIASTABLENAME      VARCHAR       80     The alias for the member in an alias
                                         table used in the outline. There will
                                         be more than one ALIASTABLENAME
                                         column if more than one alias table
                                         is used in the outline. This value is
                                         NULL if alias tables are not used.
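Because STATUS holds a combination of flag values, an application has to test individual bits rather than compare the whole number. The following Python sketch decodes a STATUS value with the flags listed in Table 19. The helper name decode_status is hypothetical, the reserved values are skipped, and the two highest values in the table are left out of this sketch because their printed width differs from the others and we did not want to guess at the intended bits.

```python
# Flag values as listed for the STATUS column (reserved bits omitted).
STATUS_FLAGS = {
    0x0001: "Never Share member",
    0x0002: "Label only member",
    0x0004: "Shared member",
    0x0010: "Implicit Shared member",
    0x0020: "Dynamic Calc And Store member",
    0x0040: "Dynamic Calc member",
}

# UNARYSYMBOL is a plain code, not a bit mask, so a list lookup is enough.
UNARY_SYMBOLS = ["Add", "Subtract", "Multiply", "Divide", "Percent", "No op"]


def decode_status(status):
    """Return the names of the flags above that are set in a STATUS value."""
    return [name for bit, name in sorted(STATUS_FLAGS.items()) if status & bit]


print(decode_status(0x0044))
print(UNARY_SYMBOLS[1])
```

In SQL, the equivalent of the bit test depends on the functions the database offers, so filtering on the decoded flags in the application is often the simpler route.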
Figure 223 on page 323 shows some interesting queries that can be run
against a dimension view:
• Get a list of members and their parents for a specific dimension
• Get a list of the lowest level members (level 0) of a specific dimension
Figure 223. SQL Queries on Dimension Views
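Both kinds of query can be sketched against a stand-in for a dimension view. The assumptions are: an in-memory SQLite table named DIMVIEW (a hypothetical name) carries a subset of the Table 19 columns, and the four sample members are invented. Parents are found by joining the view to itself on PARENTRELID = RELMEMBERID.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Stand-in for one dimension view, with a subset of the Table 19 columns.
CREATE TABLE DIMVIEW (
    MEMBERNAME TEXT, RELMEMBERID INTEGER, PARENTRELID INTEGER,
    GENERATIONNUMBER INTEGER, LEVELNUMBER INTEGER);
INSERT INTO DIMVIEW VALUES
    ('Product', 1, NULL, 1, 2),
    ('100',     2, 1,    2, 1),
    ('100-10',  3, 2,    3, 0),
    ('100-20',  4, 2,    3, 0);
""")

# Members with their parents: a self-join on the parent ID. The LEFT JOIN
# keeps the top-level member, whose PARENTRELID is NULL.
MEMBERS_WITH_PARENTS = """
SELECT C.MEMBERNAME, P.MEMBERNAME
FROM DIMVIEW C LEFT JOIN DIMVIEW P ON C.PARENTRELID = P.RELMEMBERID
ORDER BY C.RELMEMBERID
"""

# Lowest-level members are those at level 0.
LEVEL_ZERO = "SELECT MEMBERNAME FROM DIMVIEW WHERE LEVELNUMBER = 0 ORDER BY MEMBERNAME"

parents = conn.execute(MEMBERS_WITH_PARENTS).fetchall()
leaves = [row[0] for row in conn.execute(LEVEL_ZERO)]
print(parents)
print(leaves)
```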
15.3.4 Querying Fact View and Star View
DB2 OLAP Server creates and maintains two views on the fact table of the
relational cube, namely, fact view and star view.
The fact table that DB2 OLAP Server creates has one column for each
nonanchor dimension and one column for each member of the anchor
dimension that stores the actual multidimensional data. The number of
columns in the fact table varies according to the multidimensional model.
Figure 224 on page 324 shows the fact tables and dimension tables in the
star schema for the TBC inventory database.
Figure 224. Relational Cube (Star Schema) for the TBC Inventory Model
As shown in Figure 224, the fact table for the Inventory relational cube
consists of one column for each of the Time, Product, Market, and Scenario
dimensions, and one column for each of the members of the anchor
dimension (Measures), namely, Opening Inventory, Additions, and Ending
Inventory. The dimension columns store member IDs that reference members
of each nonanchor dimension. The anchor member columns store the actual
data values.
The fact view, a simple view of the fact table, is created for every relational
cube. This view can be used to directly access the multidimensional data
using SQL applications that can manage the required joins to the dimension
views.
The star view, which is also created for each relational cube, joins the fact
table to each of the dimension views. The star view maps the internal column
names in the fact table to corresponding dimension and member names and
maps the member IDs to member names.
Although any dense dimension can be specified as the anchor dimension, the
most natural mapping is obtained when the Accounts dimension is specified
as the anchor dimension.
Table 20 shows the structure of the fact view.
Table 20. Structure of the Fact View
Name                                Type     Contents
For nonanchor dimension columns:    Integer  RELMEMBERID of the member of the
The RELDIMENSIONNAME column of               nonanchor dimensions in the data
the cube view is used.
For anchor dimension members:       Double   The data value for the
The RELMEMBERNAME column of the              combination of nonanchor
dimension view of the anchor                 dimensions and the anchor
dimension is used.                           dimension member
Because the fact view contains RELMEMBERIDs, the SQL applications
should keep track of the RELMEMBERIDs to query the fact view directly.
Some examples of SQL queries are shown in Figure 225 on page 326:
• Get the fact view name for a specific cube
• Join the fact view with the dimension views
The fact table does not have rows pertaining to members marked as Dynamic
Calc in the outline.
Figure 225. SQL Queries on the Fact View
The results for the numerical values are of datatype DOUBLE. They can be
cast to any format desired for the output by using, for example:
INT(OPENING_INVENTORY).
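A join of the fact view with the dimension views can be sketched as follows, again with SQLite standing in for DB2. The table names, member IDs, and data values are invented for this illustration. The stand-in fact view follows the layout of Table 20: one column of member IDs per nonanchor dimension and one DOUBLE column per anchor member. SQLite writes the conversion as CAST(... AS INTEGER) where the text uses the DB2 function INT.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Stand-ins for two dimension views (only the columns needed here).
CREATE TABLE PRODUCT_DIM (RELMEMBERID INTEGER, MEMBERNAME TEXT);
CREATE TABLE MARKET_DIM (RELMEMBERID INTEGER, MEMBERNAME TEXT);
-- Stand-in fact view: member IDs per dimension, data in the anchor column.
CREATE TABLE FACTVIEW (PRODUCT INTEGER, MARKET INTEGER, ADDITIONS REAL);
INSERT INTO PRODUCT_DIM VALUES (10, '100-10'), (11, '100-20');
INSERT INTO MARKET_DIM VALUES (20, 'New York'), (21, 'Michigan');
INSERT INTO FACTVIEW VALUES (10, 20, 150.0), (11, 21, 75.5);
""")

# Resolve each member ID to its name through the matching dimension view
# and convert the DOUBLE data value to an integer for display.
JOIN_SQL = """
SELECT P.MEMBERNAME, M.MEMBERNAME, CAST(F.ADDITIONS AS INTEGER)
FROM FACTVIEW F
JOIN PRODUCT_DIM P ON F.PRODUCT = P.RELMEMBERID
JOIN MARKET_DIM M ON F.MARKET = M.RELMEMBERID
ORDER BY P.MEMBERNAME
"""

rows = conn.execute(JOIN_SQL).fetchall()
print(rows)
```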
Table 21 shows the structure of the star view.
Table 21. Structure of the Star View
Name                                Type         Contents
For nonanchor dimension columns:    Varchar(80)  Member name as defined in
The RELDIMENSIONNAME column in                   the outline
the cube view is used.
For anchor dimension members:       Double       The data value for the
The RELMEMBERNAME column of the                  combination of nonanchor
dimension view of the anchor                     dimensions and the anchor
dimension is used.                               dimension member
The star view is ideal for simple ad hoc queries on the multidimensional
data: because the dimension views are already joined with the fact view, it
allows a more natural way to query data than the fact view.
Figure 226 shows some sample queries based on the star view of the TBC
inventory model.
Figure 226. SQL Queries on the Star View
Because the fact table contains values with different levels of aggregation,
you should ensure, while writing an SQL application for aggregating data, that
the set of members selected in each dimension view has the same level of
aggregation. One way to ensure that is to have a constraint on the generation
number or level number field in the dimension view.
Let us look at an example where an aggregation could go wrong, if it is done
on members that are not at the same hierarchical level in the outline. In
Figure 227, the first SQL shows an aggregation of additions to the inventory
in New York and Michigan for product classes 100-10, 100-20, and 100-30
and product group 100. The problem here is that the total additions for
product group 100 already include the additions for 100-10, 100-20, and
100-30. Effectively, the additions for the product classes above are
aggregated twice in this SQL.
The second SQL in Figure 227 shows the correct aggregation, which includes
members at the same level, namely the product class level.
Figure 227. SQL to Perform an Aggregation Operation
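The double-counting effect is easy to reproduce with a small stand-in. In this sketch, all table names and data values are invented, and SQLite stands in for DB2. The row for product group 100 already contains the sum of its three product classes, so summing all rows counts the classes twice, while a constraint on GENERATIONNUMBER restricts the sum to one hierarchy level.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Stand-in for the Product dimension view.
CREATE TABLE PRODUCT_DIM (MEMBERNAME TEXT, GENERATIONNUMBER INTEGER);
INSERT INTO PRODUCT_DIM VALUES
    ('100', 2), ('100-10', 3), ('100-20', 3), ('100-30', 3);
-- Stand-in star view: the group row already holds the sum of its classes.
CREATE TABLE STARVIEW (PRODUCT TEXT, MARKET TEXT, ADDITIONS REAL);
INSERT INTO STARVIEW VALUES
    ('100-10', 'New York', 10), ('100-20', 'New York', 20),
    ('100-30', 'New York', 30), ('100',    'New York', 60);
""")

# Wrong: mixes the product group with its own product classes.
MIXED_LEVELS = "SELECT SUM(ADDITIONS) FROM STARVIEW WHERE MARKET = 'New York'"

# Right: the generation constraint keeps only product class members.
SAME_LEVEL = """
SELECT SUM(S.ADDITIONS)
FROM STARVIEW S JOIN PRODUCT_DIM P ON S.PRODUCT = P.MEMBERNAME
WHERE S.MARKET = 'New York' AND P.GENERATIONNUMBER = 3
"""

mixed = conn.execute(MIXED_LEVELS).fetchone()[0]
same = conn.execute(SAME_LEVEL).fetchone()[0]
print(mixed, same)
```

With these sample values the mixed-level sum is exactly twice the same-level sum, which mirrors the problem described for product group 100.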
If we are only interested in the member names in the dimension views, we
can use the star view because it already has the member names along with
the data. However, if we are interested in other things, such as the
PARENTRELID, LEVELNUMBER, and GENERATIONNUMBER of the
members, we can join the fact view with the dimension views to get the
required details.
15.3.5 Querying the UDA Views
DB2 OLAP Server creates and maintains one UDA view for every dimension
of a cube. There is one row for every member/UDA combination.
Table 22 shows the structure of the UDA view.
Table 22. Structure of the UDA View
Name        Type     Max Size  Contents
MEMBERNAME  VARCHAR  80        Member name as defined in the outline
UDA         VARCHAR  80        UDA text string as defined in the outline
Figure 228 shows some interesting sample queries against the UDA view.
Figure 228. SQL Queries on the UDA View
15.3.6 Other Views
The other views that might be useful in SQL applications are the alias ID view
and the LRO view, the structures of which are listed in Table 23 on page 331
and Table 24 on page 331, respectively.
There is one alias ID view for each relational cube, and it has one row for
each DB2 OLAP Server alias table used in the outline.
Table 23. Structure of the Alias ID View
Name                Type     Max Size  Contents
ALIASTABLENAME      VARCHAR  80        The name of the alias table defined in
                                       DB2 OLAP Server for the outline. This
                                       is the collective name for a set of
                                       aliases associated with members of a
                                       cube.
RELALIASTABLENAME   VARCHAR  18        Name used by DB2 OLAP Server in the
                                       alias column of the dimension view
There is one LRO view for every relational cube, and it has one row for each
linked object.
Table 24. Structure of the LRO View
Name                      Type      Max Size  Contents
For Dimension columns:    INTEGER             RELMEMBERID of the member in this
The RELDIMENSIONNAME                          dimension with which the LRO is
column of the cube view                       associated
is used.
STOREOPTION               SMALLINT            0 - LRO is stored on the client
                                              1 - LRO is stored on the server
OBJTYPE                   SMALLINT            0 - LRO is an annotation
                                              1 - LRO is application data
HANDLE                    INTEGER             Unique ID for LRO. It can be used
                                              when more than one object is
                                              associated with a cell.
USERNAME                  VARCHAR   31        Name of the creator of the object
UPDATEDATE                INTEGER             UTC timestamp when the object is
                                              updated
OBJNAME                   VARCHAR   512       If LRO is application data, this
                                              column contains the filename of
                                              the object.
OBJDESC                   VARCHAR   80        If LRO is application data, this
                                              column contains a description of
                                              the object.
NOTE                      VARCHAR   600       If LRO is annotation, this column
                                              contains the text of the
                                              annotation.
15.4 Advanced SQL against the DB2 OLAP Server Star-Schema
In this section we show some useful examples of advanced SQL techniques
that we have successfully used to manage a common repository of
standardized dimensions from a single point of control, propagating subsets
of these dimensions to other models (see also the discussion in “Common
Data” on page 30).
15.4.1 Traversing a Dimension Hierarchy
One of the requirements for selecting specific subsets of a dimension is to be
able to traverse up (or down) the hierarchy from a given entry level,
specifying either a number of aggregation levels or a certain attribute as the
stop condition.
We now explain how to traverse a dimension hierarchy, using recursive SQL.
The first example shown in Figure 229 on page 333 traverses up in the
Market hierarchy of the expanded TBC sales model from a given level. It
traces all parents until the root level (Market) is reached. The Market
dimension view is TBC_EXPA@MARKET. This example shows a zipcode
60989 (level 0) as a starting point in the hierarchy, and the result shows
Madison (level 1), Wisconsin (level 2), Central (level 3), and Market (level 4).
If we do not include the level constraint "level < 5", we get an SQL warning
about the possibility of an infinite loop.
The second example shown in Figure 229 on page 333 uses a UDA
constraint instead of a level constraint to limit the search. This SQL traverses
up in the Market hierarchy and lists all parents until a parent with a UDA of
Major Market is reached. The starting point shown is New York (level 2), and
the next major market is East (level 3). To specify the UDA constraint, we
need to join the UDA view TBC_EXPA@@MARKET with the Market
dimension view as shown.
For a detailed description of recursive SQL, refer to the SQL Reference for
DB2 UDB Version 5, S10J-8165.
Figure 229. Traversing a Dimension Hierarchy, Using Recursive SQL
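The upward traversal can be sketched with a recursive common table expression. This is an illustration under stated assumptions, not the statement from Figure 229: SQLite stands in for DB2 (whose syntax differs in details, for example it does not use the RECURSIVE keyword), the MARKET_DIM table is a hypothetical stand-in for the Market dimension view with invented member IDs, and a STEPS counter plays the role of the level constraint that bounds the recursion.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Stand-in for the Market dimension view; member IDs are invented.
CREATE TABLE MARKET_DIM (
    MEMBERNAME TEXT, RELMEMBERID INTEGER, PARENTRELID INTEGER,
    LEVELNUMBER INTEGER);
INSERT INTO MARKET_DIM VALUES
    ('Market',    1, NULL, 4),
    ('Central',   2, 1,    3),
    ('Wisconsin', 3, 2,    2),
    ('Madison',   4, 3,    1),
    ('60989',     5, 4,    0);
""")

# Start at the given member, then repeatedly step to the row whose
# RELMEMBERID equals the current row's PARENTRELID. The STEPS bound
# guarantees termination even if the data contained a cycle.
TRAVERSE_UP = """
WITH RECURSIVE PARENTS (MEMBERNAME, PARENTRELID, STEPS) AS (
    SELECT MEMBERNAME, PARENTRELID, 0
    FROM MARKET_DIM WHERE MEMBERNAME = ?
    UNION ALL
    SELECT D.MEMBERNAME, D.PARENTRELID, P.STEPS + 1
    FROM MARKET_DIM D JOIN PARENTS P ON D.RELMEMBERID = P.PARENTRELID
    WHERE P.STEPS < 5
)
SELECT MEMBERNAME FROM PARENTS ORDER BY STEPS
"""


def ancestors(member):
    """Return the member followed by all of its parents up to the root."""
    return [row[0] for row in conn.execute(TRAVERSE_UP, (member,))]


print(ancestors("60989"))
```

A UDA-based stop condition would replace the STEPS test with a join to the UDA view, ending the recursion once a parent carrying the wanted UDA has been reached.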
15.4.2 Tracking Outline Changes
In a production environment, ideally the outline changes will be made using
dynamic dimension building methods. So, it is important to keep track of what
changes take place in the outline and when they take place. One way of
doing this is to perform the following steps:
• Make a copy of the dimension views. This can also be automated with
Visual Warehouse Business Views.
• Update the outline, using Load Rules to make the required changes.
• Use a full outer join to compare the old dimension views with the new
dimension views, listing all columns from the old and new dimension views
with a timestamp.
The output can be used as a log to track the changes that take place in the
outline.
Define the source database, DB2OLAP, in Visual Warehouse. Create a
Business View, Product Hierarchy Backup, with source and target as the
DB2OLAP database. Define the Business View to copy data from the Product
dimension view, TBC1INVD@PRODUCT. Define the target table as
TBC1INVD_PRODBK. Promote it to test and execute the Business View.
Now, we make a change in the outline. Add a new product group, 500, in the
TBC inventory model as shown in Figure 230.
Figure 230. Adding a New Product Group to the TBC Inventory Model
Save the outline. Execute the query shown in Figure 231. Because we are
using a full outer join, the list also shows products that have been
deleted from the outline, which appear only in the old dimension view.
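The comparison step can be sketched as follows. This is an illustration only, run on Python's sqlite3 module: prod_old and prod_new are hypothetical stand-ins for the backup table and the current Product dimension view, the column names are assumptions, and the deleted group 400 is made-up sample data. Because older SQLite versions lack FULL OUTER JOIN, the sketch emulates it with a LEFT JOIN plus the unmatched new rows; on DB2 a single full outer join serves the same purpose.

```python
import sqlite3
from datetime import datetime

conn = sqlite3.connect(":memory:")
# prod_old is the copy taken before the change, prod_new the state afterwards.
conn.executescript("""
    CREATE TABLE prod_old (member TEXT PRIMARY KEY, parent TEXT);
    CREATE TABLE prod_new (member TEXT PRIMARY KEY, parent TEXT);
    INSERT INTO prod_old VALUES ('100', 'Product'), ('400', 'Product');
    INSERT INTO prod_new VALUES ('100', 'Product'), ('500', 'Product');
""")

# Emulated full outer join: every old row with its match (if any),
# followed by the new rows that have no counterpart in the old copy.
COMPARE_SQL = """
    SELECT o.member, o.parent, n.member, n.parent
    FROM prod_old AS o LEFT JOIN prod_new AS n ON n.member = o.member
    UNION ALL
    SELECT NULL, NULL, n.member, n.parent
    FROM prod_new AS n
    WHERE NOT EXISTS (SELECT 1 FROM prod_old AS o WHERE o.member = n.member)
"""

def change_log(conn, stamp):
    """Return sorted (member, status, stamp) tuples for the comparison."""
    log = []
    for old_m, old_p, new_m, new_p in conn.execute(COMPARE_SQL):
        if old_m is None:
            status = "added"
        elif new_m is None:
            status = "deleted"
        elif old_p != new_p:
            status = "moved"
        else:
            status = "unchanged"
        log.append((old_m or new_m, status, stamp))
    return sorted(log)

log = change_log(conn, datetime.now().isoformat(timespec="seconds"))
for entry in log:
    print(entry)
```

Writing these rows to a table or file after each outline update would give a simple running change log.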
Figure 231. Generating an Outline Change Log File, Using SQL
15.4.3 Drill-Across from Aggregated to Detailed Data
Suppose we have two multidimensional cubes, one aggregated at the month
level and a more detailed one with daily data, and their outlines are the
same except for the Time dimension. We can then drill across from the
aggregated cube to the detailed cube just by using SQL, instead of setting
up a linked partition between them: we join the star views of both cubes
with a WHERE clause that equates the member names in both cubes.
We can also use SQL Drill-Through, for instance, to get from the aggregated
monthly cube to the detailed data in the fact table of the daily cube.
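A minimal sketch of such a drill-across join is shown below, again on Python's sqlite3 module. The tables monthly_star and daily_star are hypothetical stand-ins for the star views of the two cubes, with assumed column names and made-up figures. The table daily_time, which maps each day to its month, is an additional assumption about how the two Time dimensions relate; it is not described in the text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE monthly_star (product TEXT, market TEXT, month TEXT, sales INTEGER);
    CREATE TABLE daily_star (product TEXT, market TEXT, day TEXT, sales INTEGER);
    CREATE TABLE daily_time (day TEXT PRIMARY KEY, month TEXT);
    INSERT INTO monthly_star VALUES ('100-10', 'New York', 'Jan', 30);
    INSERT INTO daily_star VALUES
        ('100-10', 'New York', 'Jan 01', 10),
        ('100-10', 'New York', 'Jan 02', 20),
        ('100-20', 'New York', 'Jan 01', 7);
    INSERT INTO daily_time VALUES ('Jan 01', 'Jan'), ('Jan 02', 'Jan');
""")

# Equate the member names of the shared dimensions in the WHERE clause and
# bridge the two Time dimensions through the day-to-month mapping.
DRILL_ACROSS_SQL = """
    SELECT d.day, d.sales
    FROM monthly_star AS m, daily_star AS d, daily_time AS t
    WHERE d.product = m.product
      AND d.market = m.market
      AND t.day = d.day
      AND t.month = m.month
      AND m.product = ? AND m.market = ? AND m.month = ?
    ORDER BY d.day
"""

def drill_across(conn, product, market, month):
    """Return the daily (day, sales) rows behind one monthly cell."""
    return conn.execute(DRILL_ACROSS_SQL, (product, market, month)).fetchall()

detail = drill_across(conn, "100-10", "New York", "Jan")
print(detail)
```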
Note
End of the section specific to IBM DB2 OLAP Server.
Chapter 16. OLAP Analysis over the Web Using Wired for OLAP
One of the major advantages of Web-based solutions is the ease of
deployment. Business Intelligence solutions increasingly make use of Internet
technology to offer OLAP analysis capabilities to a broad end-user
community and to benefit from the low-maintenance client environment.
Hyperion Wired for OLAP is a suite of components that deliver powerful and
flexible OLAP analysis capabilities on the Windows platform as well as using
Web browser technology, together with administration tools, custom design
tools, and a mid-tier application server.
To use the Web-based analysis component, the client machines only have to
be equipped with a standard Web browser, such as Microsoft Internet
Explorer or Netscape Navigator. Ideally, the browsers should support at least
Java Release 1.1 (JDK 1.1).
To access the Wired for OLAP environment, end users just have to know the
corresponding Web address (URL) and their Wired user ID and password.
After the end user has connected to the URL, the Wired for OLAP applet is
downloaded to the client from the Web, and a connection to the Wired for
OLAP application server is established. The Wired for OLAP application
server should reside on the same system as the Web server. The Wired for
OLAP application server, in turn, connects to DB2 OLAP Server.
16.1 Setup
The setup of Wired for OLAP for the Web is fairly straightforward. Follow
these steps:
1. Install the Wired for OLAP software on the server.
2. Make the Java client code of Wired for OLAP accessible for the browsers
by making the \www subdirectory available to the Web server
environment. You can achieve this either by copying the contents of \www
to a directory in the structure of the Web server or defining a virtual
directory for \www.
3. Use the User and Connection Management component of Wired for OLAP
Administrator to define users and enable them to access certain DB2
OLAP Server applications and databases.
4. Create generic template views, using the Wired for OLAP Analyzer client
for Windows.
5. Group the views using the View Manager of Wired for OLAP Analyzer.
Two options for grouping views are available: view groups and corporate
report groups.
6. Define specific access points for certain end users, using the User and
Connection Management component.
16.2 Concepts
In this section we introduce the concepts of Wired for OLAP. We explain how
reports are organized and how end users can access them. The security
concept of Wired for OLAP is covered in 11.1, “Security Layers of the OLAP
Data Mart” on page 262. The emphasis here is on the organizational and
administrative aspects of the solution. We do not cover the OLAP analysis
capabilities of Wired for OLAP.
16.2.1 Views and View Groups
Views are reports, typically spreadsheets or charts (see Figure 232 and
Figure 233). The full set of OLAP operations (such as pivoting and moving
dimensions, drill down and roll up on spreadsheets and charts, or filtering
based on dimension attributes) is available for views. Each view is associated
with the particular DB2 OLAP Server database from which it gets its data. As
in DB2, views contain no data. They contain only the information Wired for
OLAP needs to re-create the report using data from the OLAP database.
Views are organized into groups. For example, a number of views relating to
product sales and market share might be stored in a view group called
Product Profitability.
The Wired for OLAP View Manager allows you to save, retrieve, modify, and
group views. You have access only to groups you own (that is, groups you
created) or groups that are shared. Only the owner of a group can make
changes to views within it and save the changes back to the group. However,
if you have access to a view, you can always change it and save the modified
view into a new group or into an existing group you own.
When you save a view, you can specify whether the view should be visible
only to you, whether it should be shared, or whether it should show up in the
corporate reports.
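The ownership rules described above can be summarized in a small model. This is purely an illustrative sketch in Python and not the Wired for OLAP API: the class ViewGroup, the helper functions, the user names, and the view definitions are all hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class ViewGroup:
    name: str
    owner: str
    shared: bool = False
    views: dict = field(default_factory=dict)  # view name -> view definition

def can_open(group, user):
    """A user sees groups that the user owns or that are shared."""
    return group.shared or group.owner == user

def can_save_into(group, user):
    """Only the owner may save views back into a group."""
    return group.owner == user

def save_view(group, user, view_name, definition):
    if not can_save_into(group, user):
        raise PermissionError(f"{user} does not own group {group.name!r}")
    group.views[view_name] = definition

# A shared group owned by an administrator, and a private group of user1.
templates = ViewGroup("Templates", owner="admin", shared=True)
save_view(templates, "admin", "Sales template", {"database": "Sales"})
mine = ViewGroup("My Analyses", owner="user1")

# user1 opens the shared view, changes it, and saves it into an owned group.
changed = dict(templates.views["Sales template"], filter="East")
save_view(mine, "user1", "East sales", changed)
print(sorted(mine.views))
```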
Figure 232. Wired for OLAP Browser Client with Spreadsheet View
Figure 233. Wired for OLAP Browser Client with Chart View
16.2.2 Template Views
Views are used as a starting point for ad hoc analysis. Thus, create a shared
view group that contains a set of template views, one for each database your
end-user community wants to access. Then, to create a new view, end users
can simply open the appropriate template view, make the desired changes,
and save the new view into a group they own.
16.2.3 Corporate Report Groups
To enable you to work with a collection of related views, Wired for OLAP
provides the corporate reports feature. Corporate reports can be thought of
as a set of briefing books. Groups that are designated as corporate report
groups show up as icons. Clicking on an icon will bring up all of the views
within that group that are designated as corporate report views. You can
easily navigate through these views by scrolling backward and forward.
The Windows client of Wired for OLAP Analyzer must be used to manage
corporate report groups and views.
Note that some custom display types, such as forms and pinboards, provided
by the Windows client are not supported by the Web browser client of Wired
for OLAP Analyzer. Views based on these special-purpose display types are
automatically filtered out of groups when they are accessed through the
browser client.
16.2.4 User Access to View Groups and Corporate Report Groups
You can specify the access point for each individual end user. After the end
user has logged on successfully, Wired for OLAP provides one of the
following access points according to your definition:
• The end user’s default view will be loaded.
• Wired for OLAP View Manager will be displayed.
• The corporate report menu will be displayed.
• The end user’s default view group will be displayed.
• The end user will be able to choose whether to load a view or start
corporate reports.
To define these access points, open Wired for OLAP Administrator, select the
User and Connection Management component, click User Settings..., select
the Startup tab, select Startup Options, and click Change.
16.2.5 Printing
The JDK 1.1 version of the Wired for OLAP Analyzer browser client contains
advanced report printing functions. It offers printer selection with
printer-specific options, including paper size, paper orientation, and number
of copies. It supports all Windows printers. You can customize report headers
and footers, page margins, and fonts. A print preview is available, too.
Reports can be saved in HTML format or copied to the clipboard. Traffic
lighting (see Figure 232) is supported on printed reports as well.
Appendix A. Special Notices
This publication is intended to help Business Intelligence project managers,
solution architects, and OLAP specialists understand how to build an
end-to-end Business Intelligence solution based on IBM DB2 OLAP Server
and IBM Visual Warehouse. The information in this publication is not intended
as the specification of any programming interfaces that are provided by IBM
DB2 OLAP Server, IBM Visual Warehouse, Hyperion Essbase, or Hyperion
Wired for OLAP. See the PUBLICATIONS section of the IBM Programming
Announcement for IBM DB2 OLAP Server and IBM Visual Warehouse for
more information about what publications are considered to be product
documentation.
References in this publication to IBM products, programs or services do not
imply that IBM intends to make these available in all countries in which IBM
operates. Any reference to an IBM product, program, or service is not
intended to state or imply that only IBM's product, program, or service may be
used. Any functionally equivalent program that does not infringe any of IBM's
intellectual property rights may be used instead of the IBM product, program
or service.
Information in this book was developed in conjunction with use of the
equipment specified, and is limited in application to those specific hardware
and software products and levels.
IBM may have patents or pending patent applications covering subject matter
in this document. The furnishing of this document does not give you any
license to these patents. You can send license inquiries, in writing, to the IBM
Director of Licensing, IBM Corporation, North Castle Drive, Armonk, NY
10504-1785.
Licensees of this program who wish to have information about it for the
purpose of enabling: (i) the exchange of information between independently
created programs and other programs (including this one) and (ii) the mutual
use of the information which has been exchanged, should contact IBM
Corporation, Dept. 600A, Mail Drop 1329, Somers, NY 10589 USA.
Such information may be available, subject to appropriate terms and
conditions, including in some cases, payment of a fee.
The information contained in this document has not been submitted to any
formal IBM test and is distributed AS IS. The information about non-IBM
("vendor") products in this manual has been supplied by the vendor and IBM
assumes no responsibility for its accuracy or completeness. The use of this
information or the implementation of any of these techniques is a customer
responsibility and depends on the customer's ability to evaluate and integrate
them into the customer's operational environment. While each item may have
been reviewed by IBM for accuracy in a specific situation, there is no
guarantee that the same or similar results will be obtained elsewhere.
Customers attempting to adapt these techniques to their own environments
do so at their own risk.
Any pointers in this publication to external Web sites are provided for
convenience only and do not in any manner serve as an endorsement of
these Web sites.
Any performance data contained in this document was determined in a
controlled environment, and therefore, the results that may be obtained in
other operating environments may vary significantly. Users of this document
should verify the applicable data for their specific environment.
The following document contains examples of data and reports used in daily
business operations. To illustrate them as completely as possible, the
examples contain the names of individuals, companies, brands, and
products. All of these names are fictitious and any similarity to the names and
addresses used by an actual business enterprise is entirely coincidental.
Reference to PTF numbers that have not been released through the normal
distribution process does not imply general availability. The purpose of
including these reference numbers is to alert IBM customers to specific
information relative to the implementation of the PTF when it becomes
available to each customer according to the normal IBM PTF distribution
process.
The following terms are trademarks of the International Business Machines
Corporation in the United States and/or other countries:
AIX                  AS/400
DATABASE 2           DataGuide
DataJoiner           DataPropagator
DB2                  IMS
Intelligent Miner    Net.Data
OS/390               RS/6000
Visual Warehouse     IBM
The following terms are trademarks of other companies:
Java and all Java-based trademarks and logos are trademarks or registered
trademarks of Sun Microsystems, Inc. in the United States and/or other
countries.
Microsoft, Windows, Windows NT, and the Windows logo are trademarks of
Microsoft Corporation in the United States and/or other countries.
PC Direct is a trademark of Ziff Communications Company in the United
States and/or other countries and is used by IBM Corporation under license.
ActionMedia, LANDesk, MMX, Pentium and ProShare are trademarks of Intel
Corporation in the United States and/or other countries.
UNIX is a registered trademark in the United States and/or other countries
licensed exclusively through X/Open Company Limited.
SET and the SET logo are trademarks owned by SET Secure Electronic
Transaction LLC.
Oracle is a trademark of Oracle Corporation.
Hyperion and Hyperion Pillar are registered trademarks, and Hyperion
Solutions and Hyperion Enterprise are trademarks of Hyperion Software
Operations Inc., a wholly-owned subsidiary of Hyperion Solutions
Corporation.
Arbor and Essbase are registered trademarks and WIRED for OLAP,
Hyperion Essbase, Hyperion Web Gateway, Hyperion Objects, and Hyperion
Essbase Adjustment Module are trademarks of Hyperion Solutions
Corporation.
Brio, Brio Query, and Brio Enterprise are trademarks of Brio Technology Inc.
The BusinessObjects logo, BusinessObjects, and BusinessQuery are
registered trademarks of Business Objects SA.
Cognos and PowerPlay are trademarks of Cognos.
Seagate Holos is a trademark of Seagate Technology Inc.
Integrity is a trademark of Vality Technology Inc.
Other company, product, and service names may be trademarks or
service marks of others.
List of Abbreviations
API      application programming interface
BV       Business View
CAE      Client Application Enabler
DDL      data definition language
DMS      database managed storagespace
FTP      File Transfer Protocol
H-OLAP   hybrid OLAP
HTML     hypertext markup language
IBM      International Business Machines Corporation
IT       information technology
ITSO     International Technical Support Organization
JDK      Java Developers Kit
LRO      linked reporting object
MDIS     Metadata Interchange Specification
M-OLAP   multidimensional OLAP
MPP      massive parallel processing
ODBC     Open Database Connectivity
OLAP     online analytical processing
OLE      object linking and embedding
OLTP     online transaction processing
OMG      Object Management Group
RDBMS    relational database management system
R-OLAP   relational OLAP
RSM      relational storage manager
SMP      symmetrical multiprocessor
SQL      structured query language
TBC      The Beverage Company
UDA      user-defined attribute
UDB      Universal Database
UDF      user defined function
VWP      Visual Warehouse Program
XMI      XML Metadata Interchange
XML      extensible markup language