Docstoc

Rolap Definition - PowerPoint

Document Sample
Rolap Definition - PowerPoint Powered By Docstoc
					ZYSolutions Corporation
Performance Considerations for Managing and
       Accessing Mission-Critical Data
         Presented By: Tom Peters
Objective


Business Intelligence applications place significant performance demands
on databases for analysis. This session will focus on Best Practices for
data management, information access, and how to prevent performance
problems with your key reporting & analytical applications.




                                                      ZYSolutions Corporation
What is Business Intelligence
           • The Goal
              – Getting to the required information quickly
                and accurately to make decisions
           • The Challenge
              – Too much information that is not in a
                consistent usable format readily available to
                those who need it
           • The Solution
              – Business Intelligence




                                          ZYSolutions Corporation
 Components of Business Intelligence
• Data
  – Data store collection by
    individual applications
     • Operational systems
     • Departmental data


                               • Presentation
                                 –   Scorecards & Dashboards
                                 –   Enterprise Reporting
                                 –   OLAP Analysis
                                 –   Advanced & Predictive Analysis
                                 –   Alerts & Proactive Notification

                                              ZYSolutions Corporation
Agenda


•   Definition of Terms
•   Deployment Scenarios
•   Optimizing Performance
•   Business Intelligence Tool Demonstration




                                               ZYSolutions Corporation
Business Intelligence




                        ZYSolutions Corporation
Definition of Terms

Operational Data
• Definition
     – Data collected and stored by operational systems used to run day to
       day business
           – Examples
                –     Warehouse Management
                –     Sales and Distribution
                –     Point Of Sale
                –     Accounting




                                                          ZYSolutions Corporation
Definition of Terms

Departmental Data
• Definition
     – Data collected and stored outside of operational systems
     – Use to either enhance or augment operational systems
           – Examples
                – Excel
                – Microsoft Access




                                                          ZYSolutions Corporation
Definition of Terms

Data Warehouse
• Definition
     – A generic term for a system of storing, retrieving and managing large
       amounts of data
     – Software often includes sophisticated compression and hashing
       techniques for fast searching and filtering
     – Typically remote database containing recent snapshots of operational
       data
     – Data can be used without worrying about impacting performance of
       the operational database
     – Data can be used when Operational systems are offline (i.e. backups,
       upgrades)




                                                          ZYSolutions Corporation
Definition of Terms

Data Warehouse




                      ZYSolutions Corporation
Definition of Terms

Data Mart
• Definition
     – Type of data warehouse designed mainly to address a specific function
       or analytical need
     – Often uses aggregation or summarization of the data to enhance query
       performance
     – May include results of complex calculations or formulas
     – Ability to access the underlying base data to enable drill-down analysis
       as necessary




                                                           ZYSolutions Corporation
Definition of Terms

Data Mart




                      ZYSolutions Corporation
Definition of Terms

Relational Database (RDB)
  • Definition
        – A database that conforms to the relational model




                                                         ZYSolutions Corporation
Definition of Terms
Relational Database Management System (RDBMS)

  • Definition
        – Refers to the software used to create and maintain an RDB
             •   Microsoft SQL Server
             •   Oracle
             •   MySQL
             •   DB/2
             •   Informix




                                                          ZYSolutions Corporation
Definition of Terms

Online Transaction Processing (OLTP)
  • Definition
        – Employed by operational systems
        – Designed to handle high volumes data collection
        – Limited indexing




                                                            ZYSolutions Corporation
Definition of Terms

Extract, Transform and Load (ETL)
  • Definition
        – Method of moving data from Operational and Departmental data
          sources into a Data Warehouse
        – Method of moving data from Data Warehouse to a Data Mart
        – Extract
             – Data extraction and Staging
                      –   Minimize impact on production data sources
        – Transform
             – Convert to format required by data warehouse
                      –   Cleanse data to ensure accuracy
                      –   Convert to different numbering schema
                      –   Validate foreign keys against defined owner
        – Load
             – Load data to target data store
                      –   Follow guidelines as outlined by target data store



                                                                               ZYSolutions Corporation
Definition of Terms

Online Analytical Processing (OLAP)
  • Definition
        – A method of providing fast access to shared multi-dimensional
          data
        – Generically refers to software and applications that provide users
          with the ability to store and access data multi-dimensionally
        – Extensive indexing
        – Types of OLAP
             – MOLAP – Multi-Dimensional OLAP
             – ROLAP – Relational OLAP
             – HOLAP – Hybrid OLAP
        – Examples of OLAP providers
             – Data providers
                      –   SQL Server Analysis Cubes
                      –   Oracle Cubes
             – Application Providers
                      –   MicroStrategy



                                                            ZYSolutions Corporation
Definition of Terms
Multidimensional Online Analytical Processing (MOLAP)

• Definition
      – A multidimensional “cube” method of providing OLAP
      – MOLAP cubes are built for fast data retrieval and are optimal for
        slicing and dicing operations
      – Advantages
            • Excellent performance
            • Can return complex calculations
      – Disadvantages
            • Limited in scope as definition of cube creates boundaries
            • Limited in volume of data
            • Depending upon implementation, may require technical staff to develop and
              maintain



                                                              Geography
                                                   Products




                                                                          ZYSolutions Corporation
Definition of Terms

Relational Online Analytical Processing (ROLAP)
• Definition
      – A multidimensional RDB method of providing OLAP
      – Manipulates the data stored in the relational database to give the
        appearance of MOLAP's slicing and dicing functionality
      – Advantages
            • Does not constrain data
            • No data size limitation
            • Can leverage functions of RDB                                Geography
      – Disadvantages                                                          Products
            • Each request must query the RDB                                      Time
            • ROLAP itself is limited to RDB functionality




                                                             ZYSolutions Corporation
        Definition of Terms

        MOLAP vs. ROLAP
                     MOLAP                                                                              ROLAP

                         Revenue for Laptop Computers                                                      Profit
                                                                Revenue for Laptop Computers
                         In 2008
                                                                                      In 2008
                         At All Stores
                                                                                 At All Stores
                                                                                                                                Revenue for All Electronics
                                                                                                                                In 2003 and Q1 2004
                                                                                                                                At Stores in the NE Region
             Geography
  Products




                                  Revenue for All Electronics                                                                    Category
                                  In 2003 and Q1 2004
                                  At Stores in the NE Region

                                                                                                                                  Total Revenue, Cost, Profit
                               Total Revenue. Cost                                                                                In Jan 2004 and Jan 2003
                               In Jan 2008 and Jan 2009                                                                           At Top 10 Revenue Stores
                               At Top 10 Revenue Stores                                                                           Drill to Category

  MOLAP allows quick slice-and-dice a                                           ROLAP allows for “Analyze Anything” and „Drill Anywhere‟
pre-defined subset of data to view it from                                      using the entire relational database across all dimensions
      many different perspectives                                                              from summary to detail levels



                                                                                                                    ZYSolutions Corporation
Definition of Terms

Hybrid Online Analytical Processing (HOLAP)
• Definition
      – Combination of MOLAP and ROLAP methods
      – Combines the advantages of both MOLAP and ROLAP
      – ROLAP feeds MLOAP as needed
      – Advantages
            •   Excellent performance
            •   Can return complex calculations
            •   Does not constrain data
            •   No data size limitation                                                 Geography

            •   Can leverage functions of RDB                                               Products

      – Disadvantages                                                                           Time
            • Requires programming or additional B.I. software
            • Not available from most B.I. vendors
            • Limited vendor experience in deploying
                                                              Geography
                                                   Products




                                                                          ZYSolutions Corporation
   Definition of Terms

   Database design formats
• Third Normal Form (3NF)
   – Classical OLTP
     modeling technique
   – Minimizes data
                                               Customer
     redundancy through                                                                                   Item
                                        PK    Customer Number
                                                                                                 PK    Item Number
     normalization                            Customer Name
                                                                                                       Item Description
                                              Customer Address
                                                                                                 FK1   Product LIne
   – Maximizes data                     FK1
                                        FK2
                                              State / Province Code
                                              Territory Number
                                                                                                 FK2   Warehouse Number

     collection
                                                                               Sales
                State / Province                                                                                               Product Line
                                                                      PK,FK1   Customer Number
           PK   State / Province Code                                 PK,FK2   Item Number                                PK     Product LIne

                                                                               Quantity
                                                                               Unit Price




                                                Territory                                              Warehouse

                                         PK    Territory Number                                  PK    Warehouse Number




                                                                                                 ZYSolutions Corporation
   Definition of Terms

   Database design formats
• Star Schema
   – Consists of a single Fact table
   – Compound primary key
                                    Customer
   – One segment for each                                                                                    Item
                               PK  Customer Number
     Dimension                                                                                      PK      Item Number
                                         Customer Name
                                                                                                            Item Description
                                         Customer Address
                                                                                                    FK1     Product LIne
                                  FK1    State / Province Code



                                                                           Sales

                                                                 PK,FK1   Customer Number
                                                                 PK,FK2   Item Number
                                                                 PK,FK4   Territory Number
                                                                 PK,FK3   Warehouse Number

                                                                          Quantity
                                                                          Unit Price




                                           Territory                                                      Warehouse

                                    PK    Territory Number                                         PK     Warehouse Number




                                                                                         ZYSolutions Corporation
   Definition of Terms

   Database design formats
• Snowflake Schema
   – Variation on the star
     schema
   – Very large dimension
     tables are             Customer
                                                                                                     Item

     normalized into PK Customer Number                                                     PK      Item Number
                           Customer Name
     multiple tables       Customer Address
                                                                                            FK1
                                                                                                    Item Description
                                                                                                    Product LIne
                                 FK1   State / Province Code



                                                                         Sales

         State / Province                                      PK,FK1   Customer Number                                     Product Line
                                                               PK,FK2   Item Number
    PK   State / Province Code                                 PK,FK4   Territory Number                               PK     Product LIne
                                                               PK,FK3   Warehouse Number

                                                                        Quantity
                                                                        Unit Price




                                         Territory                                                Warehouse

                                  PK    Territory Number                                   PK     Warehouse Number




                                                                                                ZYSolutions Corporation
Definition of Terms

Metadata
• Metadata
      – Data about data
      – Links the Technical world to the Business world
      – Defines the various components and their location to the B.I. tool




                                                           ZYSolutions Corporation
Definition of Terms

Metadata
• Dimension
      – A perspective that can be used to analyze the data
      – Dimensions become more useful when there are many descriptive
        attributes that can be used for analyzing the data. The term
        “Attribute” is often used to describe the extended Dimension
      – Examples
            •   Item Number
            •   Country
            •   Time
            •   Customer Number
            •   Customer Name




                                                       ZYSolutions Corporation
Definition of Terms

Metadata
• Attribute
      – The term “Attribute” is often used to describe an extended
        Dimension comprised of one or more dimensional static values,
        functions and/or dimensions as well as additional features such as
        formatting
      – Examples
            • Item
                 • Item Number
                 • Item Description
            • Customer
                 • Customer Number
                 • Customer Name
            • Invoice Date
                 • Invoice Date/Time [formatted as a date]
            • Current Date
                 • RDBMS system date function


                                                             ZYSolutions Corporation
Definition of Terms

Metadata
• Fact
      – The raw enumerable piece of information about the transaction
        typically a numeric value (usually aggregatable)
      – Examples
            • Quantity Sold
            • Unit Sale Price
            • Unit Cost




                                                        ZYSolutions Corporation
Definition of Terms

Metadata
• Measure
      – The product of one or more Facts
      – Can be the result of a formula derived from the RDB or the B.I. Tool
        analytical engine
      – Enhanced Measures are often called Metrics
      – Examples
            •   Quantity Sold
            •   Unit Sale Price
            •   Unit Cost
            •   Revenue
                  • (Quantity Sold) * (Unit Sale Price)
            • Profit
                  • (Quantity Sold) * ((Unit Sale Price) – (Unit Cost))




                                                                          ZYSolutions Corporation
Definition of Terms

Metadata
• Metric
      – The term “Metric” is often used to describe an extended Measure
        comprised of one or more a measurable static values, functions
        and/or measures as well as additional features such as formatting
      – Examples
            •   Sum(Quantity Sold)
            •   Avg(Unit Sale Price)
            •   Sum(Unit Cost ) ~{Previous Year}
            •   RunningAvg((Balance Outstanding), {Month])




                                                             ZYSolutions Corporation
Business Intelligence




                        ZYSolutions Corporation
Business Intelligence




                                    Geography




                        Products
      Reporting
       Solution




                                   ZYSolutions Corporation
Deployment Scenarios

Scenario comparison
                                                                         Direct       Data          Data       OLAP
                                                                        access to   warehouse       Mart
                                                                          data
                                       Real time (most recent) data       Yes          No            No         No
              Consumes operational resources (during peak hours)          Yes          No            No         No
                                 Provides means to transform data          No          Yes           Yes        Yes
                                    Provides means to cleanse data         No          Yes           Yes        Yes
                   Limits functionality of cross data source analysis     Yes          No            No         No
May produce inconsistent results depending upon time of execution         Yes          No            No         No
                   Can be ideally indexed to meet analytical needs         No          Yes           Yes        Yes
 Provide means to use historical data beyond means of source data          No          Yes           Yes        Yes
               Capture / view net change analysis (data from data)         No          Yes           Yes        Yes
                 Provide offline / off-site support from data source       No          Yes           Yes        Yes
                             Insulated from changes to source data         No        Partial        Partial   Partial
                                                  Supports formulas        No        Partial        Partial     Yes
                                           Supports OLAP functions         No          No            No         Yes
                                     Optimized for Ad Hoc analysis         No          No            No         Yes

                                                                                             ZYSolutions Corporation
Deployment Scenarios

Advanced analytics

                       Reporting Solution
                       Report Presentation


                        Basic Calculations



                       Advanced Analytics




                                             ZYSolutions Corporation
Business Intelligence




                        ZYSolutions Corporation
Optimizing Performance

Data isolation
• Isolate B.I. data from operational data
     – Extract data from the operational sources and store in a separate
       database
     – Use separate database server hardware from operational data
       source




                                                         ZYSolutions Corporation
Optimizing Performance

Optimize database design
• Design format
     – Limit joins for most used dimensions
         – Star Schema
         – Snowflake Schema




                                              ZYSolutions Corporation
Optimizing Performance

Optimize indexes
• Indexes
     – Add indexes for most used dimensions
        – Be aggressive
        – Drop and rebuild for ETL




                                              ZYSolutions Corporation
Optimizing Performance

Horizontal slicing
• Horizontal
     – Separate data by distinct dimensions
        – Typically date range
     – Merge data via logical views




                                              ZYSolutions Corporation
Optimizing Performance

Physical storage
• Data files
     – Store data files on high speed drives
     – Use RAID 5 or RAID10 drives




                                               ZYSolutions Corporation
Optimizing Performance

Physical storage
• Index files
     – Store index files on separate drive from data files
     – Use high speed drives
     – Use RAID 5 or RAID10 drives




                                                             ZYSolutions Corporation
Optimizing Performance

Physical storage
• Transaction logs
     – Commit data after ETL
     – Shrink log files after ETL
     – Store log files on separate disk from data files




                                                          ZYSolutions Corporation
Optimizing Performance

Reporting solution
• Take advantage of OLAP
     – Use a reporting solution that works with OLAP
     – Run advanced analytics within reporting solution NOT database
• Employs a single metadata layer
     – Understands the least cost joins
• Aggregate aware
     – Automatically go after least cost table
• Multi-pass SQL
     – Break up complex requests into smaller multi-pass SQL and join
       results together




                                                         ZYSolutions Corporation
ZYSolutions Corporation

Tom Peters – tpeters@BusinessIntelligence.Pro

				
DOCUMENT INFO
Description: Rolap Definition document sample