Docstoc

BizTalk

Document Sample
BizTalk Powered By Docstoc
					 Atlanta Microsoft Database Forum
Introduction to Data Warehousing Concepts



                     Presented by
                     Brian Thomas
                     Solution Builders, Inc.
                     March 8, 2004
                     Brian.Thomas@SolutionBuilders.com
          What is a Data Warehouse?

Data collected from one or many systems that exist within
and outside the organization. The Data is structured in
such a way as to reduce the amount of time that it takes to
produce reliable information.
        Why Build a Data Warehouse?

• To Provide a Consistent Common Source for Corporate
Information

• To Store Large Volumes of Historical Detail Data from
Mission Critical Applications

• Improve the Ability to Access, Report Against, and
Analyze Information

• To Solve or Improve Upon Business Processes
Turning Data into Information
          Functional Data Warehouse




Sales System   System Generated   Sales Analysis is extrapolated
                    Reports         from the System Reports.
Turning Data into Information
          Functional Data Warehouse




Sales System   Functional Data Warehouse
                  of Sales Information
                                           Sales Information is available to a
                                           wider audience of decision makers.
                             Turning Data into Information
                  Cross Organizational Functional Data Warehouse
Division A




              Sales System
 Division B




                                                                        Analysis performed and
              Sales System
                                                                         Decisions drawn from
                                               Centralized Data        the Cross Organizational
 Division C




                                             Warehouse of Sales Data           Sales Data
                                                from across the
                                                 Organization


              Sales System
      Turning Data into Information
            Cross Functional Data Warehouse




Marketing System




   Sales System
                                         Corporate Performance
                                         Analysis is extrapolated
                                        from the System Reports.




Production Systems   System Generated
                          Reports
             Turning Data into Information
                     Cross Functional Data Warehouse




Marketing System




   Sales System            Cross Functional
                           Data Warehouse
                            of Information

                                              Corporate Performance
                                              Analysis is available to a
                                                 wider audience.
Production Systems
             Turning Data into Information
             Cross Organizational & Cross Functional
                        Data Warehouse
Division A
Division B




                                                 Analysis is performed and
                                                 Decisions made from the
                                                     Cross Functional
                             Centralized Cross
                                                      Organizational
                              Functional Data
                                                    Performance Data
                                Warehouse
Division C




                              of Information
                                                              Data Warehouse Architecture
                                                                                                                                                                                                                            Management      Access
      Source Systems                                                        Data Warehouse Components                                                                                                                        Systems        Methods
Division A




                                                               Corporate                                                                                                                                                      Planning &      Portal /
                                                                                                                                                                                                                              Forecasting




                                                                                                                                                                                  Data Access & Query Management Services
                                                                                     Enterprise Data Warehouse                                                                                                                              Web Interface
                                                                 Level
                       Extraction Transformation Load (ETL)




                                                                                                                                                                                                                                                     `
                                                                                                                                                                                                                              Analytics &
Division B




                                                                                                                                                                                                                               Modeling




                                                                                                                                      Increased Local Specifications
                                                                                                                                                                                                                                             Desktop
                                                                                                                                                                                                                                            Applications

                                                                                      Increased Level of Standardization
                                                               Business                                                                                                                                                      Performance
                                                                Group      DW / DM                                          DW / DM                                     DW / DM                                              Management
Division C




                                                                Level
                                                                                                                                                                                                                                               Printed
                                                                                                                                                                                                                                               Reports

                                                                                                                                                                                                                             Scorecards &
                                                                                                                                                                                                                              Dashboards

                                                                                                                                                                                                                                                Email
External Data




                                                              Divisional
                                                                Level      DM   DM                                         DM    DM                                    DM    DM                                               Query &
                                                                                                                                                                                                                              Reporting
                                                                                                                                                                                                                                               Mobile
                                                                                                                                                                                                                                               Devices
                 Data Warehouse Architecture
Source Systems
Division A




                                    Extract, Transformation and Load (ETL)
                     Data Staging                                            Data Warehouse
                        Area                                                   Repository
Division B
Division C
External Data
           Data Warehouse Architecture
                          Data Staging Area


• Subject Area Oriented

• Data Structure more closely mirrors
Operational System Data Layouts

• Supports Identification of Changed Data

• Acts as a Working Area to Support the
Transformation Process
          Data Warehouse Architecture
         Extraction, Transformation & Load (ETL)




                                                   Extract, Transformation and Load (ETL)
• Perform Attribute Standardization and
Cleansing

• Apply Business Rules and Calculations

• Consolidate using Matching and Merge / Purge
Logic

• Ensure Proper Linking and Tracking of
History
          Data Warehouse Architecture
     Extraction, Transformation & Load (ETL)

App. A: Male , Female
App. B: 1 , 0
App. C: x , y
                                                             Male, Female
App. D: m , f
                              Lookup Function
App. A: pipeline (cm)
App. B: pipeline (inches)
App. C: pipeline (mcf)                                      pipeline (cm)
App. D: pipeline (yds)
                            Conversion Function

App. A: Date (julian)
App. B: Date (yyyymmdd)
                                                            Date (julian)
App. C: Date (mm/dd/yyyy)
App. D: Date (absolute)
                            Formatting Function
App. A: Description
App. B: Description                                         Description
App. C: Description
App. D: Description                      Merging Function

App. A: balance on hand                                     Balance
App. B: current balance
App. C: cash in house
App. D: balance                    Mapping Function
           Data Warehouse Architecture
                  Data Warehouse Repository


• Organized around Conformed Dimensions and
Facts

• Promotes Usability and Intuitiveness

• Consolidated and Cross-Functional

• Historical and Atomic Representation of Data

•Insulated from Source System Modifications
and Additions
Data Warehouse Repository
    Star Schema Concepts
      Fact Table
                   This table is the core of the Star
                   Schema Structure and contains
                   the Facts or Measures available
                   through the Data Warehouse.

                   These Facts answer the questions
                   of “What”, “How Much”, or
                   “How Many”.

                   Some Examples:
                       Sales Dollars, Units Sold, Gross Profit,
                       Expense Amount, Net Income, Unit Cost,
                       Number of Employees, Turnover, Salary,
                       Tenure, etc.
Data Warehouse Repository
    Star Schema Concepts
       Dimension Tables
                     These tables describe the Facts
                     or Measures. These tables
                     contain the Attributes and may
                     also be Hierarchical.

                      These Dimensions answer the
                      questions of “Who”, “What”,
                      “When”, or “Where”.

                      Some Examples:
                      • Day, Week, Month, Quarter, Year
                      • Sales Person, Sales Manager, VP of Sales
                      • Product, Product Category, Product Line
                      • Cost Center, Unit, Segment, Business, Company
             Data Warehouse Repository
                        Star Schema Concepts
                            Employee_Dim
                            EmployeeKey
                            EmployeeID
                            .
                            .
                            .



Time_Dim                                                 Product_Dim
TimeKey                     Sales_Fact                   ProductKey
TheDate                     TimeKey                      ProductID
.                                                        .
.                           EmployeeKey                  .
.                                                        .
                            ProductKey
                            CustomerKey
                            ShipperKey
                            Required Data
                            (Business Metrics)
                            or (Measures)
          Shipper_Dim       .                    Customer_Dim
          ShipperKey        .                    CustomerKey
                            .
          ShipperID                              CustomerID
          .                                      .
          .                                      .
          .                                      .
                     Data Warehouse Repository
                                     Cube Concepts
Markets Dimension




                    Atlanta

                    Chicago

                    Denver
                                                              Grapes
                                                            Cherries
                     Dallas                              Melons
                                                       Apples

                              Q1      Q2    Q3    Q4
                                   Time Dimension
                     Data Warehouse Repository
                                     Cube Concepts



                                                   Sales
Markets Dimension




                    Atlanta
                                                   Fact

                    Chicago

                    Denver
                                                                  Grapes
                                                                Cherries
                     Dallas                                  Melons
                                                           Apples

                              Q1      Q2    Q3    Q4
                                   Time Dimension
               Data Warehouse Repository
                             Storage Concepts
• Relational On-Line Analytical Processing (ROLAP): The information that is
stored in the Data Warehouse is held in a relational structure. Aggregations are
performed on the fly either by the database or in the analysis tool.

• Multidimensional On-Line Analytical Processing (MOLAP): This information
is aggregated in a predefined manner based on the characteristics of the
Measures and the defined hierarchy of the Dimensions. Since the data is pre-
aggregated, navigating through the hierarchies is instantaneous. The user is
simply navigating to a point within the Multidimensional Cube and not
performing any on the fly aggregations.

• Hybrid On-Line Analytical Processing (HOLAP): This is a combination of
MOLAP and ROLAP. A portion of the data is predefined and aggregated. This
would typically be the set of information that is accessed most frequently.
Additional detail can be held in a ROLAP structure and allow a user to drill
through the MOLAP structure into the ROLAP structure.
        Data Warehouse Repository
                        Cube Concepts




   Client perspective       MOLAP       HOLAP    ROLAP

Query performance           Fastest     Faster    Fast


Storage consumption          High       Medium    Low
                                                                                         Microsoft Office, Reporting Services and .NET Framework

                SQL Server DTS                                      Where does Microsoft fit in?
                                                                        SQL Server Relational Database and Analysis Services Management                                                                                                          Access
      Source Systems                                                          Data Warehouse Components                       Systems                                                                                                            Methods




                                                                                                                                                                                                                                                                 SharePoint Portal, Exchange, and .NET Framework
Division A




                                                                   Corporate                                                                                                                                                       Planning &      Portal /
                                                                                                                                                                                                                                   Forecasting




                                                                                                                                                                                        Data Access & Query Management Services
                                                                                           Enterprise Data Warehouse                                                                                                                             Web Interface
                                                                     Level
                           Extraction Transformation Load (ETL)




                                                                                                                                                                                                                                                          `
                                                                                                                                                                                                                                  Analytics &
Division B




                                                                                                                                                                                                                                   Modeling




                                                                                                                                            Increased Local Specifications
                                                                                                                                                                                                                                                  Desktop
                                                                                                                                                                                                                                                 Applications

                                                                                            Increased Level of Standardization
                                                                   Business                                                                                                                                                       Performance
                                                                    Group      DW / DM                                            DW / DM                                     DW / DM                                             Management
Division C




                                                                    Level
                                                                                                                                                                                                                                                    Printed
                                                                                                                                                                                                                                                    Reports

                                                                                                                                                                                                                                  Scorecards &
                                                                                                                                                                                                                                   Dashboards

                                                                                                                                                                                                                                                     Email
External Data




                                                                  Divisional
                                                                    Level      DM   DM                                           DM    DM                                    DM    DM                                              Query &
                                                                                                                                                                                                                                   Reporting
                                                                                                                                                                                                                                                    Mobile
                                                                                                                                                                                                                                                    Devices



                      SQL Stored Procedures, SQL Views, MDX, and .NET Web Services
Q&A

				
DOCUMENT INFO