An Introduction to Dimensional Data Warehouse Design

Document Sample
An Introduction to Dimensional Data Warehouse Design Powered By Docstoc
					An Introduction to Dimensional Data
         Warehouse Design



                 Presented by
                 Joseph J. Sarna Jr.
                 JJS Systems, LLC
Relational Normal Form

•   Most relational databases are set to 3rd normal form

       1st Normal form – Tables have unique keys and no repeating
        groups or multi-value fields

       2nd Normal form – Every attribute is dependent ont the entire
        key of the table

       3rd Normal form – Attributes are dependent only on the key.
        No derived elements
Northwind Database Model – Relational Format
                                                                                          EmployeeTerritories              Employees
                    Region                                Territories

         PK    RegionID                          PK      TerritoryID                                                 PK    EmployeeID

                                                                                          FK2    TerritoryID         I1    LastName
               RegionDescription                         TerritoryDescription
                                                                                          FK1    EmployeeID                FirstName
                                                 FK1     RegionID
                                                                                                                           Title
                                                                                                                           TitleOfCourtesy
                                                                                                                           BirthDate
                                                                                                                           HireDate
                                                                                                                           Address
                                                                                                                           City
                                                                                                                           Region
                                                                                                                     I2    PostalCode
                                                                                                                           Country
                                                                                                                           HomePhone
                                                                                                                           Extension
                                                                                                                           Photo
                                                                                                                           Notes
                                                                                                                     FK1   ReportsTo
                                                                                                                           PhotoPath




       Categories

 PK    CategoryID                                                                                                          Orders
                                                                                     Order Details                                                               CustomerCustomerDemo
 I1    CategoryName                        Products                                                             PK         OrderID                Customers
       Description                                                              PK,FK1,I2,I1   OrderID                                                           PK,FK2   CustomerID
       Picture                     PK        ProductID                                                          FK1,I1,I2 CustomerID         PK   CustomerID
                                                                                PK,FK2,I4,I3   ProductID                                                         PK,FK1   CustomerTypeID
                                                                                                                FK2,I4,I3 EmployeeID
                                   I3        ProductName                                                        I5        OrderDate          I2   CompanyName
                                                                                               UnitPrice
                                   FK2,I4,I5 SupplierID                                                                   RequiredDate            ContactName
                                                                                               Quantity
                                   FK1,I2,I1 CategoryID                                                         I6        ShippedDate             ContactTitle
                                                                                               Discount
                                             QuantityPerUnit                                                    FK3,I7    ShipVia                 Address
                                             UnitPrice                                                                    Freight            I1   City
                                             UnitsInStock                                                                 ShipName           I4   Region
                                             UnitsOnOrder                                                                 ShipAddress        I3   PostalCode
                                             ReorderLevel                                                                 ShipCity                Country
         Suppliers                           Discontinued                                                                 ShipRegion              Phone
                                                                                                                I8        ShipPostalCode          Fax
  PK    SupplierID
                                                                                                                          ShipCountry
  I1    CompanyName
        ContactName
                                                                                                                                                                 CustomerDemographics
        ContactTitle
        Address                                                                                                                                                  PK   CustomerTypeID
        City
        Region                                                                                                                                                        CustomerDesc
  I2    PostalCode                                                                                                         Shippers
        Country
        Phone                                                                                                        PK    ShipperID
        Fax
        HomePage                                                                                                           CompanyName
                                                                                                                           Phone
Why Dimensional Data Warehouses?

•   Business needs to analyze data so that it can:
    –   Understand trends

    –   Predict future behavior and needs

    –   Personalize contact with customers

    –   Be competitive

•   All of this in a speedy manner, with the ability to do
    “What if’s”
Drawbacks to Relational Data Structures

•   Data is not structured for analytical usage

•   Multiple Joins are resource intensive

•   Missing data from external sources, context history, not
    operational sources
What Is a Dimensional Data Warehouse?

    “A structured repository of validated and integrated
    historical information accessible to business people to
    provide the basis for both tactical and strategic
    business decisions.”

•   Centralized extract and staging
•   Separate from operational system
•   Structured for analysis
•   Historically contexted
Dimensional Data Warehouse Architecture

Relational Data        External Data            Enterprise Data



                    Data Distribution
              Acquisition, Staging, Cleaning,
                      Transformation




                     Data Warehouse                 Analytical
                         Storage                   Applications
Levels of Design

•   Detail Level
    –   Dimensional Normal form
    –   Value and feasibility


•   Analytical Level
    –   Structured for the required analyses


•   Summary Level
    –   Summaries for user requirements
    –   Better response time
Dimensional Normal Form

•   Normalized for maintainability

•   De-normalized for performance, based on rules

•   2 level structure, therefore only one level of joins
    required for queries
Northwind Database Model – Dimensional Format
                                                                                                                                             Employees

                                                                                                                                     PK    EmployeeKey

                                       Territories                              EmployeeTerritories                                        EmployeeID
          Region
                                                                                                                                           LastName
                                PK    TerritoryKey                        PK       EmployeeTerritoryKey                                    FirstName
PK    RegionKey
                                                                                                                                           Title
                                FK1   RegionKey                           FK1      EmployeeKey                                             TitleOfCourtesy
      RegionID
                                      TerritoryID                         FK2      TerritoryKey                                            BirthDate
      RegionDescription
                                      TerritoryDescription                         TerritoryID                                             HireDate
                                                                                   TerritoryDescription                                    Address
                                                                                   RegionKey                                               City
                                                                                   RegionID                                                Region
                                                                                   RegionDescription                                       PostalCode
                                                                                                                                           Country
                                                                                                                                           HomePhone
                                                                                                                                           Extension
                                                                                                                                           Photo
                                                                                                                                           Notes
                                                                                                                                     FK1   ReportsTo
                                                                                                                                           PhotoPath
                                                                                                                                           EmployeeTerritoryKey
                                                                                                                                           TerritoryKey
                                Products                           Order Details                                                           TerritoryID
                                                                                                                                           TerritoryDescription
          Categories      PK    ProductKey                   PK     OrderDetailKey                         Orders                          RegionKey
     PK   CategoryKey                                                                                                                      RegionID
                                ProductID                    FK2    OrderKey                        PK    OrderKey                         RegionDescription
                                ProductName                         OrderID
          CategoryID                                                                                      OrderID
                                QuantityPerUnit                     UnitPrice
          CategoryName                                                                              FK1   CustomerKey
                                UnitPrice                           Quantity
          Description                                                                               FK2   EmployeeID
                                UnitsInStock                        Discount
          Picture                                                                                   FK3   ShipperKey                                                     CustomerCustomerDemo
                                UnitsOnOrder                        ExtendedPrice                                                      Customers
                                ReorderLevel                 FK1    ProductKey                            OrderDate
                                                                                                          RequiredDate       PK    CustomerKey                     PK     CustomerDemographicsKey
                                Discontinued                        ProductID
                          FK1   CategoryKey                         ProductName                           ShippedDate
            Suppliers                                                                                     Freight                  CustomerID                             CustomerKey
                                CategoryID                          QuantityPerUnit
                                                                                                          ShipName                 CompanyName                     FK1    CustomerTypeKey
     PK    SupplierKey          CategoryName                        UnitPrice
                                                                                                          ShipAddress              ContactName                            CustomerDesc
                                Description                         UnitsInStock
                          FK2   SupplierKey                         UnitsOnOrder                          ShipCity                 ContactTitle
           SupplierID                                                                                                              Address
           CompanyName          SupplierID                          ReorderLevel                          ShipRegion
                                CompanyName                         Discontinued                          ShipPostalCode           City
           ContactName                                                                                                             Region
           ContactTitle         ContactName                         CategoryKey                           ShipCountry
                                ContactTitle                        CategoryID                                                     PostalCode
           Address                                                                                                                 Country                        CustomerDemographics
           City                 Address                             CategoryName
                                City                                Description                                                    Phone
           Region                                                                                                                  Fax                            PK    CustomerTypeKey
           PostalCode           Region                              SupplierKey
                                PostalCode                          SupplierID                                                     CustomerTypeID
           Country                                                                                                                 CustomerDesc                         CustomerTypeID
           Phone                Country                             CompanyName                                                                                         CustomerDesc
                                Phone                               ContactName                                              FK1   CustomerDemographicsKey
           Fax
           HomePage             Fax                                 ContactTitle
                                HomePage                            Address
                                                                    City
                                                                                                               Shippers
                                                                    Region
                                                                    PostalCode                            PK   ShipperKey
                                                                    Country
                                                                    Phone                                      ShipperID
                                                                    Fax                                        CompanyName
                                                                    HomePage                                   Phone
Dimensional Schema

•   Fact Tables
    –   contain related measures
    –   Usually the largest tables
    –   Usually appended to
    –   Can contain detail or summary data
    –   Measures are usually additive
•   Dimension Tables
    –   Contain descriptors
    –   Utilize business terminology
    –   Textual and discrete data
    –   Attributes through which the table measures are analyzed
Northwind Database – Star Schema – Orders
                                    dimDate
                         PK   DateKey

                              DayDate
                              DayDate_YYYYMMDD
                              DayOfWeekName
                              DayOfWeekNameAbbrv
                              DayNumberInWeek              dimEmployees
                              DayNumberInMonth
                              DayNumberInQuarte       PK   EmployeeKey
                              DayNumberInYear
                              WeekDayIndicator             EmployeeID
                              WeekEndIndicator             LastName
                              Week_YYYYWW                  FirstName
                              WeekNumberInYear             Title
                              Month_YYYYMM                 TitleOfCourtesy
                              MonthName                    BirthDate
                              MonthNameAbbrv               HireDate
                              MonthNumberInYear            Address
                              Quarter_YYYYQ                City
                              QuarterName                  Region
                              QuarterNameAbrv              PostalCode
                              QuarterNumberInYear          Country
   dimOrderDetails            Year                         HomePhone
                                                           Extension
  PK   ProductKey                                          Photo
                                                           Notes
       OrderID                                             ReportsTo
       UnitPrice                                           PhotoPath
       Quantity                      fctOrders
                                                           TerritoryID
       Discount               PK    OrderKey               TerritoryDescription
       ExtendedPrice                                       RegionID
       ProductID              FK3   ProductKey             RegionDescription
       ProductName            FK2   EmployeeKey
       QuantityPerUnit        FK1   CustomerKey
       UnitPrice              FK4   ShipperKey
       UnitsInStock           FK6   OrderDateKey
       UnitsOnOrder           FK5   RequiredDateKey
       ReorderLevel           FK7   ShippedDateKey
       Discontinued                 OrderID
       CategoryID                   ShipVia
       CategoryName                 Freight
       Description                  ShipName                            dimCustomers
       SupplierID                   ShipAddress
       CompanyName                                                 PK    CustomerKey
                                    ShipCity
       ContactName                  ShipRegion
       ContactTitle                                                      CustomerID
                                    ShipPostalCode
       Address                                                           CompanyName
                                    ShipCountry
       City                                                              ContactName
       Region                                                            ContactTitle
       PostalCode                                                        Address
       Country                                                           City
       Phone                                                             Region
       Fax                                                               PostalCode
       HomePage                                                          Country
                                    dimShippers
                                                                         Phone
                               PK    ShipperKey                          Fax
                                                                         CustomerTypeID
                                     ShipperID                           CustomerDesc
                                     CompanyName
                                     Phone
Resources

•   Books
    –   The Data Warehouse Toolkit, Ralph Kimball
    –   The Data Warehouse Lifecycle Toolkit, Ralph Kimball, et al
    –   Data Warehouse Design Solutions, Adamson / Venerable


•   Websites
    –   http://www.ralphkimball.com/
    –   http://www.atre.com
    –   http://www.microsoft.com/sql/evaluation/overview/datawa
        re.asp