ETIS08 - Process Neutral Data Modelling Concepts - Presentation

Document Sample
ETIS08 - Process Neutral Data Modelling Concepts - Presentation Powered By Docstoc
					                     Data Management & Warehousing



                      PROCESS NEUTRAL DATA MODELLING
                                CONCEPTS

                                DAVID M WALKER
                           ETIS COMMUNITY GATHERING
                        13-14 NOVEMBER 2008 - BRUSSELS


© 2008 Data Management & Warehousing   ETIS Community Gathering, Brussels            Page 1
David M Walker                                                              14 November 2008
                     Agenda

•! The Issues With Conventional Data Warehouse
   Data Models
•! Assumptions About The Data Model To Be
   Constructed
•! Requirements Of A Data Warehouse Data Model
•! Constructing The Data Warehouse Data Model




© 2008 Data Management & Warehousing   ETIS Community Gathering, Brussels            Page 2
David M Walker                                                              14 November 2009
                     Data Management & Warehousing



                                THE ISSUES WITH
                          CONVENTIONAL DATA WAREHOUSE
                                  DATA MODELS




© 2008 Data Management & Warehousing   ETIS Community Gathering, Brussels            Page 3
David M Walker                                                              14 November 2009
                     Issues

•! Data models take a long time to develop
•! Data models are expensive to change
        –! Affects Source -> Data Warehouse ETL
        –! Affects Data Warehouse -> Data Mart ETL
•! The design often reflects the first or largest
   source system
        –! This makes it difficult to add other systems
•! They often reflect current working practice
        –! Making it difficult to change when the business does

© 2008 Data Management & Warehousing   ETIS Community Gathering, Brussels            Page 4
David M Walker                                                              14 November 2009
                     Issues

•! A struggle to keep up with rapidly changing
   source system data models
•! Reference data is often not stored in a time
   variant way
•! History is lost with data model changes
•! Queries directly on the data warehouse are
   complex
•! Different rules apply to query each table
•! Different database platforms have different
   needs
© 2008 Data Management & Warehousing   ETIS Community Gathering, Brussels            Page 5
David M Walker                                                              14 November 2009
                     Data Management & Warehousing



                                       ASSUMPTIONS ABOUT
                                        THE DATA MODEL
                                       TO BE CONSTRUCTED




© 2008 Data Management & Warehousing       ETIS Community Gathering, Brussels            Page 6
David M Walker                                                                  14 November 2009
                     Assumptions

•! Used in data warehouse
        –! Not in the operational systems or the data marts
        –! Different style of modelling required
•! Users not going to query the data model
        –! Users will query separate dependent data marts
•! Data will be extracted from the model to
   populate the data marts by ETL tools
•! Data will be loaded into the model from the
   source systems by ETL tools
© 2008 Data Management & Warehousing   ETIS Community Gathering, Brussels            Page 7
David M Walker                                                              14 November 2009
                     Assumptions

•! Direct updates will be prohibited
        –! A separate application or applications will exist as a
           surrogate source and ETL used to load the data
•! Not a ‘mixed mode’ database
        –! Some parts using one data modelling convention and
           other parts using another
        –! This is bad practice with any modelling technique!




© 2008 Data Management & Warehousing   ETIS Community Gathering, Brussels            Page 8
David M Walker                                                              14 November 2009
                     Data Management & Warehousing



                                       REQUIREMENTS OF A
                                        DATA WAREHOUSE
                                          DATA MODEL




© 2008 Data Management & Warehousing       ETIS Community Gathering, Brussels            Page 9
David M Walker                                                                  14 November 2009
                     Requirements

•! Uses A Design Pattern
        –! General reusable approaches and solutions to
           commonly occurring problems that can be used in
           many different situations
•! Convention Over Configuration
        –! Decrease the number of decisions that designers /
           developers need to make, gaining simplicity, without
           losing flexibility
        –! Achieved by ensuring that tables and columns use a
           standard structures, naming convention, etc. and are
           populated and queried in a consistent fashion
© 2008 Data Management & Warehousing   ETIS Community Gathering, Brussels           Page 10
David M Walker                                                              14 November 2008
                     Requirements

•! DRY (Don’t Repeat Yourself)
        –! Reduce duplication because it:
                 •! Increases the difficulty of changing the model
                 •! Decreases the clarity of the model
                 •! Leads to opportunities for inconsistency
•! Static over a long period of time
        –! No need to add or modify tables on a regular basis
        –! Note: There is a difference between designed and
           implemented, it is possible to have designed a table
           but not to implement it until it is actually required

© 2008 Data Management & Warehousing   ETIS Community Gathering, Brussels            Page 11
David M Walker                                                              14 November 2008
                     Requirements

•! The data model should store data at the lowest
   possible level
        –! Information stored at the transaction level
        –! Avoid the storage of aggregates
•! Supports the best use of platform specific
   features without compromising the design
        –! Where available supports:
                 •! Partitioning
                 •! Column Storage
                 •! Many Insert/Few Update strategies

© 2008 Data Management & Warehousing   ETIS Community Gathering, Brussels           Page 12
David M Walker                                                              14 November 2009
                     Requirements

•! Completely time-variant
        –! It should be possible to reconstruct all information at
           any point in time
•! Communication tool
        –! Aids the refinement of requirements
        –! Aids the explanation of possibilities
        –! Develops confidence from the user




© 2008 Data Management & Warehousing   ETIS Community Gathering, Brussels           Page 13
David M Walker                                                              14 November 2009
                     Requirements

•! Uses Standard BI Relational Databases
        –! Ensure that the solution can be deployed on any
           current platform and, if necessary, re-deployed on a
           future platform
•! Process Neutral
        –! It will not reflect past, current or planned business
           processes, practices or dependencies
        –! Stores the data items and relationships as defined by
           their use at the point in time when the information is
           created and acquired

© 2008 Data Management & Warehousing   ETIS Community Gathering, Brussels           Page 14
David M Walker                                                              14 November 2008
                     Data Management & Warehousing



                                       CONSTRUCTING THE
                                       DATA WAREHOUSE
                                         DATA MODEL




© 2008 Data Management & Warehousing      ETIS Community Gathering, Brussels           Page 15
David M Walker                                                                 14 November 2009
                     Who is the customer?

                                                           •! Everyone has a
                                                              different definition
                                                           •! Everyone needs a
                                                              different information
                                                           •! Users have conflicting
                                                              definitions
                                                           •! Customer can be
                                                              individuals or
                                                              businesses

© 2008 Data Management & Warehousing   ETIS Community Gathering, Brussels             Page 16
David M Walker                                                                14 November 2009
                     More problems …

•! Some of the customers are suppliers as well
•! Some businesses have separate divisions that
   have to be handled separately
•! Some customers interact with different divisions
   within our organisation
•! Some individuals or organisations also perform
   other roles
        –! e.g. legal, re-sellers, partners, etc.


© 2008 Data Management & Warehousing   ETIS Community Gathering, Brussels           Page 17
David M Walker                                                              14 November 2009
                     The Party

•! These problem arises because the data is being
   looked at in terms of current business process
•! In fact there is no customer entity, just different
   types of party
        –! Individuals, Organisations, Organisational Units
        –! Concept of Party identical to that in contract law
•! The role of customer is defined not by the table
   definition but by the usage of party data with
   other information held (e.g. the purchase
   transaction relating to a product)
© 2008 Data Management & Warehousing   ETIS Community Gathering, Brussels           Page 18
David M Walker                                                              14 November 2009
                     Attributes of Party

•! The attributes of ‘Party’ will be those that remain
   static over the life of the record
        –! State ID Number, Name, Start Date, End Date
        –! These attributes have ‘lifetime value’
•! Attributes that change need to be stored
   elsewhere
•! The Party table needs to be categorised or typed
        –! Individual, Organisation, Organisation Unit


© 2008 Data Management & Warehousing   ETIS Community Gathering, Brussels           Page 19
David M Walker                                                              14 November 2009
                     PARTIES Data Model
                                                                       PARTIES
                                                                       •!PARTY_DWK
                                                                       •!PARTY_ID
                                                                       •!PARTY_NAME
                                                                       •!PARTY_START_DATE
 PARTY_TYPES                                                           •!PARTY_END_DATE
 •!PARTY_TYPE_DWK                                                      •!PARTY_TYPE_DWK
 •!PARTY_TYPE
 •!PARTY_TYPE_DESC
 •!PARTY_TYPE_GROUP
 •!PARTY_TYPE_START_DATE
 •!PARTY_TYPE_END_DATE




© 2008 Data Management & Warehousing   ETIS Community Gathering, Brussels                           Page 20
David M Walker                                                                              14 November 2009
                     Supporting Non-Lifetime Attributes

•! Need to add data for different Party Types
        –! Marital Status for Individuals
        –! Number of Children for Individuals
        –! Number of Employees for Organisations
        –! Turnover for Organisations
•! Need to add data that changes over the lifetime
   of the party
        –! Usually the same attributes that are needed for
           different Party Types

© 2008 Data Management & Warehousing   ETIS Community Gathering, Brussels           Page 21
David M Walker                                                              14 November 2009
                     PARTY_PROPERTIES Data Model

 PARTIES
 •!PARTY_DWK
 •!PARTY_ID
 •!PARTY_NAME
 •!PARTY_START_DATE                                           PARTY_PROPERTIES
 •!PARTY_END_DATE                                             •!PARTY_DWK
 •!PARTY_TYPE_DWK                                             •!PARTY_PROPERTY_TYPE_DWK
                                                              •!PARTY_START_DATE
                                                              •!PARTY_END_DATE
 PARTY_PROPERTY_TYPES                                         •!PARTY_PROPERTY_VALUE
 •!PARTY_PROPERTY_TYPE_DWK
 •!PARTY_PROPERTY_TYPE
 •!PARTY_PROPERTY_TYPE_DESC
 •!PARTY_PROPERTY_TYPE_GROUP
 •!PARTY_PROPERTY_TYPE_START_DATE
 •!PARTY_PROPERTY_TYPE_END_DATE




© 2008 Data Management & Warehousing   ETIS Community Gathering, Brussels                         Page 22
David M Walker                                                                            14 November 2009
                     Relationships between Parties

•! Parties have relationships
        –! David Walker works in Professional Services
        –! David Walker is employed by Data Management &
           Warehousing
        –! David Walker is married to Helen walker
•! This is known as a Peer-To-Peer relationship
•! This is the first place that we see a role defined
   by a relationship


© 2008 Data Management & Warehousing   ETIS Community Gathering, Brussels           Page 23
David M Walker                                                              14 November 2009
                     PARTY_LINKS Data Model

 PARTIES
 •!PARTY_DWK
 •!PARTY_ID
 •!PARTY_NAME
 •!PARTY_START_DATE
 •!PARTY_END_DATE                                 PARTY_LINKS
 •!PARTY_TYPE_DWK                                 •!PARTY_DWK
                                                  •!LINKED_PARTY_DWK
                                                  •!PARTY_LINK_TYPE_DWK
 PARTY_LINK_TYPES                                 •!PARTY_START_DATE
 •!PARTY_LINK_TYPE_DWK                            •!PARTY_END_DATE
 •!PARTY_LINK_TYPE
 •!PARTY_LINK_TYPE_DESC
 •!PARTY_LINK_TYPE_GROUP
 •!PARTY_LINK_TYPE_START_DATE
 •!PARTY_LINK_TYPE_END_DATE




© 2008 Data Management & Warehousing   ETIS Community Gathering, Brussels           Page 24
David M Walker                                                              14 November 2009
                     Segments of Parties

•! Grouping Parties together because at some
   point in time they shared characteristics
•! This is known as a Peer Group Relationship
•! Examples
        –! Married people with two or more children
        –! IT companies with less than <100 employees
•! Usually generated by analysis and the results
   stored
•! Most commonly seen in market segmentation
   type applications
© 2008 Data Management & Warehousing   ETIS Community Gathering, Brussels           Page 25
David M Walker                                                              14 November 2009
                     PARTY_SEGMENTS Data Model

 PARTIES
 •!PARTY_DWK
 •!PARTY_ID
 •!PARTY_NAME
 •!PARTY_START_DATE
 •!PARTY_END_DATE                                 PARTY_SEGMENTS
 •!PARTY_TYPE_DWK                                 •!PARTY_DWK
                                                  •!PARTY_SEGMENT_TYPE_DWK
                                                  •!PARTY_START_DATE
 PARTY_SEGMENT_TYPES                              •!PARTY_END_DATE
 •!PARTY_SEGMENT_TYPE_DWK
 •!PARTY_SEGMENT_TYPE
 •!PARTY_SEGMENT_DESC
 •!PARTY_SEGMENT_GROUP
 •!PARTY_SEGMENT_START_DATE
 •!PARTY_SEGMENT_END_DATE




© 2008 Data Management & Warehousing   ETIS Community Gathering, Brussels            Page 26
David M Walker                                                               14 November 2009
                     Understanding The Conventions

•! All Type tables have the same format
        –! Categorisation
•! All Property tables have the same format
        –! Time Variant Attributes
•! All Link tables have the same format
        –! Peer-To-Peer Relationships
•! All Segment tables have the same format
        –! Peer Group Relationships
•! There are no other significant clusters of data about a
   single entity such as Party

© 2008 Data Management & Warehousing   ETIS Community Gathering, Brussels           Page 27
David M Walker                                                              14 November 2009
                     Introducing Major Entities

•! Party is a Major Entity
        –! These are entities that exist regardless of the
           business process
        –! It is the relationships between major entities that are
           defined by business processes
        –! Major Entity attributes differ from one another
•! All Organisations only need a finite number of
   major entities including:
           –! Campaign           –! Asset                                   –! Geography
           –! Account            –! Channel                                 –! Product/Service
           –! Electronic Address –! Contract                                –! Calendar
© 2008 Data Management & Warehousing   ETIS Community Gathering, Brussels                           Page 28
David M Walker                                                                              14 November 2009
                     Data Models For Other Major Entities

•! Geography
        –! Geography Types
                 •! Postal Addresses, GPS Co-ordinates, ELR
        –! Geographic Property Types
        –! Geographic Properties
        –! Geographic Link Types
        –! Geographic Links
        –! Geographic Segment Types
        –! Geographic Segments
•! and so on for every major entity
© 2008 Data Management & Warehousing   ETIS Community Gathering, Brussels           Page 29
David M Walker                                                              14 November 2009
                              Major Entity Sub Model
     Major Entity Sub-Model
                                                      MAJOR ENTITY                 MAJOR ENTITY
                                                       PROPERTIES                 PROPERTY TYPES


                              MAJOR ENTITY


                                                      MAJOR ENTITY                 MAJOR ENTITY
                                                         LINKS                      LINK TYPES




                              MAJOR ENTITY            MAJOR ENTITY                 MAJOR ENTITY
                                 TYPES                 SEGMENTS                   SEGMENT TYPES



© 2008 Data Management & Warehousing         ETIS Community Gathering, Brussels                      Page 30
David M Walker                                                                               14 November 2009
                     Relationships Between Major Entities

•! Storing names with multiple addresses and
   multiple electronic addresses (e-mail, telephone
   numbers, etc.)
        –! Billing, Contact, Home, Work, etc
•! Usage
        –! Party -> Contract -> Account -> Electronic Address ->
           A Number -> Usage
        –! Party -> Contract -> Account -> Electronic Address ->
           B Number -> Usage
        –! Product/Service -> Tariff - Usage
© 2008 Data Management & Warehousing   ETIS Community Gathering, Brussels           Page 31
David M Walker                                                              14 November 2009
                     Party -> (Electronic) Addresses
                                  PARTY_
                                  ADDRESS_
                                  HISTORY_
                                  TYPES


                                             PARTY_                                  ADDRESS
                                             ADDRESS_
                                             HISTORY

           PARTY




                                             PARTY_                                 ELECTRONIC
                                             ELECTRONIC_ADDRESS_                     ADDRESS
                                             HISTORY

                              PARTY_
                              ELECTRONIC_ADDRESS_
                              HISTORY_
                              TYPES
© 2008 Data Management & Warehousing           ETIS Community Gathering, Brussels                 Page 32
David M Walker                                                                            14 November 2009
                     Party -> Usage (Simplified)

              PARTY                               CONTRACT                                      ACCOUNT




PRODUCT SERVICE                                    PRODUCT                                ACCOUNT
TARIFF                                             SERVICE                                ELECTRONIC ADDRESS
HISTORY & TYPE                                                                            HISTORY & TYPE




                                                                               A Number
              TARIFF                                                                           ELECTRONIC
                                       USAGE HISTORY                                            ADDRESS
                                                                               B Number




© 2008 Data Management & Warehousing      ETIS Community Gathering, Brussels                                Page 33
David M Walker                                                                                      14 November 2009
                     Extending the Data Model

•! Identify as many Major Entities as possible
        –! But remember there are only a finite number so don’t
           invent things for the sake of it
•! Define the standard sub-model around them
•! Put appropriate data in the sub-model
•! Create the relationships to _HISTORY tables for
   the transaction the business wants to analyse



© 2008 Data Management & Warehousing   ETIS Community Gathering, Brussels           Page 34
David M Walker                                                              14 November 2009
                     Does this help meet requirements?

!! Uses A Design Pattern
!! Convention Over Configuration
!! DRY (Don’t Repeat Yourself)
!! Static over a long period of time
!! The data model should store data at the lowest possible
   level
!! Supports the best use of platform specific features
   without compromising the design
!! Completely time-variant
!! Communication tool

© 2008 Data Management & Warehousing   ETIS Community Gathering, Brussels           Page 35
David M Walker                                                              14 November 2009
                     Some Key Elements

•! Self Similar modelling
        –!   All _TYPE tables have the same structure, etc.
        –!   Naming conventions are consistent everywhere
        –!   Easy to create standard algorithms for load and extraction
        –!   Easy to partition on type and/or date
•! Insert ‘heavy’ / Update ‘light’
        –! Most ETL will result in an insert, there will be very few updates
•! Manages ‘Slowly Changing Dimensions’
        –! Inherent in the Major Entity Sub-Model design
        –! Significantly reduces overhead in the Data Mart build
•! Data Driven
        –! Types provide extensible metadata
        –! Prevents un-necessary updating of the data model itself
•! Natural Star Schemas
        –! Histories will map to FACTS,
        –! Major Entity Collections will collapse into DIMENSIONS
© 2008 Data Management & Warehousing   ETIS Community Gathering, Brussels              Page 36
David M Walker                                                                 14 November 2008
                     Is this all there is to it ?

•! At a high level – YES
•! BUT:
        –! There are methods for dealing with data quality
        –! Special case methods for some lifetime attributes
                 •! e.g. Handling women changing their names at marriage
        –! Insert/Update methods for performance
        –! Design Patterns for implementation
        –! Other detailed techniques
•! This talk could only ever be:
                           “An introduction to
                     Process Neutral Data Modelling”
© 2008 Data Management & Warehousing   ETIS Community Gathering, Brussels           Page 37
David M Walker                                                              14 November 2008
                     Further Reading

•! Available From http://www.datamgmt.com
•! White Papers
        –! Overview Architecture for Enterprise Data Warehouses
                 •! March 2006 - 32 pages
        –! Data Warehouse Documentation Roadmap
                 •! April 2007 – 28 pages
        –! How Data Works
                 •! June 2007 – 32 Pages
        –! Data Warehouse Governance
                 •! April 2007 – 24 Pages
        –! Data Warehouse Project Management
                 •! October 2008 – 32 Pages
        –! Process Neutral Data Modelling

© 2008 Data Management & Warehousing          ETIS Community Gathering, Brussels           Page 38
David M Walker                                                                     14 November 2009
                     Data Management & Warehousing


                                       Thank you !!

        Website:   http://www.datamgmt.com
        Phone:     +44 7050 028 911
        E-mail:    davidw@datamgmt.com
        Skype/MSN: datamgmt

© 2008 Data Management & Warehousing      ETIS Community Gathering, Brussels           Page 39
David M Walker                                                                 14 November 2008

				
DOCUMENT INFO
Shared By:
Tags:
Stats:
views:26
posted:4/24/2012
language:
pages:39