Datastage Architect Resume Company A Generalized Lesson in LOGO ETL Architecture Presented by

Document Sample
Datastage Architect Resume Company A Generalized Lesson in LOGO ETL Architecture Presented by Powered By Docstoc
					Company   A Generalized Lesson in
LOGO
          ETL Architecture




                Presented by Wes Dumey
            Durable Impact Consulting, Inc.
                            June 11, 2007
    Agenda


1. ETL Overview

2. When is ETL Appropriate?

3. Tools vs. Hard Coding

4. ETL Architecture
• ETL Overview
  – 20 Minutes
• ETL Design Tips
  – 20 Minutes
• Demonstration
  – 20 Minutes


• Ask questions at any time
            Speaker Biography
• Senior Consultant, Durable Impact Consulting,
  Inc.
• Experience on high-performance data
  warehouses
• Education
   – B.S. in Computer Information Systems
      • Missouri State University
   – M.A. in Business Economics (in progress)
      • University of South Florida
• External interests: aviation (Private pilot) and
  economics
             ETL Overview
• Extract Transform and Load is used to populate a data
  warehouse
• Extract is where data is pulled from source systems
   – SQL connect over networks
   – Flat files
   – Transaction messaging (MSMQ)
• Transformations can be the most complex part of data
  warehousing
   – Convert text to numbers
   – Apply business logic in this stage
• Load is where data is loaded into the data warehouse
   – Sequential or bulk loading
           ETL?

• Many companies find value in the
  graphical representation of data and use it
  in other applications as well
• ETL is very efficient when designed
  correctly
          ETL Tools vs. Hard Coding

• Many shops still use hard code (triggers,
  procedures, code blocks)
• Hard to maintain code
• Hard to scale properly
• ETL tools easy to visualize flows
• With SSIS, there is no good reason to not
  use an ETL tool
What is going on here?
            ETL Design Methodology

• Steps for successful ETL Design
  1.   Clear and concise requirements
  2.   Modularized design
  3.   Data cleansing capability
  4.   High Emphasis on Data Quality
  5.   Functional Testing
  6.   Sufficient Documentation
• See the methodology document
                ETL Methodology Steps

1.   Extract the data – pulls data
2.   Load PSA and audit tables
3.   Source Load Temp – sources and cleanses data
4.   Lookup Dimensions – extract records for update
5.   Lookup Facts
6.   Transform Facts
7.   Transform Dimensions
8.   Quality Check
9.   Load
                  Design Considerations
•   Naming conventions and comments
•   Standard approaches allow for:
     – Quick, micro-batch processing (if desired)
     – Ability to pause/resume, resurrection
•   Data cleansing
•   Legal requirements (HIPAA, SOX)
•   Industry-standard best practices
•   Data retention
     – Archive vs Purge
•   Quality
Demonstration
            Let’s Get Started

•   Gather Functional Requirements
•   Build the Data Model
•   Write Technical Specification
•   Construct
•   Test

Follows the systems development lifecycle
IECT
               Closing Info
• Presenter Information
• Blog
   – www.thedamndata.com “A techies’ discussion of databases,
     datawarehouses, and the damn data itself”
   – Covering SQL Server 2005, Oracle, IBM Websphere DataStage
     ETL tool, SSIS, and whatever the hell else is on my mind
   – Check it out – funny and hopefully informative
• Corporate Information
   – www.durableimpact.com – Durable Impact Consulting
• Presenting finalized EDW at Tampa Code Camp
                 Cycle Diagram
Add Your Text

                          Text
                                           Text




         Text          Cycle Name



                                    Text

                Text
      Progress Diagram

Phase 1    Phase 2   Phase 3
       Block Diagram


TEXT      TEXT   TEXT    TEXT




TEXT      TEXT    TEXT    TEXT
          Table

          TEXT   TEXT   TEXT   TEXT   TEXT

Title A

Title B

Title C

Title D

Title E

Title F
  3-D Pie Chart


          TEXT

                  TEXT
 TEXT


                  TEXT

TEXT
        TEXT
       Marketing Diagram

                Title




TEXT     TEXT           TEXT   TEXT

				
DOCUMENT INFO
Shared By:
Categories:
Tags:
Stats:
views:288
posted:1/19/2011
language:English
pages:20
Description: Datastage Architect Resume document sample