					White Paper: DB2 to Oracle Migration
MacDB2O 2006: Version – February 1, 2005

1. Overview
This document reviews Macrosoft’s process methodology for migrating IBM Mainframe DB2 databases to
Oracle. Much of our work in this area has been done in conjunction with projects intended to migrate
mainframe applications to server environments (Unix, Linux, Windows). We have our own migration tools
(MMK – Mainframe Migration Toolkit), as well as expertise in using industry-standard tools such as Oracle
Migration Workbench, Micro Focus Revolve, etc.

We are development partners of IBM and Oracle, and are also partners of Micro Focus, the Migration
Transformation Consortium (MTC) for legacy migrations, and the Mainframe Migration Alliance (MMA).

The methodology we use requires minimal interaction with the client’s mainframe production
environment, thereby reducing costs and minimizing interruptions of the production system. This is
achieved through a phased activity, in which most of the analysis, tool creation, mock conversion, and
testing work is done on a PC (back office). Our global delivery model (offshore, near-shore, and onsite)
further minimizes costs for the client.

2. Process Phases
The migration process generally includes the following steps:

         1.   Creation of overall Project Plan
         2.   Creation of overall Migration Plan
         3.   Database Schema Migration
         4.   Data Migration

These are explained below:

2.1 Creation of Overall Project Plan
This involves the following activities:

        Determine the migration scenario
        Identify all migration tasks
        Develop infrastructure sustaining processes
        Determine the training requirements
        Determine the resources that will be required (HW, SW, and people)
        Determine QA & Support processes and teams
        Identify customer responsibilities
        Determine the MVS change management process
        Determine the environments (Test, Development, Production)
        Determine the network connectivity (VPN) to both source and target systems
        Determine the amount of data to migrate
        Determine the amount of customization for online and batch
        Archive historical data to reduce the amount of data to migrate
        Determine the parallel testing strategy
        Determine details of integration with other systems
        Determine naming conventions and process standards
          [e.g., Oracle database object names can be 30 characters long, whereas DB2 names are limited to 18 characters]
        Prepare a prototype target environment
        Test the migration scenario
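
The naming-convention item in the list above can be checked mechanically before any DDL is generated. The following is a minimal sketch, assuming the 30- and 18-character limits quoted above apply to the releases in use (limits vary by release and object type). The function `find_overlong_names`, the prefix, and the sample object names are hypothetical.

```python
# Limits as quoted in this document; verify them for the releases actually in use.
ORACLE_MAX_NAME_LEN = 30
DB2_MAX_NAME_LEN = 18


def find_overlong_names(names, prefix="", max_len=ORACLE_MAX_NAME_LEN):
    """Apply a (hypothetical) naming-standard prefix and return the
    resulting identifiers that would exceed the target length limit."""
    renamed = [f"{prefix}{name}".upper() for name in names]
    return [name for name in renamed if len(name) > max_len]


# An 18-character DB2 name fits in Oracle as-is, but not once a
# 13-character prefix from a new naming standard is added.
flagged = find_overlong_names(["CUSTOMER_MASTER_TB", "ORD"], prefix="MIGRATED_APP_")
```

Running the check during planning, rather than at object-creation time, lets the naming standard be adjusted before conversion scripts depend on it.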

2.2 Creation of Overall Migration Plan
This involves the following activities:

       Analyze customization and platform differences
       Develop an infrastructure plan
       Design the layout of the databases, table-spaces, and tables
       Estimate CPU and Storage (disk & memory) sizes
       Analyze and choose migration tools
       Analyze the fixes for present and future environments
       Develop plans for:
                     Print migration
                     Batch migration
                     Testing (Test Criteria & Strategy)
       Determine the required customization of source applications
       Determine the required conversion of JCL etc. to shell scripts
       Determine the conversion/customization of third-party tools
       Determine possible EBCDIC-ASCII conversion issues
       Develop plans for:
                     Security
                     Backup and disaster recovery
                     Administration
                     Database monitoring
                     Performance monitoring
                     Change management
                     Stress test
                     Incident tracking
                     Vulnerability assessment
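
One of the EBCDIC-ASCII issues mentioned in the list above is that character data not only needs decoding but also sorts differently on the two platforms. The sketch below illustrates this with Python's built-in `cp037` codec. The choice of code page is an assumption (the actual mainframe code page must be confirmed), and the helper names are hypothetical. Binary and packed-decimal fields must not be decoded this way.

```python
def ebcdic_to_text(raw: bytes, codepage: str = "cp037") -> str:
    """Decode EBCDIC character data. cp037 is an assumed code page."""
    return raw.decode(codepage)


def ebcdic_sort_key(value: str, codepage: str = "cp037") -> bytes:
    """Sort key that reproduces the mainframe collating sequence."""
    return value.encode(codepage)


values = ["a1", "A1", "11"]

# ASCII order: digits, then uppercase, then lowercase.
ascii_order = sorted(values)

# EBCDIC order: lowercase, then uppercase, then digits.
ebcdic_order = sorted(values, key=ebcdic_sort_key)
```

Batch programs or reports that depend on the mainframe sort order may therefore produce differently ordered output after migration, even when every character converts correctly.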

2.3 Database Schema Migration
The phases involved in DB schema migration are shown in Fig. 1 and explained below:

2.3.1 Extract Database Metadata
This phase involves extracting and analyzing the source database structure from the source DDL
statements. Modeling tools and reverse engineering can help in capturing all details of the schema.

2.3.2 Convert Database Objects
This is the major step of the schema migration process. All database objects in the source database need
to be converted to the equivalent objects in the target system. Typically objects such as data types,
tables, columns, views, indexes, stored procedures, triggers, packages, sequences, authorities, functions
etc. need to be converted. Factors such as data type, scale, precision, length, and default values for table
columns, functions, and stored procedures, as well as null values, can cause issues. Refer to Section 4 for
examples of terminology differences between DB2 and Oracle.
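
The data type part of object conversion can be sketched as a lookup table. This is an illustrative mapping only, not the mapping used by any particular tool: the target types chosen here (for example NUMBER(5) for SMALLINT) are assumptions that should be reviewed against the actual data, and `map_column_type` is a hypothetical helper.

```python
import re

# Illustrative DB2 -> Oracle type mapping (assumed; review per project).
TYPE_MAP = {
    "SMALLINT": "NUMBER(5)",
    "INTEGER": "NUMBER(10)",
    "DECIMAL": "NUMBER",
    "CHAR": "CHAR",
    "VARCHAR": "VARCHAR2",
    "DATE": "DATE",
    "TIMESTAMP": "TIMESTAMP",
}

_TYPE_PATTERN = re.compile(r"\s*([A-Za-z]+)\s*(\(.*\))?\s*")


def map_column_type(db2_type: str) -> str:
    """Translate one DB2 column type, keeping length/precision arguments.
    Unknown types raise ValueError so that they are reviewed manually."""
    match = _TYPE_PATTERN.fullmatch(db2_type)
    if not match:
        raise ValueError(f"cannot parse type: {db2_type!r}")
    base = match.group(1).upper()
    args = match.group(2) or ""
    if base not in TYPE_MAP:
        raise ValueError(f"no mapping defined for {base}; review manually")
    return TYPE_MAP[base] + args
```

Failing loudly on unmapped types is a deliberate choice in this sketch: silently passing a type through would hide exactly the precision and length issues described above.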

2.3.3 Convert Queries
This is the next major phase in the database schema migration process. Even though the basic SQL
commands are the same, SQL differs from engine to engine (refer to Section 4 for examples of
differences between Oracle PL/SQL and DB2 SQL). SQL translation requires good expertise and
knowledge of both the source and target systems in order to avoid performance issues.
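
As an illustration of such dialect differences, the sketch below rewrites three common DB2 constructs into Oracle equivalents: the `SYSIBM.SYSDUMMY1` dummy table (Oracle: `DUAL`), the `CURRENT TIMESTAMP` special register (Oracle: `SYSTIMESTAMP`), and a trailing `FETCH FIRST n ROWS ONLY` clause (expressed here with `ROWNUM`). The function `db2_to_oracle` is hypothetical, and the regex-based approach is an assumption made for brevity: it does not parse SQL, so it would also rewrite text inside string literals, and every converted statement still needs review for correctness and performance.

```python
import re

# Simple token-level rewrites (assumed sufficient only for plain statements).
SIMPLE_REWRITES = [
    (re.compile(r"\bSYSIBM\.SYSDUMMY1\b", re.IGNORECASE), "DUAL"),
    (re.compile(r"\bCURRENT\s+TIMESTAMP\b", re.IGNORECASE), "SYSTIMESTAMP"),
]

# A row-limiting clause at the very end of the statement.
FETCH_FIRST = re.compile(r"\s+FETCH\s+FIRST\s+(\d+)\s+ROWS?\s+ONLY\s*$", re.IGNORECASE)


def db2_to_oracle(sql: str) -> str:
    """Rewrite a few DB2-specific constructs into Oracle syntax."""
    sql = sql.strip().rstrip(";")
    for pattern, replacement in SIMPLE_REWRITES:
        sql = pattern.sub(replacement, sql)
    match = FETCH_FIRST.search(sql)
    if match:
        # Wrap the query so that ROWNUM is applied after any ORDER BY.
        inner = sql[: match.start()]
        sql = f"SELECT * FROM ({inner}) WHERE ROWNUM <= {match.group(1)}"
    return sql
```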

2.3.4 Implement Converted Objects
This phase involves building the database structure on the target platform through scripts or the facilities
provided in the target system. Enhancements related to the schema or performance can also be
considered in this phase, utilizing the special features in the target system.

2.4 Data Migration
The phases involved in data migration are shown in Fig. 2 and explained below:

2.4.1 Data Analysis
This phase involves a walkthrough of the data presently in the database (or in use). Some data that is
well accommodated in the source system may not be accommodated in the target system. Usually the
volume of data is large and a full walkthrough may not be possible. In such cases random samples are
taken to identify data items that can cause problems during movement.
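
The random-sampling approach can be sketched as follows. Reservoir sampling is used here because it draws a uniform sample in one pass without knowing the table size in advance; this technique is an assumption of the sketch, not necessarily the one used in practice. The function names and the example check are hypothetical. The check shown flags empty strings, which Oracle stores as NULL in VARCHAR2 columns.

```python
import random


def reservoir_sample(rows, k, seed=None):
    """Return up to k rows chosen uniformly at random from an iterable."""
    rng = random.Random(seed)
    sample = []
    for i, row in enumerate(rows):
        if i < k:
            sample.append(row)
        else:
            # Replace an existing entry with probability k / (i + 1).
            j = rng.randint(0, i)
            if j < k:
                sample[j] = row
    return sample


def find_problem_rows(rows, checks):
    """checks maps a label to a predicate that returns True for a
    row likely to cause trouble in the target system."""
    return [(label, row) for row in rows
            for label, check in checks.items() if check(row)]


# Hypothetical check: empty strings become NULL in Oracle.
CHECKS = {"empty_string_becomes_null": lambda row: any(v == "" for v in row)}
```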

2.4.2 Data Cleanup / Enrichment
A data cleanup / enrichment prior to migration can help in effective movement of data. There could be
obsolete or unused items, as well as items which will not affect the source or target system if modified. If
this step is performed well in advance, the subsequent phases in this process will benefit significantly.

2.4.3 Conversion Study
This phase involves assimilation of the outputs from the above two phases, and detailed study for
finalization of a conversion strategy. This phase can be categorized into the following steps:

       Fitment / Conversion Study – Output of this phase is a study report detailing the changes
        required in the data items for the movement.

       Formation of Migration Strategy – Output of this phase is the “Migration Strategy Document”
        detailing the planned process of migration, tools planned to be used etc.

       Finalization of Scope of Migration – In this phase, the scope of migration is defined. Items such
        as scope, limitations, performance and maintenance issues etc. need to be well defined.

       Finalization of Acceptance Criteria – This phase will define the acceptance / test criteria, test
        process and test procedures to ensure that the data movement is fault free.

       Conversion Strategy Signoff – In this phase, the user (client) approves all the documents
        mentioned above. This phase is crucial when handling critical data.

2.4.4 Conversion Tool Preparation
In this phase, the tools required for the data movement are developed (or customized). In production
systems, tools are very crucial since the final data movement is done in one shot (usually in 1 or 2 days
during off hours or holidays). The tools preparation is a full project activity of its own involving all phases
of SDLC.

2.4.5 Mock Conversion
In this phase, a mock conversion is performed, using the existing data in the source system. This may
involve several rounds as below:

               Mock Conversion Round 1
               Fixing of mismatches observed in round 1
               Mock Conversion Round 2
               Fixing of mismatches observed in round 2

It is very important to document the change records during this phase.
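
Mismatches between rounds can be detected by reconciling source and target extracts. The following is a minimal sketch, assuming each row is a tuple whose first element is a unique key and that extracts fit in memory. The normalization rules (trailing blanks ignored, NULL and empty string treated alike) are assumptions that should match the agreed acceptance criteria, and the function names are hypothetical.

```python
import hashlib


def row_digest(row):
    """Hash a row after normalizing values that legitimately differ
    between platforms (assumed rules: CHAR padding, NULL vs. '')."""
    normalized = "|".join("" if v is None else str(v).rstrip() for v in row)
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def reconcile(source_rows, target_rows, key_index=0):
    """Compare two extracts and report keys that are missing,
    unexpected, or whose contents differ."""
    src = {row[key_index]: row_digest(row) for row in source_rows}
    tgt = {row[key_index]: row_digest(row) for row in target_rows}
    return {
        "missing_in_target": sorted(src.keys() - tgt.keys()),
        "unexpected_in_target": sorted(tgt.keys() - src.keys()),
        "changed": sorted(k for k in src.keys() & tgt.keys() if src[k] != tgt[k]),
    }
```

The resulting report can be attached to the change records for each round, so that the fixes applied between rounds remain traceable.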

2.4.6 Conversion for Parallel Run
Usually a pre-production system is set up for a parallel run, to which the data can be migrated to
ensure that the migration is problem-free. In this phase a one-shot data migration from the source system
to the pre-production system is performed. Detailed testing is carried out to ensure that the data migration
is fault free. Detailed performance testing and monitoring are also done in this phase.

2.4.7 Conversion for Live System
This is the final step of actual data movement from source system to target system. In production
systems, this should be done in one shot when the system is not active (usually off hours or holidays). In
24x7 systems, the system may have to be brought down to off-line mode for the data movement.

3. General Milestones and Deliverables
Generally we envisage documents that include the following:

                Overall Project Plan Document
                Overall Migration Plan Document
                System Overview document
                Plan – Sign Off

Analysis & Design
            Gap Analysis document
            SRS - Migration Specification document
            Data Migration Strategy document
            Migration Tools (Design/Test/Usage) documents
            User Acceptance Test Plan (UAT) document
            Test Criteria, Plan & Procedures
            Migration Strategy & Tools – Sign Off
            Design – Sign Off

Schema Migration
          Schema Migration Reports
          Unit Test Reports (Schema Validation)
          Schema Migration – Sign Off

Data Migration
            Mockup Data Migration Reports (including cleanup details)
            Unit Test Reports
            Data Migration – Sign Off

User Acceptance Test
          Test Reports
          UAT – Sign Off

Installation/Live Run
            Live Migration Test Reports
            Live Cutover – Sign Off

Delivery & Post-Implementation Support
            Parallel Run Reports
            Tools & Application Software developed
            All other documents
            Project – Sign Off

4. Examples of Differences between DB2 and Oracle

    Terminology    DB2                                      Oracle

    Database       A subsystem can have more than one       Each instance has one database
                   database. Databases are used to          and one set of system catalog
                   logically group application data. All    tables.
                   databases share the same system
                   catalogs, system parameters, and
                   processes in the subsystem. DBADM
                   authority is granted on the database
                   level. SYSADM authority is granted at
                   the subsystem level.
    Tablespace     A database is logically divided into     A database is logically divided into
                   tablespaces. There are several           tablespaces. A tablespace can point
                   tablespace types: simple, segmented,     to one or more physical database
                   partitioned and large partitioned (for   files on disk. One or more tables
                   16 TB tables). A non-partitioned         can reside in a tablespace.
                   tablespace points to one physical
                   VSAM file on DASD. A partitioned
                   tablespace points to one VSAM file
                   per partition on DASD. A segmented
                   or simple tablespace can contain one
                   or more tables.
    Blocks         Equivalent to pages; 4 K, 8 K, 16 K,     The smallest unit of database
                   32 K.                                    storage. Database files are
                                                            formatted into blocks, which can be
                                                            from 2 K to 16 K.
    Extents        The unit by which storage is allocated   The unit by which storage is
                   for a VSAM file. The size of the         allocated in a database file. The
                   primary and secondary extents is         size of the primary and secondary
                   specified in the CREATE                  extents is specified in the STORAGE
                   TABLESPACE statement. A VSAM             clause of the CREATE TABLE or
                   file can grow up to a maximum of 119     CREATE INDEX statements or
                   secondary extents. Extents are made      defaults to the sizes specified in the
                   up of contiguous pages.                  CREATE TABLESPACE statement.
                                                            Extents are allocated until there is
                                                            no more free space in the files that
                                                            make up the tablespace, or the
                                                            maximum number of extents has
                                                            been reached. The size of
                                                            the file is specified in the CREATE
                                                            TABLESPACE statement. Extents
                                                            are made up of contiguous blocks.
    Stogroups      A series of DASD volumes assigned a      No equivalent.
                   unique name and used to allocate
                   VSAM datasets for DB2 tablespaces
                   and indexes.
    Stored         Stored procedures are written in C,      Written in PL/SQL, Java etc.
    Procedures     C++, COBOL, Assembler, PL/I or the       Stored procedures are stored in an
                   new DB2 SQL Stored Procedure             Oracle table and executed from
                   language. The compiled host              within the database.
                   language is stored on the DB2 server
                   and the compiled SQL is stored on
                   the database.
    Plan           A plan is an executable module of        No equivalent.
                   SQL that is composed of one or more
                   packages and was created from a
                   DBRM. A DBRM is a module of
                   uncompiled SQL statements that were
                   extracted from the source program by
                   pre-compilation. A DBRM is bound
                   into a plan or a package.
    Clusters       No equivalent.                           Clusters are an optional method of
                                                            storing data. This approach creates
                                                            an indexed cluster for groups of
                                                            tables frequently joined. Each value
                                                            for the cluster index is stored only
                                                            once. The rows of a table that
                                                            contain the clustered key value are
                                                            physically stored together on disk.
    Clustering     An index created on a column of a        No equivalent.
    Index          table where the data values are
                   stored in the same physical sequence
                   as the index.
                   Allows for fast sequential access.
    Secondary      Secondary Authid or RACF Group.          No direct equivalent in Oracle.
    Authid         Privileges can be granted to a           Groups of privileges known as roles
                   secondary authid. Primary authids are    can be granted to a user ID.
                   assigned to the secondary authid
                   Group. Primary authids inherit all
                   privileges granted to the secondary
                   authid (group) they are in.
    Package        A package consists of a single           No equivalent as known in Oracle. A
                   program of executable SQL and the        “package” in Oracle has another
                   access paths to that SQL. The            meaning. A package is written in
                   package is stored on the database        PL/SQL and allows you to group all
                   and invoked by the host language         related programming such as stored
                   executable. A package is created by      procedures, functions, and variables
                   doing a BIND. A package may be part      in one database object that can be
                   of a PLAN.                               shared by applications.
    Other          PRIQTY                                   INITIAL
    examples of    SECQTY                                   NEXT
    differences    SMALLINT/DECIMAL                         NUMBER
                   FREEPAGE                                 FREELIST
                   etc.                                     etc.
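
The PRIQTY/SECQTY to INITIAL/NEXT correspondence in the last table row can be sketched as a small DDL helper. It assumes that the DB2 quantities are expressed in kilobytes and that non-positive or missing values mean "use the default", in which case the parameter is omitted. The function `storage_clause` is hypothetical, and whether Oracle honors the resulting values exactly depends on how the target tablespace manages extents.

```python
def storage_clause(priqty_kb, secqty_kb):
    """Build an Oracle STORAGE clause from DB2 PRIQTY/SECQTY values
    (assumed to be in kilobytes). Missing or non-positive values are
    treated as 'use the default' and left out."""
    parts = []
    if priqty_kb is not None and priqty_kb > 0:
        parts.append(f"INITIAL {priqty_kb}K")
    if secqty_kb is not None and secqty_kb > 0:
        parts.append(f"NEXT {secqty_kb}K")
    return f"STORAGE ({' '.join(parts)})" if parts else ""


clause = storage_clause(720, 72)
```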

5. Database Migration – General Questionnaire (Example)

A. Customer Data
Customer Name:___________________________________ Phone no.:____________________
Contact Person: ____________________________________Fax no.: ______________________

B. Technical Data

B.1 Source System
Hardware Model ___________________           Operating System Name _____________________
Operating System Version ____________        Database DB2       Database Version ___________
Size of Production Database ____________ No. of Concurrent Users in Production ____________
Avg. No. of On-line Transactions in Production per hour __________________________________
No. of Batch Processes _____________
(a) Production No. of Databases ________ No. of Tablespaces ________ Size _______
(b) Test No. of Databases ________ No. of Tablespaces ________ Size _______
(c) Development No. of Databases ________ No. of Tablespaces ________ Size _______
(d) Other ___________ No. of Databases ________ No. of Tablespaces ________ Size _______
List Names and Number of rows for the 10 largest tables in Production:
__________________ ____________
__________________ ____________
__________________ ____________
__________________ ____________
__________________ ____________
__________________ ____________
Have any stored procedures been written to access the database? ( ) Yes ( ) No   How many? _________
What is the average number of SQL calls per stored procedure? ____________
How long are the stored procedures (total number of statements)? ____________
Have any triggers been written to access the database? ( ) Yes ( ) No   How many? ________
What is the average number of SQL statements per trigger? _____________
How long are the triggers (total number of statements)? _____________
Is an archival process in place? ( ) Yes ( ) No
Brief description of Hardware Configuration _____________________________________________

Brief description of application, third party tools & Host languages __________________________


B.2 Target System
Hardware __________________________              Operating System Name ____________________
Operating System Version ____________            Database: Oracle   Database Version _________
Are migration tools such as “Oracle Migration Workbench” available? ( ) Yes ( ) No

Brief description of Hardware Configuration _____________________________________________

Brief description of application, third party tools & Host languages __________________________