Phooey on GUI! Using HPL from the command line

Document Sample
Phooey on GUI! Using HPL from the command line Powered By Docstoc
					Informix User Forum 2005         Moving Forward With Informix




               Phooey on GUI!
   Using HPL from the command line

                          Jerry Hamilton
     Director of Database Administration
            Fleishman-Hillard, Inc



       Atlanta, Georgia                 December 8-9, 2005
 Using HPL from the command line

             •General introductions
             •All about the tool
             •Speed and flexibility
             •The Onpload database
             •The Onpload utility
             •Simple Script and the Onpload utility
             •Management of HPL jobs
             •Logs, messages and helpful how-to’s
December 8,9 2005                                     2
General Introductions

•I’ve been using Informix products since 1992
•Active member in the SLAIUG
•Former IIUG Board member
•Former President of the IIUG
•Avid HPL user
•Old, grey hair, guy who lives on the “command line”




December 8,9 2005                                      3
Old guy




December 8,9 2005   4
  All about the tool

  •Overview
  •Architecture
  •Components




December 8,9 2005      5
Overview

• Supports COBOL, ASCII, multibyte, delimited or binary
  data
• IPLOAD provides a generate option
• Can unload/load data from different locales
• Support for unloading data with a query using a Select
  statement
• Supports loading raw tables in express mode




December 8,9 2005                                          6
Architecture

•   Load straight from or to disk
•   Prepare data to a pipe and Load
•   Unload/Load to gzip
•   Unload to another HPL load process




December 8,9 2005                        7
                          File                  Pipe




                              Onpload Utility

           IPLOAD




                              Server




                    Onpload             Target



December 8,9 2005                                      8
Components

• HPL consists of the onpload utility, ipload and the onpload database
    – Onpload
         •   Converts, filters and moves data between a database and a storage device
         •   Uses information from the onpload database to run load and unload jobs
         •   Flags information about data during a load
         •   Onpload can be used from the command line or from ipload
    – Ipload
         • Creates and manages the onpload database
         • Creates and stores information into the database
         • Vehicle used to manage the onpload components
    – Onpload database
         • Holds information about load and unload jobs




December 8,9 2005                                                                       9
 Speed and Flexibility

  •What other data mover tools are available
       •Dbload, “unload to”, dbexport
  •General HPL uses
       •Backup tool, Reorg tool, data movement tool
  •Just how fast is it?
       •Table – PS_FH_MISC_TRANS
            •Size: 463 bytes
            •Columns: 27
            •Rows: 2,595,394
            •10mins 32sec to “unload to..”
            •4mins 31secs to HPL!

December 8,9 2005                                     10
Speed and Flexibility – cont.

• Other really fast examples
    –   Baseline speed of HPL is 2GB/CPU/HR
    –   Some users have seen 5 – 10GB!
    –   How big are the rows
    –   What kind of machine
    –   Using Light Scans




December 8,9 2005                             11
 The Onpload Database

  • Starting HPL and what occurs under the covers
       • Environment variables
            • “normal” Informix – INFORMIXDIR, etc
            • $DISPLAY
       • At the command prompt enter, “ipload”
  • Phooey on GUI!
       • Your desktop can get busy!
       • Just watch….
December 8,9 2005                                    12
IPload




December 8,9 2005   13
Unload Job




December 8,9 2005   14
Load Job




December 8,9 2005   15
Busy Desktop




December 8,9 2005   16
Busy Desktop




December 8,9 2005   17
Busy Desktop




December 8,9 2005   18
Busy Desktop




December 8,9 2005   19
Busy Desktop




December 8,9 2005   20
The Onpload database

• The major tables of onpload database
    –   Session - Holds the parameters that onpload uses to invoke a job
    –   Device - Holds information about the device array
    –   Formats - Holds basic information for a record format
    –   Formatitem – Defines the data/file records
    –   Maps – Defines record-to-table mapping
    –   Mapitem – Defines the relationship between columns and records
    –   Others




December 8,9 2005                                                          21
The Onpload utility

 •Clicky-clinky-Clack! No more!
 •Command line switches
 •Some quick examples




December 8,9 2005                 22
Simple Script
 Usage: clhpl.sh
  {-fu | -fl} -j jobname -D dbname -t tablename {-d file | -f filename}
  [-l logfile] [-R rejectfile] [-e maxerrors] [-debug]

   -fu           Create an unload job
   -fl           Create a load job
   -j jobname    Name of the job to create
   -D dbname     Database name
   -t tablename  Table to load/unload
   -d file       File to use for load/unload
   -f filename   A file containing a list of files to use for the load/unload.
                 The format is one file per line
   -l logfile    Pload log file
   -R rejectfile Pload reject file
   -e maxerrors Maximum errors before pload stops
   -debug        Enable script debugging output

 Create a pload load/unload job by populating the onpload database.

December 8,9 2005                                                                23
Simple Script
Simple shell script to populate the onpload database and create load/unload
jobs without using the ipload gui. Quite a few assumptions are made:

    1. All loading is done from ASCII files.

    2. Files have the usual pipe delimited format.

    3. All columns in the table are used and the file format matches the
       order of the columns in the table. (No mapping)

    4. The map, format, and device names are all the same as the job name.

    5. No attempt is made to share the device array between a load and unload
       job for the same table.

    6. While some error checking is performed, none is attempted for the
       inserts into the onpload database.


December 8,9 2005                                                               24
Simple Script

• CLHPL.sh does:
    – Inserts data into device table

    – Inserts data into formatitem table

    – Inserts data into formats table

    – Inserts data into mapitem table

    – Inserts data into maps table

    – Inserts data into query table

    – Inserts data into session table

• Basically everything you need to run an HPL job
December 8,9 2005                                   25
Populate Onpload database
•   ###############################################################
•   # populate_hpldb
•   # Create all unload jobs in onpload database
•   #
•   ###############################################################
•   populate_hpldb()
•   {
•   while read tablename num_of_files
•        do
•        jobname=$tablename
•        unloadfilenames=$IPLOADDIR/ipload_filelists/$tablename
•        logfile=$IPLOADDIR/logs/$tablename.log
•        rlogfile=$IPLOADDIR/rlogs/$tablename.rlog

•       ./clhpl.sh -fu -j $jobname -D $DATABASE -t $tablename -f $unloadfilenames -l $logfile -R rlogfile -e 0

•       done < $IPLOADDIR/$INPUTFILE
•   }




December 8,9 2005                                                                                                26
Run the jobs!
•  ###############################################################
•  # run_unloads
•  # Run all unload jobs in batches based on $NUMTREADS
•  #
•  ###############################################################
•  run_unloads()
•  {
•  integer jobcount=0
•  while read jobname num_of_files
•       do
•       jobcount=`jobs|wc -l`
•       while [[ $jobcount -ge $NUMTREADS ]]
•             do
•             sleep 1
•             jobcount=`jobs|wc -l`
•             done
•       echo "starting unload job for $jobname"
•       onpload -j $jobname -fu > /tmp/onpunloads.out 2>&1 &
•       done < $IPLOADDIR/$INPUTFILE
•  echo "waiting for all current running unloads to complete"
•  wait
•  }
December 8,9 2005                                                    27
Management of HPL jobs

 •Modes of operation
 •Violations
 •Performance considerations




December 8,9 2005              28
Modes of operation

• Deluxe Mode
   – Referential and constraint checking
   – Load while users are working
   – Users can access during a load
   – Updates indexes
   – Evaluates triggers
• Limitations:
   – No loads without conversion



December 8,9 2005                          29
Modes of operation

• Express Mode
   – Locks tables for exclusive mode during loads
   – Disables referential and constraint checking
   – Requires a level-0 backup after load ==
   – Supports loading of raw tables
   – Reenables all objects after loading
• Limitations (a lot):
   – Logging mode or ANSI mode
   – Tables that contain smart large objects
   – Tables that contain simple large objects
   – Rows larger than the system page size
   – more

December 8,9 2005                                   30
Violations

• Records that don’t meet the load criteria
    – Null values where tables specifies NOT NULL
    – Char’s in numeric fields
    – Records that do not match number of columns
• Rejected record from the input file
    – Filter rejects
    – Records that can’t be converted
• Constraint violations




December 8,9 2005                                   31
Performance considerations

•   Configuration-parameter values
•   Mode
•   Devices
•   Usage models




December 8,9 2005                    32
Logs, messages and How-to’s
•   Mon Dec 5 13:39:09 2005
•   SHMBASE         0x76a4c000
•   CLIENTNUM         0x49010000
•   Session ID 2
•   Unload Database -> fsqa8
•   Query Name       -> misc_trans
•   Device Array -> misc_trans
•   Query Mapping -> misc_trans
•   Query        -> select * from psoft8.ps_fh_misc_trans for read only
•   Convert Reject -> /tmp/misc_trans.rej
•   13:39:11 Records Processed -> 10000
•   13:39:12 Records Processed -> 20000
•   13:39:13 Records Processed -> 30000
•   13:39:14 Records Processed -> 40000

•   …many more lines

•   13:44:00 Records Processed ->    2560000
•   13:44:01 Records Processed ->    2570000
•   13:44:02 Records Processed ->    2580000
•   13:44:03 Records Processed ->    2590000

•   Database Unload Completed -- Unloaded 2595394 Records Detected 0 Errors
•   Mon Dec 5 13:44:05 2005


•   Reject log is empty


December 8,9 2005                                                             33
From our experience..

• Leave no stranded onpload threads behinds!

bob   7498   13   0 14:02:15 ?   0:00 /opt/ifmx/bin/onpload -Shp_fs_qa8_tcp -r4 -fb
bob   7496   13   0 14:02:15 ?   0:00 /opt/ifmx/bin/onpload -Shp_fs_qa8_tcp -r4 -fb
bob   7522   13   0 14:02:18 ?   0:00 /opt/ifmx/bin/onpload -Shp_fs_qa8_tcp -r4 -fb
bob   7497   13   0 14:02:15 ?   0:00 /opt/ifmx/bin/onpload -Shp_fs_qa8_tcp -r4 -fb
bob   7523   13   0 14:02:18 ?   0:00 /opt/ifmx/bin/onpload -Shp_fs_qa8_tcp -r4 -fb
bob   7499   13   0 14:02:16 ?   0:00 /opt/ifmx/bin/onpload -Shp_fs_qa8_tcp -r4 -fb
bob     13    1   0 14:02:15 ?   0:00 /opt/ifmx/bin/onpload -Shp_fs_qa8_tcp -r4 -fb
bob   7519   13   0 14:02:18 ?   0:00 /opt/ifmx/bin/onpload -Shp_fs_qa8_tcp -r4 –fb

•     It’s best to start with a clean onpload database
•     Watch nfiles (HP-UX) and memory
•     Don’t forget that level-0 on express loads!
•     Reindex outside of HPL
December 8,9 2005                                                                     34
References and other examples

• Search the IIUG Software repository for “onpload”
    – Several flavors of command line HPL like myonpload, ESQL/C
      examples and “create_pload.sh”


• ONPLADM
    – IBM.COM Developerworks for techarticle on onpladm




December 8,9 2005                                                  35
Informix User Forum 2005          Moving Forward With Informix




                          Questions?

                          Jerry Hamilton
       Director of Database Administration
              Fleishman-Hillard, Inc




       Atlanta, Georgia                    December 8-9, 2005
Informix User Forum 2005          Moving Forward With Informix




               Phooey on GUI!
   Using HPL from the command line
                          Jerry Hamilton
       Director of Database Administration
              Fleishman-Hillard, Inc




       Atlanta, Georgia                    December 8-9, 2005