Docstoc

Teradata Utilities MultiLoad

Document Sample
Teradata Utilities MultiLoad Powered By Docstoc
					Teradata Utilities: MultiLoad

Reprinted for KV Satish Kumar, IBM
kvskumar@in.ibm.com

Reprinted with permission as a subscription benefit of Books24x7,
http://www.books24x7.com/
                                                                                                                                                i


                                              Table of Contents
Chapter 4: Multiload........................................................................................................................1
      Why it is Called "Multi" Load..................................................................................................1
          Two MultiLoad Modes: IMPORT and DELETE................................................................1
          Block and Tackle Approach.............................................................................................2
          MultiLoad Imposes Limits         .................................................................................................3
      Error Tables, Work Tables and Log Tables...........................................................................3
      Supported Input Formats.......................................................................................................4
      MultiLoad Has Five IMPORT Phases....................................................................................5
          Phase 1: Preliminary Phase           .............................................................................................5
          Phase 2: DML Transaction Phase...................................................................................6
          Phase 3: Acquisition Phase.............................................................................................6
          Phase 4: Application Phase.............................................................................................7
          Phase 5: Clean Up Phase           ................................................................................................7
      MultiLoad Commands............................................................................................................8
          Two Types of Commands................................................................................................8
      Parameters for .BEGIN IMPORT MLOAD.............................................................................9
      Parameters for .BEGIN DELETE MLOAD...........................................................................12
      A Simple Multiload IMPORT Script......................................................................................12
      Building our Multiload Script................................................................................................13
      Executing Multiload..............................................................................................................14
                                                               .
      Another Simple MultiLoad IMPORT Script ..........................................................................15
      MultiLoad IMPORT Script....................................................................................................18
      Error Treatment Options for the .DML LABEL Command                              ....................................................19
      An IMPORT Script with Error Treatment Options................................................................21
      A IMPORT Script that Uses Two Input Data Files...............................................................22
      Redefining the INPUT..........................................................................................................24
      A Script that Uses Redefining the Input...............................................................................24
      DELETE MLOAD Script Using a Hard Coded Value...........................................................26
      A DELETE MLOAD Script Using a Variable........................................................................27
                                        .
      An UPSERT Sample Script.................................................................................................28
      What Happens when MultiLoad Finishes                   .............................................................................29
          MultiLoad Statistics........................................................................................................29
      Troubleshooting Multiload Errors.........................................................................................30
      RESTARTing Multiload........................................................................................................31
      RELEASE MLOAD: When You DON'T Want to Restart MultiLoad.....................................31
      MultiLoad and INMODs        ........................................................................................................32
      How Multiload Compares with FastLoad.............................................................................32
 Chapter 4: Multiload
         "In the end we'll remember not the sound of our enemies, but the silence of our
         friends."
         - Martin Luther King Jr.


Why it is Called "Multi" Load
If we were going to be stranded on an island with a Teradata Data Warehouse and we could only
take along one Teradata load utility, clearly, MultiLoad would be our choice. MultiLoad has the
capability to load multiple tables at one time from either a LAN or Channel environment. This is in
stark contrast to its fleet-footed cousin, FastLoad, which can only load one table at a time. And it
gets better, yet!

This feature rich utility can perform multiple types of DML tasks, including INSERT, UPDATE,
DELETE and UPSERT on up to five (5) empty or populated target tables at a time. These DML
functions may be run either solo or in combinations, against one or more tables. For these reasons,
MultiLoad is the utility of choice when it comes to loading populated tables in the batch environment.
As the volume of data being loaded or updated in a single block, the performance of MultiLoad
improves. MultiLoad shines when it can impact more than one row in every data block. In other
words, MultiLoad looks at massive amounts of data and says, "Bring it on!"

Leo Tolstoy once said, "All happy families resemble each other." Like happy families, the Teradata
load utilities resemble each other, although they may have some differences. You are going to be
pleased to find that you do not have to learn all new commands and concepts for each load utility.
MultiLoad has many similarities to FastLoad. It has even more commands in common with TPump.
The similarities will be evident as you work with them. Where there are some quirky differences, we
will point them out for you.

Two MultiLoad Modes: IMPORT and DELETE
MultiLoad provides two types of operations via modes: IMPORT and DELETE. In MultiLoad
IMPORT mode, you have the freedom to "mix and match" up to twenty (20) INSERTs, UPDATEs or
DELETEs on up to five target tables. The execution of the DML statements is not mandatory for all
rows in a table. Instead, their execution hinges upon the conditions contained in the APPLY clause
of the script. Once again, MultiLoad demonstrates its user-friendly flexibility. For UPDATEs or
DELETEs to be successful in IMPORT mode, they must reference the Primary Index in the WHERE
clause.

The MultiLoad DELETE mode is used to perform a global (all AMP) delete on just one table. The
reason to use .BEGIN DELETE MLOAD is that it bypasses the Transient Journal (TJ) and can be
RESTARTed if an error causes it to terminate prior to finishing. When performing in DELETE mode,
the DELETE SQL statement cannot reference the Primary Index in the WHERE clause. This due to
the fact that a primary index access is to a specific AMP; this is a global operation.

The other factor that makes a DELETE mode operation so good is that it examines an entire block
of rows at a time. Once all the eligible rows have been removed, the block is written one time and a
checkpoint is written. So, if a restart is necessary, it simply starts deleting rows from the next block
without a checkpoint. This is a smart way to continue. Remember, when using the TJ all deleted
rows are put back into the table from the TJ as a rollback. A rollback can take longer to finish then
the delete. MultiLoad does not do a rollback; it does a restart.




Reprinted for ibmkvskumar@in.ibm.com, IBM      Coffing Data Warehousing, Coffing Publishing (c) 2005, Copying Prohibited
Teradata Utilities: BTEQ, FastLoad, MultiLoad, TPump, and FastExport, Second Edition                                              2




In the above diagram, monthly data is being stored in a quarterly table. To keep the contents limited
to four months, monthly data is rotated in and out. At the end of every month, the oldest month of
data is removed and the new month is added. The cycle is "add a month, delete a month, add a
month, delete a month." In our illustration, that means that January data must be deleted to make
room for May's data.


Here is a question for you: What if there was another way to accomplish this same goal without
consuming all of these extra resources? To illustrate, let's consider the following scenario: Suppose
you have TableA that contains 12 billion rows. You want to delete a range of rows based on a date
and then load in fresh data to replace these rows. Normally, the process is to perform a MultiLoad
DELETE to DELETE FROM TableA WHERE <date-column> < '2002-02-01'. The final step would be
to INSERT the new rows for May using MultiLoad IMPORT.

Block and Tackle Approach
MultiLoad never loses sight of the fact that it is designed for functionality, speed, and the ability to
restart. It tackles the proverbial I/O bottleneck problem like FastLoad by assembling data rows into
64K blocks and writing them to disk on the AMPs. This is much faster than writing data one row at a
time like BTEQ. Fallback table rows are written after the base table has been loaded. This allows
users to access the base table immediately upon completion of the MultiLoad while fallback rows
are being loaded in the background. The benefit is reduced time to access the data.

Amazingly, MultiLoad has full RESTART capability in all of its five phases of operation. Once again,
this demonstrates its tremendous flexibility as a load utility. Is it pure magic? No, but it almost
seems so. MultiLoad makes effective use of two error tables to save different types of errors and a
LOGTABLE that stores built-in checkpoint information for restarting. This is why MultiLoad does not
use the Transient Journal, thus averting time-consuming rollbacks when a job halts prematurely.

Here is a key difference to note between MultiLoad and FastLoad. Sometimes an AMP (Access
Module Processor) fails and the system administrators say that the AMP is "down" or "offline."
When using FastLoad, you must restart the AMP to restart the job. MultiLoad, however, can
continue running when an AMP fails, if the table is fallback protected. As the same time, you can
use the AMPCHECK option to make it work like FastLoad if you want.


Reprinted for ibmkvskumar@in.ibm.com, IBM                  Coffing Data Warehousing, Coffing Publishing (c) 2005, Copying Prohibited
Teradata Utilities: BTEQ, FastLoad, MultiLoad, TPump, and FastExport, Second Edition                                              3

MultiLoad Imposes Limits
Rule #1: Unique Secondary Indexes are not supported on a Target Table. Like FastLoad,
MultiLoad does not support Unique Secondary Indexes (USIs). But unlike FastLoad, it does support
the use of Non-Unique Secondary Indexes (NUSIs) because the index subtable row is on the same
AMP as the data row. MultiLoad uses every AMP independently and in parallel. If two AMPs must
communicate, they are not independent. Therefore, a NUSI (same AMP) is fine, but a USI (different
AMP) is not.

Rule #2: Referential Integrity is not supported. MultiLoad will not load data into tables that are
defined with Referential Integrity (RI). Like a USI, this requires the AMPs to communicate with each
other. So, RI constraints must be dropped from the target table prior to using MultiLoad.

Rule #3: Triggers are not supported at load time. Triggers cause actions on related tables based
upon what happens in a target table. Again, this is a multi-AMP operation and to a different table. To
keep MultiLoad running smoothly, disable all Triggers prior to using it.

Rule #4: No concatenation of input files is allowed. MultiLoad does not want you to do this
because it could impact are restart if the files were concatenated in a different sequence or data
was deleted between runs.

Rule #5: The host will not process aggregates, arithmetic functions or exponentiation. If you
need data conversions or math, you might be better off using an INMOD to prepare the data prior to
loading it.



 Error Tables, Work Tables and Log Tables
Besides target table(s), MultiLoad requires the use of four special tables in order to function. They
consist of two error tables (per target table), one worktable (per target table), and one log table. In
essence, the Error Tables will be used to store any conversion, constraint or uniqueness violations
during a load. Work Tables are used to receive and sort data and SQL on each AMP prior to
storing them permanently to disk. A Log Table (also called, "Logtable") is used to store successful
checkpoints during load processing in case a RESTART is needed.

HINT: Sometimes a company wants all of these load support tables to be housed in a particular
database. When these tables are to be stored in any database other than the user's own default
database, then you must give them a qualified name (<databasename>.<tablename>) in the script
or use the DATABASE command to change the current database.
Where will you find these tables in the load script? The Logtable is generally identified immediately
prior to the .LOGON command. Worktables and error tables can be named in the BEGIN MLOAD
statement. Do not underestimate the value of these tables. They are vital to the operation of
MultiLoad. Without them a MultiLoad job can not run. Now that you have had the "executive
summary", let's look at each type of table individually.

Two Error Tables: Here is another place where FastLoad and MultiLoad are similar. Both require
the use of two error tables per target table. MultiLoad will automatically create these tables. Rows
are inserted into these tables only when errors occur during the load process. The first error table is
the acquisition Error Table (ET). It contains all translation and constraint errors that may occur
while the data is being acquired from the source(s).

The second is the Uniqueness Violation (UV) table that stores rows with duplicate values for
Unique Primary Indexes (UPI). Since a UPI must be unique, MultiLoad can only load one
occurrence into a table. Any duplicate value will be stored in the UV error table. For example, you
might see a UPI error that shows a second employee number "99." In this case, if the name for
employee "99" is Kara Morgan, you will be glad that the row did not load since Kara Morgan is
already in the Employee table. However, if the name showed up as David Jackson, then you know
that further investigation is needed, because employee numbers must be unique.


Reprinted for ibmkvskumar@in.ibm.com, IBM                  Coffing Data Warehousing, Coffing Publishing (c) 2005, Copying Prohibited
Teradata Utilities: BTEQ, FastLoad, MultiLoad, TPump, and FastExport, Second Edition                                              4

Each error table does the following:

       • Identifies errors


       • Provides some detail about the errors


       • Stores the actual offending row for debugging



You have the option to name these tables in the MultiLoad script (shown later). Alternatively, if you
do not name them, they default to ET_<target_table_name> and UV_<target_table_name>. In
either case, MultiLoad will not accept error table names that are the same as target table names. It
does not matter what you name them. It is recommended that you standardize on the naming
convention to make it easier for everyone on your team. For more details on how these error tables
can help you, see the subsection in this chapter titled, "Troubleshooting MultiLoad Errors."

Log Table: MultiLoad requires a LOGTABLE. This table keeps a record of the results from each
phase of the load so that MultiLoad knows the proper point from which to RESTART. There is one
LOGTABLE for each run. Since MultiLoad will not resubmit a command that has been run
previously, it will use the LOGTABLE to determine the last successfully completed step.

Work Table(s): MultiLoad will automatically create one worktable for each target table. This means
that in IMPORT mode you could have one or more worktables. In the DELETE mode, you will only
have one worktable since that mode only works on one target table. The purpose of worktables is to
hold two things:

      1. The Data Manipulation Language (DML) tasks


      2. The input data that is ready to APPLY to the AMPs



The worktables are created in a database using PERM space. They can become very large. If the
script uses multiple SQL statements for a single data record, the data is sent to the AMP once for
each SQL statement. This replication guarantees fast performance and that no SQL statement will
ever be done more than once. So, this is very important. However, there is no such thing as a free
lunch, the cost is space. Later, you will see that using a FILLER field can help reduce this disk
space by not sending unneeded data to an AMP. In other words, the efficiency of the MultiLoad run
is in your hands.



 Supported Input Formats
Data input files come in a variety of formats but MultiLoad is flexible enough to handle many of
them. MultiLoad supports the following five format options: BINARY, FASTLOAD, TEXT,
UNFORMAT and VARTEXT.




Reprinted for ibmkvskumar@in.ibm.com, IBM                  Coffing Data Warehousing, Coffing Publishing (c) 2005, Copying Prohibited
Teradata Utilities: BTEQ, FastLoad, MultiLoad, TPump, and FastExport, Second Edition                                              5

BINARY   Each record is a 2-byte integer, n, that is followed by n bytes of data. A byte is the
         smallest means of storage of for Teradata.
FASTLOAD This format is the same as Binary, plus a marker (X '0A' or X '0D') that specifies the
         end of the record.
TEXT     Each record has a random number of bytes and is followed by an end of the record
         marker.
UNFORMAT The format for these input records is defined in the LAYOUT statement of the
         MultiLoad script using the components FIELD, FILLER and TABLE.
VARTEXT This is variable length text RECORD format separated by delimiters such as a
         comma. For this format you may only use VARCHAR, LONG VARCHAR (IBM) or
         VARBYTE data formats in your MultiLoad LAYOUT. Note that two delimiter
         characters in a row will result in a null value between them.


Figure 5-1


 MultiLoad Has Five IMPORT Phases
MultiLoad IMPORT has five phases, but don't be fazed by this! Here is the short list:

       • Phase 1: Preliminary Phase


       • Phase 2: DML Transaction Phase


       • Phase 3: Acquisition Phase


       • Phase 4: Application Phase


       • Phase 5: Cleanup Phase



Let's take a look at each phase and see what it contributes to the overall load process of this
magnificent utility. Should you memorize every detail about each phase? Probably not. But it is
important to know the essence of each phase because sometimes a load fails. When it does, you
need to know in which phase it broke down since the method for fixing the error to RESTART may
vary depending on the phase. And if you can picture what MultiLoad actually does in each phase,
you will likely write better scripts that run more efficiently.


Phase 1: Preliminary Phase
The ancient oriental proverb says, "Measure one thousand times; Cut once." MultiLoad uses Phase
1 to conduct several preliminary set-up activities whose goal is to provide a smooth and successful
climate for running your load. The first task is to be sure that the SQL syntax and MultiLoad
commands are valid. After all, why try to run a script when the system will just find out during the
load process that the statements are not useable? MultiLoad knows that it is much better to identify
any syntax errors, right up front. All the preliminary steps are automated. No user intervention is
required in this phase.

Second, all MultiLoad sessions with Teradata need to be established. The default is the number of
available AMPs. Teradata will quickly establish this number as a factor of 16 for the basis regarding
the number of sessions to create. The general rule of thumb for the number of sessions to use for
smaller systems is the following: use the number of AMPs plus two more. For larger systems with
hundreds of AMP processors, the SESSIONS option is available to lower the default. Remember,

Reprinted for ibmkvskumar@in.ibm.com, IBM                  Coffing Data Warehousing, Coffing Publishing (c) 2005, Copying Prohibited
Teradata Utilities: BTEQ, FastLoad, MultiLoad, TPump, and FastExport, Second Edition                                              6

these sessions are running on your poor little computer as well as on Teradata.

Each session loads the data to Teradata across the network or channel. Every AMP plays an
essential role in the MultiLoad process. They receive the data blocks, hash each row and send the
rows to the correct AMP. When the rows come to an AMP, it stores them in worktable blocks on
disk. But, lest we get ahead of ourselves, suffice it to say that there is ample reason for multiple
sessions to be established.

What about the extra two sessions? Well, the first one is a control session to handle the SQL and
logging. The second is a back up or alternate for logging. You may have to use some trial and error
to find what works best on your system configuration. If you specify too few sessions it may impair
performance and increase the time it takes to complete load jobs. On the other hand, too many
sessions will reduce the resources available for other important database activities.

         Third, the required support tables are created. They are the following:
         Type of Table Table Details
         ERRORTABLES MultiLoad requires two error tables per target table. The first error
                            table contains constraint violations, while the second error table
                            stores Unique Primary Index violations.
         WORKTABLES Work Tables hold two things: the DML tasks requested and the
                            input data that is ready to APPLY to the AMPs.
         LOGTABLE           The LOGTABLE keeps a record of the results from each phase of
                            the load so that MultiLoad knows the proper point from which to
                            RESTART.



Figure 5-2

The final task of the Preliminary Phase is to apply utility locks to the target tables. Initially, access
locks are placed on all target tables, allowing other users to read or write to the table for the time
being. However, this lock does prevent the opportunity for a user to request an exclusive lock.
Although, these locks will still allow the MultiLoad user to drop the table, no one else may DROP or
ALTER a target table while it is locked for loading. This leads us to Phase 2.

Phase 2: DML Transaction Phase
In Phase 2, all of the SQL Data Manipulation Language (DML) statements are sent ahead to
Teradata. MultiLoad allows the use of multiple DML functions. Teradata's Parsing Engine (PE)
parses the DML and generates a step-by-step plan to execute the request. This execution plan is
then communicated to each AMP and stored in the appropriate worktable for each target table. In
other words, each AMP is going to work off the same page.

Later, during the Acquisition phase the actual input data will also be stored in the worktable so that it
may be applied in Phase 4, the Application Phase. Next, a match tag is assigned to each DML
request that will match it with the appropriate rows of input data. The match tags will not actually be
used until the data has already been acquired and is about to be applied to the worktable. This is
somewhat like a student who receives a letter from the university in the summer that lists his
courses, professor's names, and classroom locations for the upcoming semester. The letter is a
"match tag" for the student to his school schedule, although it will not be used for several months.
This matching tag for SQL and data is the reason that the data is replicated for each SQL statement
using the same data record.

Phase 3: Acquisition Phase
With the proper set-up complete and the PE's plan stored on each AMP, MultiLoad is now ready to
receive the INPUT data. This is where it gets interesting! MultiLoad now acquires the data in large,
unsorted 64K blocks from the host and sends it to the AMPs.



Reprinted for ibmkvskumar@in.ibm.com, IBM                  Coffing Data Warehousing, Coffing Publishing (c) 2005, Copying Prohibited
Teradata Utilities: BTEQ, FastLoad, MultiLoad, TPump, and FastExport, Second Edition                                              7

At this point, Teradata does not care about which AMP receives the data block. The blocks are
simply sent, one after the other, to the next AMP in line. For their part, each AMP begins to deal
with the blocks that they have been dealt. It is like a game of cards - you take the cards that you
have received and then play the game. You want to keep some and give some away.

Similarly, the AMPs will keep some data rows from the blocks and give some away. The AMP
hashes each row on the primary index and sends it over the BYNET to the proper AMP where it will
ultimately be used. But the row does not get inserted into its target table, just yet. The receiving
AMP must first do some preparation before that happens. Don't you have to get ready before
company arrives at your house? The AMP puts all of the hashed rows it has received from other
AMPs into the worktables where it assembles them into the SQL. Why? Because once the rows are
reblocked, they can be sorted into the proper order for storage in the target table. Now the utility
places a load lock on each target table in preparation for the Application Phase. Of course, there is
no Acquisition Phase when you perform a MultiLoad DELETE task, since no data is being acquired.

Phase 4: Application Phase
The purpose of this phase is to write, or APPLY, the specified changes to both the target tables and
NUSI subtables. Once the data is on the AMPs, it is married up to the SQL for execution. To
accomplish this substitution of data into SQL, when sending the data, the host has already attached
some sequence information and five (5) match tags to each data row. Those match tags are used to
join the data with the proper SQL statement based on the SQL statement within a DMP label. In
addition to associating each row with the correct DML statement, match tags also guarantee that no
row will be updated more than once, even when a RESTART occurs.

         The following five columns are the matching tags:
                                          MATCHING TAGS
         ImportSeq Sequence number that identifies the IMPORT command where the error
                     occurred
         DMLSeq Sequence number for the DML statement involved with the error
         SMTSeq Sequence number of the DML statement being carried out when the
                     error was discovered
         ApplySeq Sequence number that tells which APPLY clause was running when the
                     error occurred
         SourceSeq The number of the data row in the client file that was being built when
                     the error took place



Figure 5-3
Remember, MultiLoad allows for the existence of NUSI processing during a load. Every
hash-sequence sorted block from Phase 3 and each block of the base table is read only once to
reduce I/O operations to gain speed. Then, all matching rows in the base block are inserted,
updated or deleted before the entire block is written back to disk, one time. This is why the match
tags are so important. Changes are made based upon corresponding data and DML (SQL) based
on the match tags. They guarantee that the correct operation is performed for the rows and blocks
with no duplicate operations, a block at a time. And each time a table block is written to disk
successfully, a record is inserted into the LOGTABLE. This permits MultiLoad to avoid starting again
from the very beginning if a RESTART is needed.

What happens when several tables are being updated simultaneously? In this case, all of the
updates are scripted as a multi-statement request. That means that Teradata views them as a
single transaction. If there is a failure at any point of the load process, MultiLoad will merely need to
be RESTARTed from the point where it failed. No rollback is required. Any errors will be written to
the proper error table.

Phase 5: Clean Up Phase
Those of you reading these paragraphs that have young children or teenagers will certainly
appreciate this final phase! MultiLoad actually cleans up after itself. The utility looks at the final Error
Reprinted for ibmkvskumar@in.ibm.com, IBM                  Coffing Data Warehousing, Coffing Publishing (c) 2005, Copying Prohibited
Teradata Utilities: BTEQ, FastLoad, MultiLoad, TPump, and FastExport, Second Edition                                              8

Code (&SYSRC). MultiLoad believes the adage, "All is well that ends well." If the last error code is
zero (0), all of the job steps have ended successfully (i.e., all has certainly ended well). This being
the case, all empty error tables, worktables and the log table are dropped. All locks, both Teradata
and MultiLoad, are released. The statistics for the job are generated for output (SYSPRINT) and the
system count variables are set. After this, each MultiLoad session is logged off. So what happens if
the final error code is not zero? Stay tuned. Restarting MultiLoad is a topic that will be covered later
in this chapter.



 MultiLoad Commands
Two Types of Commands
You may see two types of commands in MultiLoad scripts: tasks and support functions. MultiLoad
tasks are commands that are used by the MultiLoad utility for specific individual steps as it
processes a load. Support functions are those commands that involve the Teradata utility Support
Environment (covered in Chapter 9), are used to set parameters, or are helpful for monitoring a
load.

         The chart below lists the key commands, their type, and what they do.
             MLOAD
            Command              Type            What does the MLOAD Command do?
                            Support       This command communicates directly with Teradata
         .BEGIN                           to specify if the MultiLoad mode is going to be
         [IMPORT]                         IMPORT or DELETE. Note that the word IMPORT
         MLOAD
                                          is optional in the syntax because it is the
                                          DEFAULT, but DELETE is required. We
         .BEGIN                           recommend using the word IMPORT to make the
         DELETE                           coding consistent and easier for others to read. Any
         MLOAD                            parameters for the load, such as error limits or
                                          checkpoints will be included under the .BEGIN
                                          command, too. It is important to know which
                                          commands or parameters are optional ince, if you
                                          do not include them, MultiLoad may supply defaults
                                          that may impact your load.
                            Task          The DML LABEL defines treatment options and
         .DML LABEL                       labels for the application (APPLY) of data for the
                                          INSERT, UPDATE, UPSERT and DELETE
                                          operations. A LABEL is simply a name for a
                                          requested SQL activity. The LABEL is defined first,
                                          and then referenced later in the APPLY clause.
                            Task          This instructs MultiLoad to finish the APPLY
         .END MLOAD                       operations with the changes to the designated
                                          databases and tables.
                            Task          This defines a column of the data source record that
         .FIELD                           will be sent to the Teradata database via SQL.
                                          When writing the script, you must include a FIELD
                                          for each data field you need in SQL. This command
                                          is used with the LAYOUT command.
                            Task          Do not assume that MultiLoad has somehow
         .FILLER                          uncovered much of what you used in your term
                                          papers at the university! FILLER defines a field that
                                          is accounted for as part of the data source's row
                                          format, but is not sent to the Teradata DBS. It is
                                          used with the LAYOUT command.
                            Task          LAYOUT defines the format of the INPUT DATA
         .LAYOUT                          record so Teradata knows what to expect. If one
                                          record is not large enough, you can concatenate

Reprinted for ibmkvskumar@in.ibm.com, IBM                  Coffing Data Warehousing, Coffing Publishing (c) 2005, Copying Prohibited
Teradata Utilities: BTEQ, FastLoad, MultiLoad, TPump, and FastExport, Second Edition                                              9

                                                    multiple data records by using the LAYOUT
                                                    parameter CONTINUEIF to tell which value to
                                                    perform for the concatenation. Another option is
                                                    INDICATORS, which is used to represent nulls by
                                                    using the bitmap (1 bit per field) at the front of the
                                                    data record.
                                  Support           This specifies the username or LOGON string that
          .LOGON                                    will establish sessions for MultiLoad with Teradata.
                                  Support           This support command names the name of the
          .LOGTABLE                                 Restart Log that will be used for storing
                                                    CHECKPOINT data pertaining to a load. The
                                                    LOGTABLE is then used to tell MultiLoad where to
                                                    RESTART, should that be necessary. It is
                                                    recommended that this command be placed before
                                                    the .LOGON command.
                                  Support           This command terminates any sessions established
          .LOGOFF                                   by the LOGON command.
                                  Task              This command defines the INPUT DATA FILE, file
          .IMPORT                                   type, file usage, the LAYOUT to use and where to
                                                    APPLY the data to SQL.
                                  Support           Optionally, you can SET utility variables. An
          .SET                                      example would be {.SET DBName TO 'CDW_Test'}.

                                  Support           This interrupts the operation of MultiLoad in order to
          .SYSTEM                                   issue commands to the local operating system.
                                  Task              This is a command that may be used with the
          .TABLE                                    .LAYOUT command. It identifies a table whose
                                                    columns (both their order and data types) are to be
                                                    used as the field names and data descriptions of the
                                                    data source records.



Figure 5-4


 Parameters for .BEGIN IMPORT MLOAD
Here is a list of components or parameters that may be used in the .BEGIN IMPORT command.
Note: The parameters do not require the usual dot prior to the command since they are actually
sub-commands.




Reprinted for ibmkvskumar@in.ibm.com, IBM                  Coffing Data Warehousing, Coffing Publishing (c) 2005, Copying Prohibited
Teradata Utilities: BTEQ, FastLoad, MultiLoad, TPump, and FastExport, Second Edition                                             10

                                                                               REQUIRED
                                   PARAMETER                                    OR NOT    WHAT IT DOES
                                                                               Optional NONE specifies
          AMPCHECK                                                                      that MLOAD starts
            {NONE|APPLY|ALL}                                                            even with one
                                                                                        down AMP per
                                                                                        cluster if all tables
                                                                                        are Fallback.

                                                                                               APPLY
                                                                                               (DEFAULT)
                                                                                               specifies MLOAD
                                                                                               will not start or
                                                                                               finish Phase 4 with
                                                                                               a down AMP.

                                                                                               ALL specifies not
                                                                                               to proceed if any
                                                                                               AMPs are down,
                                                                                               just like FastLoad.
                                                                               Optional        Short for Access
          AXSMOD                                                                               Module, this
                                                                                               command specifies
                                                                                               input protocol like
                                                                                               OLE-DB or reading
                                                                                               a tape from REEL
                                                                                               Librarian. This
                                                                                               parameter is for
                                                                                               network-attached
                                                                                               systems only.
                                                                                               When used, it
                                                                                               must precede the
                                                                                               DEFINE command
                                                                                               in the script.
                                                                               Optional        You have two
          CHECKPOINT                                                                           options:
                                                                                               CHECKPOINT
                                                                                               refers to the
                                                                                               number of
                                                                                               minutes, or
                                                                                               frequency, at
                                                                                               which you wish a
                                                                                               CHECKPOINT to
                                                                                               occur if the
                                                                                               number is 60 or
                                                                                               less. If the number
                                                                                               is greater than 60,
                                                                                               it designates the
                                                                                               number of rows at
                                                                                               which you want the
                                                                                               CHECKPOINT to
                                                                                               occur. This
                                                                                               command is NOT
                                                                                               valid in DELETE
                                                                                               mode.
                                                                               Optional        You may specify
          ERRLIMIT errcount [errpercent]                                                       the maximum
                                                                                               number of errors,
                                                                                               or the percentage,
                                                                                               that you will
                                                                                               tolerate during the
Reprinted for ibmkvskumar@in.ibm.com, IBM                  Coffing Data Warehousing, Coffing Publishing (c) 2005, Copying Prohibited
Teradata Utilities: BTEQ, FastLoad, MultiLoad, TPump, and FastExport, Second Edition                                             11

                                                                                               processing of a
                                                                                               load job.
                                                                               Optional        Names the two
          ERRORTABLES                                                                          error tables, two
            ET_ERR UV_ERR                                                                      per target table.
                                                                                               Note there is no
                                                                                               comma separator.
                                                                               Optional        If you opt to use
          NOTIFY                                                                               NOTIFY for a any
          {LOW|MEDIUM|HIGH|OFF                                                                 event during a
                                                                                               load, you may
                                                                                               designate the
                                                                                               priority of that
                                                                                               notification:

                                                                                               LOW for level
                                                                                               events,

                                                                                               MEDIUM for
                                                                                               important events,
                                                                                               HIGH for events at
                                                                                               operational
                                                                                               decision points,
                                                                                               and OFF to
                                                                                               eliminate any
                                                                                               notification at all
                                                                                               for a given phase.
                                                                               Optional        This refers to the
          SESSIONS <MAX> <MIN>                                                                 number of
                                                                                               SESSIONS that
                                                                                               should be
                                                                                               established with
                                                                                               Teradata. For
                                                                                               MultiLoad, the
                                                                                               optimal number of
                                                                                               sessions is the
                                                                                               number of AMPs in
                                                                                               the system, plus
                                                                                               two more.

                                                                                               You can also use
                                                                                               MAX or MIN, which
                                                                                               automatically use
                                                                                               the maximum or
                                                                                               minimum number
                                                                                               of sessions to
                                                                                               complete the job. If
                                                                                               you pecify nothing,
                                                                                               it will default to
                                                                                               MAX.
                                                                               Optional        Tells MultiLoad
          SLEEP                                                                                how frequently, in
                                                                                               minutes, to try
                                                                                               logging on to the
                                                                                               system.
                                                                               Required        Names up to 5
          TABLES Tablename1, Tablename2…, Tablename5                                           target tables.
                                                                               Optional        Tells MultiLoad
          TENACITY                                                                             how many hours to
                                                                                               try logging on

Reprinted for ibmkvskumar@in.ibm.com, IBM                  Coffing Data Warehousing, Coffing Publishing (c) 2005, Copying Prohibited
Teradata Utilities: BTEQ, FastLoad, MultiLoad, TPump, and FastExport, Second Edition                                             12

                                                                                               when its initial
                                                                                               effort to do so is
                                                                                               rebuffed.
                                                                               Optional        Names the
          WORKTABLES Tablename1, Tablename2…, Tablename5                                       worktable(s), one
                                                                                               per target table.



Figure 5-5


 Parameters for .BEGIN DELETE MLOAD
Here is a list of components or parameters that may be used in the BEGIN DELETE command.
Note: The parameters do not require the usual dot prior to the command since parameters are
actually sub-commands.



 A Simple Multiload IMPORT Script

         "We must use time as a tool, not as a crutch."
         – John F. Kennedy

Ask Not – What your Multiload can do for you. Ask what you can do for your Multiload. Multiload is a
great tool when you're short on time. Multiload can update, insert, delete or upsert on Teradata
tables that are already populated. It can even do all four in one script. Our flatfile will contain
Employee_numbers and Salaries * 2. We are giving a big raise. We're going to create a flat file to
use with Multiload, as shown below:

Let's create a flat file for our Multiload




Let's Execute it:




Remember, we'll still use the BTEQ utility to create our flat file.
Reprinted for ibmkvskumar@in.ibm.com, IBM                  Coffing Data Warehousing, Coffing Publishing (c) 2005, Copying Prohibited
Teradata Utilities: BTEQ, FastLoad, MultiLoad, TPump, and FastExport, Second Edition                                             13




 Building our Multiload Script

         "I can accept failure, but I can't accept not trying."
         - Michael Jordan

Getting these scripts down is a very hard process, so don't be discouraged if you have a couple of
mistakes. The next two slides will show you a blank copy of the basic Multiload script, as well as a
marked slide illustrating the important parts of the script:




         "If you don't know where you're going, any road will take you there."
         - Lewis Carrol

Creating our Multiload script
Reprinted for ibmkvskumar@in.ibm.com, IBM                  Coffing Data Warehousing, Coffing Publishing (c) 2005, Copying Prohibited
Teradata Utilities: BTEQ, FastLoad, MultiLoad, TPump, and FastExport, Second Edition                                             14




 Executing Multiload

         "Ambition is a dream with a V8 Engine."
         - Elvis Presley

You will feel like the King after executing your first Multiload script. Multiload is the Elvis Presley of
data warehousing because nobody knows how make more records then Multiload. If you have the
ambition to learn, this book will give you what it takes to steer through these utilities. We initialize
the Multiload utility like we do with BTEQ, except that the keyword with Multiload Is mload.
Remember that this Multiload is going to double the salaries of our employees.

Let's execute our Multiload script




Here is a before and after image of our Employee_table02:




Reprinted for ibmkvskumar@in.ibm.com, IBM                  Coffing Data Warehousing, Coffing Publishing (c) 2005, Copying Prohibited
Teradata Utilities: BTEQ, FastLoad, MultiLoad, TPump, and FastExport, Second Edition                                             15




 Another Simple MultiLoad IMPORT Script

         "Those who dance are considered insane by those who cannot hear the music."
         - George Carlin

MultiLoad can be somewhat intimidating to the new user because there are many commands and
phases. In reality, the load scripts are understandable when you think through what the IMPORT
mode does:

       • Setting up a Logtable


       • Logging onto Teradata


       • Identifying the Target, Work and Error tables


       • Defining the INPUT flat file


       • Defining the DML activities to occur


       • Naming the IMPORT file


       • Telling MultiLoad to use a particular LAYOUT


Reprinted for ibmkvskumar@in.ibm.com, IBM                  Coffing Data Warehousing, Coffing Publishing (c) 2005, Copying Prohibited
Teradata Utilities: BTEQ, FastLoad, MultiLoad, TPump, and FastExport, Second Edition                                             16

       • Telling the system to start loading


       • Finishing loading and logging off of Teradata



This first script example is designed to show MultiLoad IMPORT in its simplest form. It depicts the
loading of a three-column Employee table. The actual script is in the left column and our comments
are on the right. Below the script is a step-by-step description of how this script works.

Step One: Setting up a Logtable and Logging onto Teradata — MultiLoad requires you specify a
log table right at the outset with the .LOGTABLE command. We have called it CDW_Log. Once you
name the Logtable, it will be automatically created for you. The Logtable may be placed in the same
database as the target table, or it may be placed in another database. Immediately after this you log
onto Teradata using the .LOGON command. The order of these two commands is interchangeable,
but it is recommended to define the Logtable first and then to Log on, second. If you reverse the
order, Teradata will give a warning message. Notice that the commands in MultiLoad require a dot
in front of the command key word.

Step Two: Identifying the Target, Work and Error tables — In this step of the script you must tell
Teradata which tables to use. To do this, you use the .BEGIN IMPORT MLOAD command. Then
you will preface the names of these tables with the sub-commands TABLES, WORKTABLES AND
ERROR TABLES. All you must do is name the tables and specify what database they are in. Work
tables and error tables are created automatically for you. Keep in mind that you get to name and
locate these tables. If you do not do this, Teradata might supply some defaults of its own!

At the same time, these names are optional. If the WORKTABLES and ERRORTABLES had not
specifically been named, the script would still execute and build these tables. They would have
been built in the default database for the user. The name of the worktable would be
WT_EMPLOYEE_DEPT1 and the two error tables would be called ET_EMPLOYEE_DEPT1 and
UV_EMPLOYEE_DEPT1, respectively.

Sometimes, large Teradata systems have a work database with a lot of extra PERM space. One
customer calls this database CORP_WORK. This is where all of the logtables and worktables are
normally created. You can use a DATABASE command to point all table creations to it or qualify the
names of these tables individually.

Step Three: Defining the INPUT flat file record structure — MultiLoad is going to need to know
the structure the INPUT flat file. Use the .LAYOUT command to name the layout. Then list the fields
and their data types used in your SQL as a .FIELD. Did you notice that an asterisk is placed
between the column name and its data type? This means to automatically calculate the next byte in
the record. It is used to designate the starting location for this data based on the previous fields
length. If you are listing fields in order and need to skip a few bytes in the record, you can either use
the .FILLER (like above) to position to the cursor to the next field, or the "*" on the Dept_No field
could have been replaced with the number 132 (CHAR(11)+CHAR(20)+CHAR(100)+1). Then, the
.FILLER is not needed. Also, if the input record fields are exactly the same as the table, the .TABLE
can be used to automatically define all the .FIELDS for you. The LAYOUT name will be referenced
later in the .IMPORT command. If the input file is created with INDICATORS, it is specified in the
LAYOUT.

Step Four: Defining the DML activities to occur — The .DML LABEL names and defines the
SQL that is to execute. It is like setting up executable code in a programming language, but using
SQL. In our example, MultiLoad is being told to INSERT a row into the SQL01.Employee_Dept
table. The VALUES come from the data in each FIELD because it is preceded by a colon (:). Are
you allowed to use multiple labels in a script? Sure! But remember this: Every label must be
referenced in an APPLY clause of the .IMPORT clause.

Step Five: Naming the INPUT file and its format type — This step is vital! Using the .IMPORT
command, we have identified the INFILE data as being contained in a file called
"CDW_Join_Export.txt". Then we list the FORMAT type as TEXT. Next, we referenced the
LAYOUT named FILEIN to describe the fields in the record. Finally, we told MultiLoad to APPLY the

Reprinted for ibmkvskumar@in.ibm.com, IBM                  Coffing Data Warehousing, Coffing Publishing (c) 2005, Copying Prohibited
Teradata Utilities: BTEQ, FastLoad, MultiLoad, TPump, and FastExport, Second Edition                                             17

DML LABEL called INSERTS — that is, to INSERT the data rows into the target table. This is still a
sub-component of the .IMPORT MLOAD command. If the script is to run on a mainframe, the
INFILE name is actually the name of a JCL Data Definition (DD) statement that contains the real
name of the file.

Notice that the .IMPORT goes on for 4 lines of information. This is possible because it continues
until it finds the semi-colon to define the end of the command. This is how it determines one
operation from another. Therefore, it is very important or it would have attempted to process the
END LOADING as part of the IMPORT — it wouldn't work.

Step Six: Finishing loading and logging off of Teradata — This is the closing ceremonies for the
load. MultiLoad to wrap things up, closes the curtains, and logs off of the Teradata system.

         Important note: Since the script above in Figure 5-6 does not DROP any tables, it is
         completely capable of being restarted if an error occurs. Compare this to the next
         script in Figure 5-7. Do you think it is restartable? If you said no, pat yourself on the
         back.


                                          REQUIRED
          PARAMETER                       OR NOT               WHAT IT DOES
                                          Required             Names the Target table.
          TABLES Tablename1
                                          Optional             Names the worktable one per target table.
          WORKTABLES Tablename1
                                          Optional             Names the two error tables, two per target
          ERRORTABLES                                          table and there is no comma separator
            ET_ERR UV_ERR                                      between them.
                                          Optional             Tells MultiLoad how many hours to try
          TENACITY                                             establishing sessions when its initial effort
                                                               to do so is rebuffed.


         Figure 5-6

                                                                          Sets Up a Logtable and
                   /* Simple Mload script                         */      Logs on to Teradata
                   .LOGTABLE SQL01.CDW_Log;
                   .LOGON TDATA/SQL01,SQL0;
                                                      Begins the Load Process
                   .BEGIN IMPORT MLOAD TABLES         by naming the Target
                                   SQL01.Employee_Dept1
                                                      Table, Work table and
                      WORKTABLES   SQL01.CDW_WT
                      ERRORTABLES SQL01.CDW_ET        error tables; Notice NO
                                   SQL01.CDW_UV;      comma between the error
                                                      tables
                                                      Names the LAYOUT of the
                   .LAYOUT FILEIN;                    INPUT record and defines
                      .FIELD Employee_No * CHAR(11);  its structure; Notice the
                      .FIELD Last_Name   * CHAR(20);
                      .FILLER Junk_stuff * CHAR(100); dots before the FIELD and
                      .FIELD Dept_No     * CHAR(6);   FILLER and the
                                                      semi-colons after each
                                                      definition.
                                                      Names the DML Label
                   .DML LABEL INSERTS;
                                                                          Tells MultiLoad to INSERT
                   INSERT INTO SQL01.Employee_Dept1                       a row into the target table
                      (Employee_No                                        and defines the row
                      ,Last_Name
                      ,Dept_No )                                          format.


Reprinted for ibmkvskumar@in.ibm.com, IBM                  Coffing Data Warehousing, Coffing Publishing (c) 2005, Copying Prohibited
Teradata Utilities: BTEQ, FastLoad, MultiLoad, TPump, and FastExport, Second Edition                                             18

                   VALUES                                                 Lists, in order, the
                      (:Employee_No                                       VALUES (each one
                      ,:Last_Name
                      ,:Dept_No );                                        preceded by a colon) to
                                                                          be INSERTed.
                                                                          Names the Import File and
                   .IMPORT INFILE CDW_Join_Export.txt                     its Format type; Cites the
                       FORMAT TEXT                                        LAYOUT file to use tells
                       LAYOUT FILEIN
                       APPLY INSERTS;                                     Mload to APPLY the
                                                                          INSERTs.
                                                                          Ends MultiLoad and Logs
                   .END MLOAD;                                            off all MultiLoad sessions
                   .LOGOFF;




         Figure 5-7



 MultiLoad IMPORT Script
Let's take a look at MultiLoad IMPORT script that comes from real life. This sample script will look
much more like what you might encounter at your workplace. It is more detailed. The notes to the
right are brief and too the point. They will help you can grasp the essence of what is happening in
the script.

                                                                                        Load Runs from a
          /* !/bin/ksh*                                     */                          Shell Script
                                                                                        Any words between /*
          /* +++++++++++++++++++++++++++++++++++++*/                                    … */ are comments
          /* MultiLoad SCRIPT                     */                                    only and are not
          /*This script is designed to change the */
          /*EMPLOYEE_DEPT1 table using the data found */                                processed by
          /* in IMPORT INFILE CDW_Join_Export.txt */                                    Teradata.
          /* Version 1.1                          */
          /* Created by Coffing Data Warehousing */            Names and describes
          /* +++++++++++++++++++++++++++++++++++++*/           the purpose of the
                                                               script; names the
                                                               author
                                                               Secures the logon by
          .LOGTABLE SQL01.CDW_Log;                             storing userid and
          .RUN FILE LOGON.TXT;                                 password in a separate
          /*Drop Error Tables — caution, this script cannot be file, then reads it.
          restarted because these tables would be needed */
            DROP TABLE SQL01.CDW_ET;                          Drops Existing error
            DROP TABLE SQL01.CDW_UV;                          tables and cancels the
                                                              ability for the script to
                                                              restart – DON'T
                                                              ATTEMPT THIS AT
                                                              HOME! Also, SQL does
                                                              not use a dot (.)
                                                              Begins the Load
          /* Begin Import and Define Work and Error Tables */ Process by telling us
          .BEGIN IMPORT MLOAD TABLES                          first the names of the
                           SQL01.Employee_Dept1
                WORKTABLES                                    target table, Work table
                           SQL01.CDW_WT                       and error tables; note
                ERRORTABLES                                   NO comma between
                           SQL01.CDW_ET                       the names of the error
                           SQL01.CDW_UV;                      tables
                                                              Names the LAYOUT of
          /* Define Layout of Input File */                   the INPUT file.
Reprinted for ibmkvskumar@in.ibm.com, IBM                  Coffing Data Warehousing, Coffing Publishing (c) 2005, Copying Prohibited
Teradata Utilities: BTEQ, FastLoad, MultiLoad, TPump, and FastExport, Second Edition                                             19

          .LAYOUT FILEIN;                                                               Defines the structure
                  .FIELD Employee_No              *   CHAR(11);                         of the INPUT file.
                  .FIELD First_Name               *   CHAR(14);
                  .FIELD Last_Name                *   CHAR(20);                         Notice the dots before
                  .FIELD Dept_No                  *   CHAR(6);                          the FIELD command
                  .FIELD Dept_Name                *   CHAR(20);                         and the semi-colons
                                                                                        after each FIELD
                                                                                        definition.

                                                                                        Names the DML Label
          /* Begin INSERT Process on Table */
          .DML LABEL INSERTS;                                                           Tells MultiLoad to
            INSERT INTO SQL01.Employee_Dept1
                ( Employee_No                                                           INSERT a row into the
                 ,First_Name                                                            target table and
                 ,Last_Name                                                             defines the row format.
                 ,Dept_No
                 ,Dept_Name )                                                           Note that we place
               VALUES
               ( :Employee_No                                                           comma separators in
                ,:First_Name                                                            front of the following
                ,:Last_Name                                                             column or value for
                ,:Dept_No                                                               easier debugging.
                ,:Dept_Name );
                                                                                        Lists, in order, the
                                                                                        VALUES to be
                                                                                        INSERTed.
                                                                                        Names the Import File
          /* Specify IMPORT File and Apply Parameters */                                and States its Format
          .IMPORT INFILE CDW_Join_Export.txt                                            type; Names the
                  FORMAT TEXT
                  LAYOUT FILEIN                                                         Layout file to use And
                  APPLY INSERTS;                                                        tells MultiLoad to
                                                                                        APPLY the INSERTs.
                                                                                        Ends MultiLoad and
          .END MLOAD;                                                                   Logs off of Teradata
          .LOGOFF;




Figure 5-8


 Error Treatment Options for the .DML LABEL Command
MultiLoad allows you to tailor how it deals with different types of errors that it encounters during the
load process, to fit your needs. Here is a summary of the options available to you:




Reprinted for ibmkvskumar@in.ibm.com, IBM                  Coffing Data Warehousing, Coffing Publishing (c) 2005, Copying Prohibited
Teradata Utilities: BTEQ, FastLoad, MultiLoad, TPump, and FastExport, Second Edition                                             20

 ERROR TREATMENT OPTIONS FOR .DML LABEL

.DML LABEL {labelname}
   {MARK | IGNORE} DUPLICATE [INSERT |UPDATE] ROWS
   {MARK | IGNORE} MISSING [INSERT |UPDATE] ROWS
      DO INSERT FOR [MISSING UPDATE] ROWS ;


Figure 5-9
In IMPORT mode, you may specify as many as five distinct error-treatment options for one
.DML statement. For example, if there is more than one instance of a row, do you want MultiLoad to
IGNORE the duplicate row, or to MARK it (list it) in an error table?

If you do not specify IGNORE, then MultiLoad will MARK, or record all of the errors. Imagine you
have a standard INSERT load that you know will end up recording about 20,000 duplicate row
errors. Using the following syntax "IGNORE DUPLICATE INSERT ROWS;" will keep them out of the
error table. By ignoring those errors, you gain three benefits:

      1. You do not need to see all the errors.


      2. The error table is not filled up needlessly.


      3. MultiLoad runs much faster since it is not conducting a duplicate row check.



When doing an UPSERT, there are two rules to remember:



       • The default is IGNORE MISSING UPDATE ROWS. Mark is the default for all operations.
         When doing an UPSERT, you anticipate that some rows are missing, otherwise, why do an
         UPSERT. So, this keeps these rows out of your error table.


       • The DO INSERT FOR MISSING UPDATE ROWS is mandatory. This tells MultiLoad to
         insert a row from the data source if that row does not exist in the target table because the
         update didn't find it.




Reprinted for ibmkvskumar@in.ibm.com, IBM                  Coffing Data Warehousing, Coffing Publishing (c) 2005, Copying Prohibited
Teradata Utilities: BTEQ, FastLoad, MultiLoad, TPump, and FastExport, Second Edition                                             21

         The table that follows shows you, in more detail, how flexible your options are:
                              ERROR TREATMENT OPTIONS IN DETAIL
             .DML LABEL                                  WHAT IT DOES
                OPTION
         MARK DUPLICATE This option logs an entry for all duplicate INSERT rows in the
         INSERT ROWS             UV_ERR table. Use this when you want to know about the
                                 duplicates.
         IGNORE DUPLICATE This tells MultiLoad to IGNORE duplicate INSERT rows
         INSERT ROWS             because you do not want to see them.
         MARK DUPLICATE This logs the existence of every duplicate UPDATE row.
         UPDATE ROWS
         IGNORE DUPLICATE This eliminates the listing of duplicate update row errors.
         UPDATE ROWS
         MARK MISSING            This option ensures a listing of data rows that had to be
         UPDATE ROWS             INSERTed since there was no row to UPDATE.
         IGNORE MISSING          This tells MultiLoad NOT to list UPDATE rows as an error.
         UPDATE ROWS             This is a good option when doing an UPSERT since
                                 UPSERT will INSERT a new row.
         MARK MISSING            This option makes a note in the ET_Error Table that a row to
         DELETE ROWS             be deleted is missing.
         IGNORE MISSING          This option says, "Do not tell me that a row to be deleted is
         DELETE ROWS             missing.
         DO INSERT for           This is required to accomplish an UPSERT. It tells MultiLoad
         MISSING UPDATE          that if the row to be updated does not exist in the target table,
         ROWS                    then INSERT the entire row from the data source.



Figure 5-10


 An IMPORT Script with Error Treatment Options
The command .DML LABEL names any DML options (INSERT, UPDATE OR DELETE) that
immediately follow it in the script. Each label must be given a name. In IMPORT mode, the label will
be referenced for use in the APPLY Phase when certain conditions are met. The following script
provides an example of just one such possibility:

                                                                                        Load Runs from a
          /* !/bin/ksh*                                        */                       Shell Script
                                                                                        Any words between /*
          /* +++++++++++++++++++++++++++++++++++++*/                                    … */ are COMMENTS
          /* MultiLoad SCRIPT                      */                                   ONLY and are not
          /*This script is designed to change the */
          /*EMPLOYEE_DEPT table using the data from */                                  processed by
          /* the IMPORT INFILE CDW_Join_Export.txt */                                   Teradata.
          /* Version 1.1                              */
          /* Created by Coffing Data Warehousing*/             Names and describes
          /* +++++++++++++++++++++++++++++++++++++ */          the purpose of the
                                                               script; names the
                                                               author
                                                               Sets up a Logtable and
          /* Setup the MulitLoad Logtables, Logon Statements*/ then logs on to
          .LOGTABLE SQL01.CDW_Log;                             Teradata.
          .LOGON TDATA/SQL01,SQL01;
          DATABASE SQL01;                                                               Specifies the database
                                                                                        in which to find the
                                                                                        target table.

Reprinted for ibmkvskumar@in.ibm.com, IBM                  Coffing Data Warehousing, Coffing Publishing (c) 2005, Copying Prohibited
Teradata Utilities: BTEQ, FastLoad, MultiLoad, TPump, and FastExport, Second Edition                                             22

          /*Drop Error Tables */                              Drops Existing error
            DROP TABLE WORKDB.CDW_ET;                         tables in the work
            DROP TABLE WORKDB.CDW_UV;
                                                              database.
                                                              Begins the Load
          /* Begin Import and Define Work and Error Tables */ Process by telling us
          .BEGIN IMPORT MLOAD TABLES                          first the names of the
                      Employee_Dept
                WORKTABLES                                    Target Table, Work
                      WORKDB.CDW_WT                           table and error tables
                ERRORTABLES                                   are in a work database.
                      WORKDB.CDW_ET                           Note there is no
                      WORKDB.CDW_UV;                          comma between the
                                                              names of the error
                                                              tables (pair).
                                                              Names the LAYOUT of
          /* Define Layout of Input File */                   the INPUT file.
          .LAYOUT FILEIN;
              .FIELD Employee_No            *   CHAR(11);                               Defines the structure
              .FIELD First_Name             *   CHAR(14);                               of the INPUT file.
              .FIELD Last_Name              *   CHAR(20);                               Notice the dots before
              .FIELD Dept_No                *   CHAR(6);                                the FIELD command
              .FIELD Dept_Name              *   CHAR(20);                               and the semi-colons
                                                                                        after each FIELD
                                                                                        definition.

                                                                                        Names the DML Label
          /* Begin INSERT Process on Table */
          .DML LABEL INSERTS                                                            Tells MultiLoad NOT
          IGNORE DUPLICATE INSERT ROWS;
            INSERT INTO SQL01.Employee_Dept                                             TO LIST duplicate
                ( Employee_No                                                           INSERT rows in the
                 ,First_Name                                                            error table; notice the
                 ,Last_Name                                                             option is placed
                 ,Dept_No                                                               AFTER the LABEL
                 ,Dept_Name)
            VALUES                                                                      identification and
                ( :Employee_No                                                          immediately BEFORE
                 ,:First_Name,                                                          the DML function.
                 ,:Last_Name,
                 ,:Dept_No,                                                             Lists, in order, the
                 ,:Dept_Name);
                                                                                        VALUES to be
                                                                                        INSERTed.
                                                                                        Names the Import File
          /* Specify IMPORT File and Apply Parameters */                                and States its Format
          .IMPORT INFILE CDW_Join_Export.txt                                            type; names the
              FORMAT TEXT
              LAYOUT FILEIN                                                             Layout file to use and
              APPLY INSERTS;                                                            tells MultiLoad to
                                                                                        APPLY the INSERTs.
                                                                                        Ends MultiLoad and
          .END MLOAD;                                                                   logs off of Teradata
          .LOGOFF;




Figure 5-11


 A IMPORT Script that Uses Two Input Data Files




Reprinted for ibmkvskumar@in.ibm.com, IBM                  Coffing Data Warehousing, Coffing Publishing (c) 2005, Copying Prohibited
Teradata Utilities: BTEQ, FastLoad, MultiLoad, TPump, and FastExport, Second Edition                                             23

                                                                                         Load Runs from a
          /* !/bin/ksh*                                                         */       Shell Script
                                                                                         Any words between /*
          /*MultiLoad IMPORT SCRIPT with two INPUT files                        */       … */ are comments
          */                                                                             only and are not
          /*This script INSERTs new rows into the                               */
          /* Employee_table and UPDATEs the Dept_Name                           */       processed by
          /*in the Department_table.                                            */       Teradata.
          /* Version 1.1                                                        */
          /* Created by Coffing Data Warehousing                                */
          /* +++++++++++++++++++++++++++++++++++++*/
                                                                                         Sets up a Logtable
          .LOGTABLE SQL01.EMPDEPT_LOG;                                                   and logs on with
          .RUN FILE c:\mydir\logon.txt;                                                  .RUN.
                                                                                         The logon.txt file
                                                                                         contains:

                                                                                         .logon
                                                                                         TDATA/SQL01,SQL01;
                                                                                         Drops the worktables
          DROP   TABLE    SQL01.EMP_WT;                                                  and error tables, in
          DROP   TABLE    SQL01.DEPT_WT;                                                 case they existed
          DROP   TABLE    SQL01.EMP_ET;
          DROP   TABLE    SQL01.EMP_UV;                                                  from a prior load;
          DROP   TABLE    SQL01.DEPT_ET;                                                 NOTE: Do NOT
          DROP   TABLE    SQL01.DEPT_UV;                                                 include IF you want to
                                                                                         RESTART using
                                                                                         CHECKPOINT.
                                                                                         Identifies the 2 target
          /* the following defines 2 tables for loading */                               tables with a comma
          .BEGIN IMPORT MLOAD
                                                                                         between them.
             TABLES
                    SQL01.Employee_Table,                                                Names the worktable
                    SQL01.Department_Table                                               and error tables for
             WORKTABLES                                                                  each target table;
                    SQL01.EMP_WT,
                    SQL01.DEPT_WT
             ERRORTABLES                                        Note there are NO
                    SQL01.EMP_ET                                commas between the
                    SQL01.EMP_UV,                               pair of names, but
                    SQL01.DEPT_ET                               there is a comma
                    SQL01.DEPT_UV;                              between this pair and
                                                                the next pair.
                                                                Names and Defines
          /* these next 2 LAYOUTs define 2 different records */ the LAYOUT of the 1st
          .LAYOUT FILEIN1;                                      INPUT file
              .FIELD Emp_No     * INTEGER;
                .FIELD      LName         *   CHAR(20);
                .FIELD      FName         *   VARCHAR(20);
                .FIELD      Sal           *   DECIMAL (10,2);
                .FIELD      Dept_Num      *   INTEGER;
                                                                                         Names and Defines
          .LAYOUT FILEIN2;                                                               the LAYOUT of the
              .FIELD DeptNo               * CHAR(6);                                     2nd INPUT file
              .FIELD DeptName             * CHAR(20);
                                                                                         Names the 1st DML
          .DML LABEL EMP_INS                                                             Label; Tells MultiLoad
            IGNORE DUPLICATE INSERT ROWS;                                                to IGNORE duplicate
            INSERT INTO SQL01.Employee_Table
            VALUES (:Emp_No                                                              INSERT rows because
                   ,:FName                                                               you do not want to
                   ,:LName                                                               see them.
                   ,:Sal
                   ,:Dept_Num);                                                          INSERT a row into the
                                                                                         table, but does NOT
                                                                                         name the columns. So
Reprinted for ibmkvskumar@in.ibm.com, IBM                  Coffing Data Warehousing, Coffing Publishing (c) 2005, Copying Prohibited
Teradata Utilities: BTEQ, FastLoad, MultiLoad, TPump, and FastExport, Second Edition                                             24

                                                                                         all VALUES are
                                                                                         passed IN THE
                                                                                         ORDER they are
                                                                                         defined in the
                                                                                         Employee table.

                                                                                         Names the 2nd DML
          .DML LABEL DEPT_UPD;                                                           Label;
           UPDATE   Department_Table
             SET    Dept_Name = :DeptName
             WHERE Dept_No = :DeptNo;                                                    Tells MultiLoad to
                                                                                         UPDATE when it finds
                                                                                         Deptno (record) equal
                                                                                         to the Dept_No in the
                                                                                         Department_table and
                                                                                         change the
                                                                                         Dept_name column
                                                                                         with the DeptName
                                                                                         from the INPUT file.
                                                                                         Names the TWO
          .IMPORT INFILE Emp_Data                                                        Import Files
            LAYOUT FILEIN1
            APPLY EMP_INS;
                                                                                         Names the TWO
          .IMPORT INFILE Dept_Data                                                       Layouts that define
            LAYOUT FILEIN2                                                               the structure of the
            APPLY DEPT_UPD;                                                              INPUT DATA files …
                                                                                         and tells MultiLoad to
                                                                                         APPLY the INSERTs
                                                                                         to target table 1 and
                                                                                         the UPDATEs to
                                                                                         target table 2.
                                                                                         Ends MultiLoad and
          .END MLOAD;                                                                    logs off of Teradata.
          .LOGOFF;




Figure 5-12


 Redefining the INPUT
Sometimes, instead of using two different INPUT DATA files, which require two separate LAYOUTs,
you can combine them into one INPUT DATA file. And you can use that one file, with just one
LAYOUT to load more than one table! You see, a flat file may contain more than one type of data
record. As long as each record has a unique code to identify it, MultiLoad can check this code and
know which layout to use for using different names in the same layout. To do this you will need to
REDEFINE the INPUT. You do this by redefining a field's position in the .FIELD or .FILLER section
of the LAYOUT. Unlike the asterisk (*), which means that a field simply follows the previous one,
redefining will cite a number that tells MultiLoad to take a certain portion of the INPUT file and jump
to the redefined position to back toward the beginning of the record.



 A Script that Uses Redefining the Input
The following script uses the ability to define two record types in the same input data file. It uses a
.FILLER to define the code since it is never used in the SQL, only to determine which SQL to run.




Reprinted for ibmkvskumar@in.ibm.com, IBM                  Coffing Data Warehousing, Coffing Publishing (c) 2005, Copying Prohibited
Teradata Utilities: BTEQ, FastLoad, MultiLoad, TPump, and FastExport, Second Edition                                             25

                                                                                         Load Runs from a
          /* !/bin/ksh*                                       */                         Shell Script
                                                                                         Any words between /*
          /* +++++++++++++++++++++++++++++++++++++*/                                     … */ are comments
          /* MultiLoad IMPORT SCRIPT with multiple target */                             only and are not
          /*tables and DML labels                 */
          /*This script INSERTs new rows into the */                                     processed by
          /* Employee_table and UPDATEs the Dept_Name */                                 Teradata.
          /*in the Department_table               */
          /* Version 1.1                          */
          /* Created by Coffing Data Warehousing */
          /* +++++++++++++++++++++++++++++++++++++*/
                                                                Sets Up a Logtable
          .LOGTABLE SQL01.EmpDept_Log;                          and Logs on to
          .LOGON TDATA/SQL01,SQL01;                             Teradata; Optionally,
                                                                specifies the
                                                                database to work in.
                                                                Identifies the 2 target
          /* 2 target tables, 2 work tables, 2 error tables per tables;
          target table, defined in pairs                             */
          BEGIN IMPORT MLOAD
              TABLES                                                                     Names the worktable
                     SQL01.Employee_Table,                                               and error tables for
                     SQL01.Department_Table                                              each target tables;
              WORKTABLES
                     SQL01.EMP_WT,                                                       Note there is no
                     SQL01.DEPT_WT
              ERRORTABLES                                                                comma between the
                     SQL01.EMP_ET                                                        names of the error
                     SQL01.EMP_UV,                                                       tables but there is a
                     SQL01.DEPT_ET                                                       comma between the
                     SQL01 .DEPT_UV;                                                     pair of error tables.
                                                                                         Names and defines
          .LAYOUT FILEIN;                                                                the LAYOUT of the
            .FILLER Trans                 *    CHAR (1);                                 INPUT record. The
            .FIELD   Emp_No               *    INTEGER;
            .FIELD   Dept_Num             *    INTEGER;                                  FILLER is for a field
            .FIELD   LName                *    CHAR(20);                                 that tells what type of
            .FIELD   FName                *    VARCHAR(20);                              record has been read.
            .FIELD   Sal                  *    DECIMAL (10,2);                           Here that field
            .FIELD   DeptNo               2     INTEGER;                                 contains an "E" or a
            .FIELD   DeptName             *    CHsssssssAR(20);
                                                                                         "D". The "E" tells
                                                                                         MLOAD use the
                                                                                         Employee data and
                                                                                         the "D" is for
                                                                                         department data.

                                                                                         The definition for
                                                                                         Dept_Num tells
                                                                                         MLOAD to jump
                                                                                         backward to byte 2.
                                                                                         Where as the * for
                                                                                         Emp_Num defaulted
                                                                                         to byte 2. So, Emp_No
                                                                                         and Dept_Num both
                                                                                         start at byte 2, but in
                                                                                         different types of
                                                                                         records. When Trans
                                                                                         (byte position 1)
                                                                                         contains a "D", the
                                                                                         APPLY uses the dept
                                                                                         names and for an "E"
                                                                                         the APPLY uses the
                                                                                         employee data.


Reprinted for ibmkvskumar@in.ibm.com, IBM                  Coffing Data Warehousing, Coffing Publishing (c) 2005, Copying Prohibited
Teradata Utilities: BTEQ, FastLoad, MultiLoad, TPump, and FastExport, Second Edition                                             26

          .DML LABEL EMPIN                                                               Names the 1st DML
            IGNORE DUPLICATE INSERT ROWS;                                                Label; Tells MultiLoad
          INSERT INTO SQL01.Employee_Table                                               to IGNORE duplicate
            VALUES ( :Emp_No                                                             INSERT rows because
                    ,:FName                                                              you do not want to
                    ,:LName                                                              see them.
                    ,:Sal
                    ,:Dept_Num );
                                                                                         Tells MultiLoad to
                                                                                         INSERT a row into the
                                                                                         1st target table but
                                                                                         optionally does NOT
                                                                                         define the target table
                                                                                         row format. All the
                                                                                         VALUES are passed
                                                                                         to the columns of the
                                                                                         Employee table IN
                                                                                         THE ORDER of that
                                                                                         table's row format.
                                                                                         Names the 2nd DML
          .DML LABEL DEPTIN;                                                             Label;
          UPDATE Department_Table
            SET   Dept_Name = :DeptName
            WHERE Dept_No = :DeptNo;                                                     Tells MultiLoad to
                                                                                         UPDATE the 2nd
                                                                                         target table but
                                                                                         optionally does NOT
                                                                                         define that table's row
                                                                                         format. When the
                                                                                         VALUE of the DeptNo
                                                                                         equals that of the
                                                                                         Dept_No column of
                                                                                         the Department, then
                                                                                         update the
                                                                                         Dept_Name column
                                                                                         with the DeptName
                                                                                         from the INPUT file.
                                                                                         Ends MultiLoad and
          .IMPORT INFILE UPLOAD.dat                                                      logs off of Teradata.
             LAYOUT FILEIN
             APPLY EMPIN WHERE Trans = 'E'
             APPLY DEPTIN WHERE Trans = 'D' ;
          .END MLOAD;
          .LOGOFF;




Figure 5-13


 DELETE MLOAD Script Using a Hard Coded Value
The next script demonstrates how to use the MultiLoad DELETE task. In this example, students no
longer enrolled in the university are being removed from the Student_Profile table, based upon the
registration date. The profile of any student who enrolled prior to this date will be removed.




Reprinted for ibmkvskumar@in.ibm.com, IBM                  Coffing Data Warehousing, Coffing Publishing (c) 2005, Copying Prohibited
Teradata Utilities: BTEQ, FastLoad, MultiLoad, TPump, and FastExport, Second Edition                                             27

.LOGTABLE RemoveLog;                             Identifies the Logtable and logs onto Teradata with a valid
.LOGON TDATA/SQL01,SQL01;                        logon string.
                                                 Begins MultiLoad in DELETE mode and Names the target
.BEGIN DELETE MLOAD                              table.
   TABLES Order_Table;
                                                 SQL DELETE statement does a massive delete of order
DELETE FROM Order_Table                          data for orders placed prior to the hard coded date in the
 WHERE Order_Date < '99/12/31';                  WHERE clause. Notice that this is not the Primary Index.
                                                 You CANNOT DELETE in DELETE MLOAD mode based
                                                 upon the Primary Index.
                                                 Ends loading and logs off of Teradata.
.END MLOAD;
 LOGOFF;


Figure 5-14
How many differences from a MultiLoad IMPORT script readily jump off of the page at you? Here
are a few that we saw:



       • At the beginning, you must specify the word "DELETE" in the .BEGIN MLOAD command.
         You need not specify it in the .END MLOAD command.


       • You will readily notice that this mode has no .DML LABEL command. Since it is focused on
         just one absolute function, no APPLY clause is required so you see no .DML LABEL.


       • Notice that the DELETE with a WHERE clause is an SQL function, not a MultiLoad
         command, so it has no dot prefix.


       • Since default names are available for worktables (WT_<target_tablename>) and error tables
         (ET_<target_tablename> and UV_<target_tablename>), they need not be specifically
         named, but be sure to define the Logtable.



Do not confuse the DELETE MLOAD task with the SQL delete task that may be part of a MultiLoad
IMPORT. The IMPORT delete is used to remove small volumes of data rows based upon the
Primary Index. On the other hand, the MultiLoad DELETE does global deletes on tables, bypassing
the Transient Journal. Because there is no Transient Journal, there are no rollbacks when the job
fails for any reason. Instead, it may be RESTARTed from a CHECKPOINT. Also, the MultiLoad
DELETE task is never based upon the Primary Index.

Because we are not importing any data rows, there is neither a need for worktables nor an
Acquisition Phase. One DELETE statement is sent to all the AMPs with a match tag parcel. That
statement will be applied to every table row. If the condition is met, then the row is deleted. Using
the match tags, each target block is read once and the appropriate rows are deleted.



 A DELETE MLOAD Script Using a Variable
This illustration demonstrates how passing the values of a data row rather than a hard coded value
may be used to help meet the conditions stated in the WHERE clause. When you are passing
values, you must add some additional commands that were not used in the DELETE example with
hard coded values.



Reprinted for ibmkvskumar@in.ibm.com, IBM                  Coffing Data Warehousing, Coffing Publishing (c) 2005, Copying Prohibited
Teradata Utilities: BTEQ, FastLoad, MultiLoad, TPump, and FastExport, Second Edition                                             28

.LOGTABLE RemoveLog;                       Identifies the Logtable and logs onto Teradata with a valid
.LOGON TDATA/SQL01,SQL01;                  logon string.
                                           Begins the DELETE task and names only one table, but still
.BEGIN DELETE MLOAD                        uses TABLES option.
      TABLES Order_Table;
                             Names the LAYOUT and defines the column whose value will
.LAYOUT OldMonth             be passed as a single row to MultiLoad. In this case, all of the
 .FIELD OrdDate * DATE;      order dates in the Order_Table will be tested against this
                             OrdDate value.
                             The condition in the WHERE clause is that the data rows with
DELETE FROM Order_Table      orders placed prior to the date value (:OrdDate) passed from
WHERE Order_Date < :OrdDate; the LAYOUT OldMonth will be DELETEd from the
                             Order_Table.
                             Note that this time there is no dot in front of LAYOUT in this
.IMPORT INFILE               clause since it is only being referenced.
      LAYOUT OldMonth ;
                                           Ends loading and logs off of Teradata.
.END MLOAD;
.LOGOFF;


Figure 5-15


 An UPSERT Sample Script
The following sample script is provided to demonstrate how to do an UPSERT — that is, to update
a table and if a row from the data source table does not exist in the target table, then insert a new
row. In this instance we are loading the Student_Profile table with new data for the next semester.
The clause "DO INSERT FOR MISSING UPDATE ROWS" indicates an UPSERT. The DML
statements that follow this option must be in the order of a single UPDATE statement followed by a
single INSERT statement.

                                                                                              Load Runs from a
          /* !/bin/ksh*                                        */ shell script; Any
          /* +++++++++++++++++++++++++++++++++++++++++++++++++ */ words between /*
          /* MultiLoad UPSERT SCRIPT                           */
          /*This script Updates the Student_Profile Table      */ … */ are comments
          /* with new data and Inserts a new row into the table */ only and are not
          /* if the row to be updated does not exist.          */ processed by
          /* Version 1.1                                        */ Teradata;
          /* Created by Coffing Data Warehousing                */
          /* ++++++++++++++++++++++++++++++++++++++++++++++++++*/
                                                                                              Names and
                                                                                              describes the
                                                                                              purpose of the
                                                                                              script; names the
                                                                                              author.
                                                                                              Sets Up a Logtable
          /* Setup Logtable, Logon Statements*/                                               and then logs on
                                                                                              to Teradata.
          .LOGTABLE SQL01.CDW_Log;
          .LOGON CDW/SQL01,SQL01;
                                                                                              Begins the Load
          /* Begin Import and Define Work and Error Tables */                                 Process by telling
          .BEGIN IMPORT MLOAD TABLES
                                                                                              us first the names
             SQL01.Student_Profile                                                            of the target table,
             WORKTABLES SQL01.SWA_WT                                                          work table and
             ERRORTABLES SQL01.SWA_ET                                                         error tables.
                         SQL01.SWA_UV;
                                                                                              Names the
          /* Define Layout of Input File */                                                   LAYOUT of the
                                                                                              INPUT file;
Reprinted for ibmkvskumar@in.ibm.com, IBM                  Coffing Data Warehousing, Coffing Publishing (c) 2005, Copying Prohibited
Teradata Utilities: BTEQ, FastLoad, MultiLoad, TPump, and FastExport, Second Edition                                             29

          .LAYOUT FILEIN;                                                                     An ALL
               .FIELD Student_ID             *   INTEGER;                                     CHARACTER
               .FIELD Last_Name              *   CHAR (20);
               .FIELD First_Name             *   VARCHAR (12);                                based flat file.
               .FIELD Class_Code             *   CHAR (2);
               .FIELD Grade_Pt               *   DECIMAL(5,2);                                Defines the
                                                                                              structure of the
                                                                                              INPUT file; Notice
                                                                                              the dots before the
                                                                                              FIELD command
                                                                                              and the
                                                                                              semi-colons after
                                                                                              each FIELD
                                                                                              definition;
                                                                                              Names the DML
          /* Begin INSERT and UPDATE Process on Table */                                      Label
          .DML LABEL UPSERTER
                                                                                              Tells MultiLoad to
            DO INSERT FOR MISSING UPDATE ROWS;                                                INSERT a row if
          /* Without the above DO, one of these is guaranteed to                              there is not one to
          fail on this same table. If the UPDATE fails because
          rows is missing, it corrects by doing the INSERT */                                 be UPDATED, i.e.,
                                                                                              UPSERT.
          UPDATE SQL01.Student_Profile
          SET    Last_Name = :Last_Name                                                       Defines the
                ,First_Name = :First_Name                                                     UPDATE.
                ,Class_Code = :Class_Code
                ,Grade_Pt = :Grade_Pt
          WHERE Student_ID = :Student_ID;                                                     Qualifies the
                                                                                              UPDATE.
          INSERT INTO SQL01.Student_Profile
          VALUES (:Student_ID                                                                 Defines the
                 ,:Last_Name
                 ,:First_Name                                                                 INSERT.
                 ,:Class_Code
                 ,:Grade_Pt);                                                                 We recommend
                                                                                              placing comma
                                                                                              separators in front
                                                                                              of the following
                                                                                              column or value
                                                                                              for easier
                                                                                              debugging.
                                                                                              Names the Import
          .IMPORT INFILE CDW_IMPORT.DAT                                                       File and it names
              LAYOUT FILEIN                                                                   the Layout file to
              APPLY UPSERTER;
                                                                                              use and tells
                                                                                              MultiLoad to
                                                                                              APPLY the
                                                                                              UPSERTs.
                                                                                              Ends MultiLoad
          .END MLOAD;                                                                         and logs off of
          .LOGOFF;                                                                            Teradata



Figure 5-16


 What Happens when MultiLoad Finishes
MultiLoad Statistics


Reprinted for ibmkvskumar@in.ibm.com, IBM                  Coffing Data Warehousing, Coffing Publishing (c) 2005, Copying Prohibited
Teradata Utilities: BTEQ, FastLoad, MultiLoad, TPump, and FastExport, Second Edition                                             30

****08:06:41 UTY1803 Import Processing Statistics
                                  Import 1                                      Total Thus Far
Candidate Records considered . . . 70000                                            70000
Apply conditions satisfied . . . . 70000                                            70000
****08:06:38 UTY0818 Statistics for table Employee_Table
       INSERTS:                      25000
       UPDATES:                      25000
       DELETES:                      0
****08:06:41 UTY0818 Statistics for table Department_Table
       INSERTS:                      0
       UPDATES:                      20000
DELETES:                      0


Figure 5-17


 Troubleshooting Multiload Errors
The output statistics in the above example indicate that the load was entirely successful. But that is
not always the case. Now we need to troubleshoot in order identify the errors and correct them, if
desired. Earlier on, we noted that MultiLoad generates two error tables, the Acquisition Error and
the Application error table. You may select from these tables to discover the problem and research
the issues.

For the most part, the Acquisition error table logs errors that occur during that processing phase.
The Application error table lists Unique Primary Index violations, field overflow errors on non-PI
columns, and constraint errors that occur in the APPLY phase. MultiLoad error tables not only list
the errors they encounter, they also have the capability to STORE those errors. Do you remember
the MARK and IGNORE parameters? This is where they come into play. MARK will ensure that the
error rows, along with some details about the errors are stored in the error table. IGNORE does
neither; it is as if the error never occurred.

                  THREE COLUMNS SPECIFIC TO THE ACQUISITION ERROR TABLE
ErrorCode           System code that identifies the error.
ErrorField          Name of the column in the target table where the error happened; is left blank if
                    the offending column cannot be identified.
HostData            The data row that contains the error.


Figure 5-19




Reprinted for ibmkvskumar@in.ibm.com, IBM                  Coffing Data Warehousing, Coffing Publishing (c) 2005, Copying Prohibited
Teradata Utilities: BTEQ, FastLoad, MultiLoad, TPump, and FastExport, Second Edition                                             31

            THREE COLUMNS SPECIFIC TO THE APPLICATION ERROR TABLE
Uniqueness    Contains a certain value that disallows duplicate row errors in this table; can be
              ignored, if desired.
DBCErrorCode System code that identifies the error.
DBCErrorField Name of the column in the target table where the error happened; is left blank if the
              offending column cannot be identified. NOTE: A copy of the target table column
              immediately follows this column.


Figure 5-20


 RESTARTing Multiload
Who hasn't experienced a failure at some time when attempting a load? Don't take it personally!
Failures can and do occur on the host or Teradata (DBC) for many reasons. MultiLoad has the
impressive ability to RESTART from failures in either environment. In fact, it requires almost no
effort to continue or resubmit the load job. Here are the factors that determine how it works:

First, MultiLoad will check the Restart Logtable and automatically resume the load process from the
last successful CHECKPOINT before the failure occurred. Remember, the Logtable is essential for
restarts. MultiLoad uses neither the Transient Journal nor rollbacks during a failure. That is why you
must designate a Logtable at the beginning of your script. MultiLoad either restarts by itself or waits
for the user to resubmit the job. Then MultiLoad takes over right where it left off.

Second, suppose Teradata experiences a reset while MultiLoad is running. In this case, the host
program will restart MultiLoad after Teradata is back up and running. You do not have to do a thing!

Third, if a host mainframe or network client fails during a MultiLoad, or the job is aborted, you may
simply resubmit the script without changing a thing. MultiLoad will find out where it stopped and start
again from that very spot.

Fourth, if MultiLoad halts during the Application Phase it must be resubmitted and allowed to run
until complete.

Fifth, during the Acquisition Phase the CHECKPOINT (n) you stipulated in the .BEGIN MLOAD
clause will be enacted. The results are stored in the Logtable. During the Application Phase,
CHECKPOINTs are logged each time a data block is successfully written to its target table.

HINT: The default number for CHECKPOINT is 15 minutes, but if you specify the CHECKPOINT as
60 or less, minutes are assumed. If you specify the checkpoint at 61 or above, the number of
records is assumed.



 RELEASE MLOAD: When You DON'T Want to Restart MultiLoad
What if a failure occurs but you do not want to RESTART MultiLoad? Since MultiLoad has already
updated the table headers, it assumes that it still "owns" them. Therefore, it limits access to the
table(s). So what is a user to do? Well there is good news and bad news. The good news is that if
the job you may use the RELEASE MLOAD command to release the locks and rollback the job. The
bad news is that if you have been loading multiple millions of rows, the rollback may take a lot of
time. For this reason, most customers would rather just go ahead and RESTART.

Before V2R3: In the earlier days of Teradata it was NOT possible to use RELEASE MLOAD if one
of the following three conditions was true:

       • In IMPORT mode, once MultiLoad had reached the end of the Acquisition Phase you could
         not use RELEASE MLOAD. This is sometimes referred to as the "point of no return."


Reprinted for ibmkvskumar@in.ibm.com, IBM                  Coffing Data Warehousing, Coffing Publishing (c) 2005, Copying Prohibited
Teradata Utilities: BTEQ, FastLoad, MultiLoad, TPump, and FastExport, Second Edition                                             32


       • In DELETE mode, the point of no return was when Teradata received the DELETE
         statement.


       • If the job halted in the Apply Phase, you will have to RESTART the job.



With and since V2R3: The advent of V2R3 brought new possibilities with regard to using the
RELEASE MLOAD command. It can NOW be used in the APPLY Phase, if:



       • You are running a Teradata V2R3 or later version


       • You use the correct syntax:
         RELEASE MLOAD <target-table> IN APPLY


       • The load script has NOT been modified in any way


       • The target tables either:
              ♦ Must be empty, or


                ♦ Must have no Fallback, no NUSIs, no Permanent Journals




You should be very cautious using the RELEASE command. It could potentially leave your table half
updated. Therefore, it is handy for a test environment, but please don't become too reliant on it for
production runs. They should be allowed to finish to guarantee data integrity.



 MultiLoad and INMODs
INMODs, or Input Modules, may be called by MultiLoad in either mainframe or LAN environments,
providing the appropriate programming languages are used. INMODs are user written routines
whose purpose is to read data from one or more sources and then convey it to a load utility, here
MultiLoad, for loading into Teradata. They allow MultiLoad to focus solely on loading data by doing
data validation or data conversion before the data is ever touched by MultiLoad. INMODs replace
the normal MVS DDNAME or LAN file name with the following statement:
.IMPORT INMOD=<INMOD-name>

You will find a more detailed discussion on how to write INMODs for MultiLoad in the chapter of this
book titled, "INMOD Processing".



 How Multiload Compares with FastLoad




Reprinted for ibmkvskumar@in.ibm.com, IBM                  Coffing Data Warehousing, Coffing Publishing (c) 2005, Copying Prohibited
Teradata Utilities: BTEQ, FastLoad, MultiLoad, TPump, and FastExport, Second Edition                                             33

                Function                                          FastLoad             MultiLoad
Error Tables must be defined                                        Yes    Optional. 2 Error Tables have to
                                                                            exist for each target table and
                                                                           will automatically be assigned.
Work Tables must be defined                                       No        Optional. 1 Work Table has to
                                                                            exist for each target table and
                                                                           will automatically be assigned.
Logtable must be defined                                          No                      Yes
Allows Referential Integrity                                      No                       No
Allows Unique Secondary Indexes                                   No                       No
Allows Non-Unique Secondary Indexes                               No                      Yes
Allows Triggers                                                   No                       No
Loads a maximum of n number of tables                            One                      Five
DML Statements Supported                                       INSERT       INSERT, UPDATE, DELETE,
                                                                                    and "UPSERT"
DDL Statements Supported                                   CREATE and DROP           DROP TABLE
                                                                TABLE
Transfers data in 64K blocks                                     Yes                      Yes
Number of Phases                                                 Two                      Five
Is RESTARTable                                                   Yes          Yes, in all 5 phases (auto
                                                                                    CHECKPOINT)
Stores UPI Violation Rows                                        Yes                      Yes
Allows use of Aggregated, Arithmetic                              No                      Yes
calculations or Conditional Exponentiation
Allows Data Conversion                                       Yes, 1 per column                             Yes
NULLIF function                                                     Yes                                    Yes


Figure 5-21




Reprinted for ibmkvskumar@in.ibm.com, IBM                  Coffing Data Warehousing, Coffing Publishing (c) 2005, Copying Prohibited