~~draft~~ Created: June 23, 1993 Last Modified: July19, 1993 DB2 Programming Standards and Guidelines SQL (Structured Query Language) is very flexible and can formulate a given database request in a number of different ways. DB2 in turn may access the data needed to satisfy any given request in any of several ways. Theoretically the DB2 optimizer should be able to choose the most efficient access path, no matter which equivalent SQL statement is used. DB2‟s optimizer has not yet reached this ideal. In most cases, the optimizer makes the best choice no matter how the request is stated. But in a few situations, the programmers‟ decisions in formulating statements can make a difference in performance. This situation will be eliminated over time as IBM software developers improve the optimizer. For now, programmers can improve the performance of their own programs by following a few guidelines for statement formulations. The following tables will be used to show examples of various SQL statements: SUPPLIER SN S1 S2 S3 S4 S5 (S) TABLE SNAME SMITH JONES BLAKE CLARK ADAMS STATUS 20 10 30 20 30 CITY LONDON PARIS PARIS LONDON ATHENS JOB JN J1 J2 J3 J4 J5 J7 (J) TABLE JNAME SORTER PUNCH READER CONSOLE COLLATOR TAPE CITY PARIS ROME ATHENS ATHENS LONDON LONDON PART PN P1 P2 P3 (P) TABLE PNAME COLOR NUT RED BOLT GREEN SCREW BLUE WEIGHT 12 17 17 CITY LONDON PARIS ROME P4 P5 P6 SCREW CAM COG RED BLUE RED 14 12 19 LONDON PARIS LONDON SUPPLIER/PART/JOB SN PN JN S1 P1 J1 S1 P1 J4 S2 P3 J1 S2 P3 J2 S2 P3 J3 S2 P3 J4 S2 P3 J5 S2 P3 J6 S2 P3 J7 S2 P5 J2 S3 P3 J1 S3 P4 J2 S4 P6 J3 S4 P6 J7 S5 P2 J2 S5 P2 J4 S5 P5 J5 S5 P5 J7 S5 P6 J2 S5 P1 J4 S5 P3 J4 S5 P4 J4 S5 P5 J4 S5 P6 J4 (SPJ) TABLE QTY 200 700 400 200 200 500 600 400 800 100 200 500 300 300 200 100 500 100 200 100 200 800 400 500 EFFICIENT SELECTS The purpose of a SELECT statement is to retrieve information from the DB2 tables. The result of this operation will always be a TABLE (consisting of columns and rows selected aka RESULT SET). Maximum number of columns in a SELECT list is 750 Maximum number of tables in the FROM clause is 15 Maximum number of predicates is 300 - Multiple predicates (WHERE clauses) can be specified using the AND and OR operators General form of the SELECT statement: SELECT FROM WHERE GROUP BY HAVING ORDER BY WHERE When selecting columns from tables, do not select columns that are referenced in the SQL statement‟s WHERE clause. Select only those columns whose values are unknown to the program. Do not select columns whose values are already known to the host language program because this results in unnecessary I/O, as in the following query: EXEC SQL SELECT SN, SNAME FROM S WHERE SN = :HOST-VARIABLE-SN END-EXEC Write the above statement as follows: EXEC SQL SELECT SNAME FROM S WHERE SN = :HOST-VARIABLE-SN END-EXEC A SELECT * will return all columns from a table whether or not they are needed for program processing. This statement will result in excessive I/O overhead. Do not use a SELECT * as in the following example: EXEC SQL SELECT * FROM S END-EXEC Code all desired columns explicitly. If a table layout ever changes, for whatever reason, the program using a SELECT * has to be re-compiled and re-bound. the data you want (column names) some source (table name (s)) conditions which are to be met (if any) group data (column name (s)) into one row a condition each group of data must satisfy rows will be in this order (column name (s)) When coding a SQL SELECT statement in a host language program (COBOL), always code the following statement: EXEC SQL before the SQL statement and code END-EXEC after the SQL statement: EXEC SQL SELECT SNAME FROM S WHERE SN = :HOST-VARIABLE-SN END-EXEC Host variables, as in the above statement, are working storage fields in host-language programs that are used to accept data from SQL statements, provide input to SQL update or insert statements, or are used in selecting data from tables. Always code host variables in the working-storage section of your program with the same data type and length as the columns of the table in the SQL statement(s) that they pertain to. CURSORS Host language programs like COBOL are not prepared to accept more than one row at a time when retrieving columns from a DB2 table. A cursor is used to retrieve multiple rows in succession from a SELECT statement. (Since it is a declarative rather than executable statement, it can be placed in the data or procedure division of a COBOL program, but must physically precede the OPEN.) The declaration takes the form DECLARE SCURSOR CURSOR FOR, in which the cursor‟s name follows the word DECLARE. Cursor processing logic must be used in host language programs when you will be retrieving more than one row which satisfies the WHERE clause of an SQL statement. Here is a complete example declaration: EXEC SQL DECLARE SCURSOR CURSOR FOR SELECT SN, SNAME, STATUS, CITY FROM S WHERE STATUS > :OLD-STATUS AND SN > : OLD-SN END-EXEC The program calls on the cursor for processing through two statements: one for opening the cursor, and the other for fetching its rows. The OPEN statement readies the SELECT for execution by creating a cursor set. DB2 attempts to retrieve one qualifying row at a time to pass to the program by the use of the FETCH statement. The OPEN statement simply uses the key word OPEN followed by the cursor‟s name: EXEC SQL OPEN SCURSOR END-EXEC. FETCH The FETCH statement identifies the cursor to be used and employs the INTO clause to indicate the host variables that are to receive the values from each row: EXEC SQL FETCH SCURSOR INTO :SN, :SNAME, :STATUS, :CITY END-EXEC As with a SELECT INTO statement that is used when a single row is returned, the variables used in a FETCH INTO statement must be of compatible data type and length with the columns that supply their values. Host languages are not prepared to accept more than one row at a time Declare cursor must come before the OPEN A SELECT statement outperforms the cursor select for retrieving a single row OPEN readies the SELECT for execution DB2 will try to fetch rows one at a time as requested If now rows qualify from a cursor - Open results in SQLCODE = 0 - First Fetch results in SQLCODE = 100 Multiple cursors can be opened on one or more tables Always check the SQLCODE after opening and closing a cursor. If the SQLCODE after the execution of an OPEN CURSOR or CLOSE CURSOR is less than zero, abend processing should occur - The following example illustrates a processing loop, which is how cursors are most often used: EXEC SQL DECLARE SCURSOR SELECT SN, SNAME, STATUS, CITY FROM S WHERE STATUS = :NEW-STATUS END-EXEC OPEN-SCURSOR. EXEC SQL OPEN SCURSOR END-EXEC. PERFORM GET-ROW UNTIL SQLCODE = 100 GET-ROW. EXEC SQL FETCH SCURSOR INTO :SN, :SNAME, :STATUS, :CITY END-EXEC. IF SQLCODE = 100 MOVE SAVE-STATUS TO NEW-STATUS EXEC SQL CLOSE SCURSOR END-EXEC. When checking a DB2 table for straight existence checking (no data retrieval is necessary for program processing), it is more efficient to use a select literal SQL statement versus a select count (*), or a statement that actually retrieves data from the DB2 table. For example, if you wanted to verify the existence of a participant social security number, instead of using a standard select statement: EXEC SQL SELECT SN INTO :SN FROM S WHERE SN = :HOST-VARIABLE-SN END-EXEC use the following: EXEC SQL SELECT „SUPPLIER VERIFICATION‟ INTO :SUPPLIER-LITERAL FROM S WHERE SN = :HOST-VARIABLE-SN END-EXEC. SUPPLIER-LITERAL The field SUPPLIER-LITERAL is defined in the working-storage section of the host program in the following manner: 01 SUPPLIER-LITERAL PIC X(21) If it is possible that more than one row could satisfy the where clause in an existence check, interrogate the SQLCODE returned by the SQL call for a value of –811. If the SQLCODE is equal to – 811, it means that there is more than one row that satisfies the where clause of the SQL call. DISTINCT The DISTINCT operator may be used to eliminate duplicates from the result set. Do not use DISTINCT unless absolutely necessary for program processing. DISTINCT requires a sort which may degrade performance. EXEC SQL SELECT DISTINCT CITY FROM PART END-EXEC Returns: London Paris Rome ORDER BY The ORDER BY is used to order the result set of a select statement. For example, the following statement could be used to select supplier numbers and names from the S table and order the result by the city column: EXEC SQL SELECT SN, CITY FROM S ORDER BY CITY END-EXEC RETURNS: S5 S1 ATHENS LONDON S4 S2 S3 LONDON PARIS PARIS Each column in the ORDER BY clause must identify a column of the result table. ORDER BY columns may also be identified by the position of the column in the result table. The ORDER BY will sort the result set in ascend9ing order by default. You may also request that the result be returned in descending order. The following SELECT statement is an example of this and will return the same results as the above query except that the result will be in descending order: EXEC SQL SELECT SN, CITY FROM S ORDER BY CITY DESC END-EXEC RETURNS: S2 S3 S1 S4 S5 GROUP BY Use of the GROUP BY conceptually rearranges the table represented by the FROM clause into partitions, such that within any one group, all rows have the same value for the GROUP BY field. GROUP BY does not imply ORDER BY. But GROUP BY will perform a sort to group the result set. The column(s) specified by the GROUP BY must be in the select clause of the SQL call. The following query presents an example of the use of the GROUP BY: EXEC SQL SELECT SN, SUM (QTY) FROM SPJ GROUP BY SN END-EXEC RETURNS: PARIS PARIS LONDON LONDON ATHENS S1 S2 S3 S4 S5 HAVING 900 3200 700 600 3100 The HAVING clause is used with the GROUP BY to eliminate unwanted groups. The HAVING operates on the result set created by the GROUP BY. The HAVING will not work without a GROUP BY clause: EXEC SQL SELECT SN, SUM (QTY) FROM SPJ GROUP BY SN HAVING COUNT (*) > 2 END-EXEC RETURNS: S2 S3 S4 S5 LIKE The LIKE operator can be used as a “wildcard” SELECT to retrieve rows based on only a fraction of the actual column value. With a LIKE you can retrieve rows when you only know a portion of the column value. The following SQL select will retrieve rows where the first character in the CITY column is equal to “L”. The percent sign is used to denote a wildcard, meaning that any values that come after the “L” in the SELECT are valid and satisfy the WHERE clause. EXEC SQL SELECT SN, CITY FROM S WHERE CITY LIKE „L%‟ END-EXEC RETURNS: S1 S4 LONDON LONDON 3200 700 600 3100 LIKE Operator DB2 will not use a matching index scan to search for a character string if the leading character is not specified in the LIKE clause. If the leading character or characters are specified, it may use a matching scan to locate values with those beginnings in the index and a nonmatching scan to complete the search for strings with those beginnings and any other characters specified in the clause. UNION The UNION statement is used to bring together rows from multiple selects: Set of all elements belonging to either or both of the original result sets. Must have the same number of columns selected Selected column data types must be compatible Redundant duplicate rows are always eliminated from the results unless UNION ALL is specified. Any number of SELECT statements can be UNIONed together ORDER BY clause, if used must appear in the final SELECT statement only. References columns by number, not name. The number refers to the order in which the columns were selected. - Query involving UNION: EXEC SQL SELECT PN FROM PART WHERE COLOR = „RED‟ UNION SELECT PN FROM SPJ WHERE QTY > 100 ORDER BY 1 END-EXEC RETURNS: P1 P2 P3 P4 P5 P6 Do Not Use ORDER BY, GROUP BY, DISTINCT, or UNION unless absolutely necessary for program processing. These predicates cause DB2 sorting which may degrade performance. AGGREGATES DB2 supports the use of various aggregate functions in SELECT statements: AVG average of the values in a column SUM sum of the values in a column MIN smallest value in a column MAX largest value in a column COUNT number of values in a column For COUNT, DISTINCT must be specified for anything but COUNT (*) Here is an example of a select statement to count the number of unique cities that supply parts: EXEC SQL SELECT COUNT (DISTINCT CITY) FROM PART END-EXEC RETURNS: 3 DISTINCT is not applicable with MAX, MIN, AVG, SUM AGGREGATE function in the SELECT clause: EXEC SQL SELECT COUNT (*) FROM S END-EXEC RETURNS: 5 EXEC SQL SELECT MIN (QTY), AVG (QTY), MAX (QTY) FROM SPJ WHERE PN = „P1‟ END-EXEC RETURNS: MIN (QTY) 100 AVG (QTY) 333 MAX (QTY) 700 AGGREGATE function in the SELECT clause with DISTINCT: EXEC SQL SELECT COUNT (DISTINCT SN) FROM SPJ END-EXEC RETURNS: 5 If the result of an aggregate is an empty set (no rows satisfy the WHERE clause), COUNT returns a value of zero with an SQLCODE of zero. The other aggregates will field(s). To avoid this, code a null indicator variable in the SQL select statement: EXEC SQL SELECT MIN (QTY) INTO :RESULT-FIELD NULL-INDICATOR FROM SPJ WHERE PN = „P7‟ END-EXEC This will cause a SQLCODE of 100 to be returned, and the NULL-INDICATOR value will be set to a value of less than zero (usually – 1), if no rows satisfy the WHERE clause. If SQLCODE is equal to zero that means that a row, or multiple rows were found that satisfy the where clause and the result field has a valid value in it. If a null has been returned, you should initialize your host variable that was to receive the result of the aggregate to remove the “garbage data” that may be in it. The format of the SQL statement and the SQL and NULL-INDICATOR checking that may follow should be coded like this: EXEC SQL SELECT MIN (QTY) INTO :RESULT-FIELD, NULL-INDICATOR FROM SPJ WHERE PN = „P7‟ END-EXEC IF NULL-INDICATOR < 0 MOVE ZEROS TO RESULT-FIELD. IF SQLCODE = 0 DO WHATEVER ELSE IF SQLCODE = 100 DO WHATEVER ELSE IF SQLCODE NOT = 0 AND 100 GO TO ABEND ROUTINE. EFFICIENT JOINS Most joins have equivalent correlated subselects. A correlated subselect is one in which the outer select identifies a table from which SQL draws values for use in the subselect. Joins are preferred over correlated subselects. A join‟s where clause can reduce the number of rows participating in the join. When processing a correlated subselect, on the other hand, DB2 must evaluate all rows from one table against all rows in the other. In some cases DB2 cannot use an index in satisfying correlated subselects where it would be able to use the index in carrying out the equivalent join if the join predicates are indexed columns. EXEC SQL SELECT JN, PNAME, WEIGHT, QTY FROM P, SPJ WHERE P.PN = SPJ.PN AND P.PN = „P3‟ END-EXEC Logically, it makes no difference whether the statement specifies that the P3 rows be selected from the SPJ or P table. The results will be the same. From a performance standpoint, however, it is usually best to narrow the search on the larger table. Consequently, if the user or programmer knows which table in the join is larger, he or she should use it in the statements to narrow the search. An alternative would be to specify both tables in the WHERE clause – for example: EXEC SQL SELECT JN, PN, PNAME, WEIGHT, QTY FROM P, SPJ WHERE P.PN = SPJ.PN AND P.PN = „P3‟ END-EXEC Guidelines for efficient joins: Narrow the search to only those rows required for the join. Narrow the search on both tables to be joined to five the optimizer more information on which to base its decisions. Do not use correlated subselects with IN or NOT IN operators. A correlated subselect with the EXISTS or NOT EXISTS operator should be used instead. The EXISTS and NOT EXISTS operators will usually perform as well as a JOIN. In most cases, DB2 will not use an index when searching a table specified in an IN or NOT IN subselect. - - The following example explains why correlated subselects with IN or NOT IN operators should not be used in host programs: Consider a search for the names of suppliers of part P5. One way to accomplish that search would be with a statement that causes DB2 to select P5 rows from the SPJtable and join that set on supplier number with the S table. The following statement will do that job: EXEC SQL SELECT SNAME FROM S, SPJ WHERE S.SN = SPJ.SN AND S.SN = „P5‟ END-EXEC Since only four P5 rows will be joined with the S table, the procedure requires at most, reads of only four rows in the S table. Another way to carry out this search would be to perform a correlated subselect. The equivalent statement would be: EXEC SQL SELECT SNAME FROM S WHERE „P5‟ IN (SELECT PN FROM SPJ WHERE SPJ.SN = S.SN) END-EXEC In this method DB2 first satisfies the inner select (in parentheses) by evaluating each row of the S table in turn against all the rows in the SPJ table, searching for a match on supplier number. The outer SELECT returns the supplier name any time the returned set includes P5. The entire SPJ table must be read once for each row in the S table, although searching the SPJ table just once for p5 rows is clearly more efficient. EXISTS and NOT EXISTS The EXISTS and NOT EXISTS operators return a true or false value to the outer select depending on the evaluation of the subselect with which they are used. For EXISTS, if there is a value satisfying the subselect, the operator returns true. For NOT EXISTS, if there is no value satisfying the subselect, the operator returns true. The outer select includes a row whenever the subselect returns a true evaluation. Any query that can be expressed using IN, can be formed using EXISTS. However, the converse is not true. A select to determine the cities that have both suppliers and part stocks represents the intersection of the two sets of city values from the S and P tables. This statement is used to determine that intersection: EXEC SQL SELECT DISTINCT CITY FROM S WHERE EXISTS (SELECT CITY FROM P WHERE P.CITY = S.CITY) END-EXEC RETURNS: CITY: LONDON PARIS Here is an example of the above query written as a correlated subselect without exists: EXEC SQL SELECT DISTINCT CITY FROM S WHERE CITY IN (SELECT CITY FROM P WHERE P.CITY = S.CITY) END-EXEC. RETURNS: CITY: LONDON PARIS DB2 performs the search by passing the first city from S to the subselect, which finds a match for LONDON and therefore returns a true evaluation. The outer select therefore includes the row, returning LONDON. Then it continues the process for each subsequent row. Similarly a search to determine whether there are any cities with parts but no suppliers, found by subtracting the cities in S from those in P, would represent a defference. The statement to determine this difference is: EXEC SQL SELECT DISTINCT CITY FROM P WHERE NOT EXISTS (SELECT CITY FROM S WHERE CITY = P.CITY) END-EXEC. RETURNS: CITY: ROME Let‟s look at another statement that can be written as join, or with an EXISTS operator: EXEC SQL SELECT SN, PN, JN, QTY FROM SPJ SPJX WHERE JN = „J2‟ AND EXISTS (SELECT S.SN FROM S WHERE S.SN = SPJX.SN AND CITY = „PARIS‟) END-EXEC Here is an example of the previous query written as a join: EXEC SQL SELECT SN, PN, JN, QTY FROM S, SPJ WHERE S.SN = SPJ.SN AND CITY = „PARIS‟) AND JN = „J2‟) END-EXEC UPDATES The purpose of the UPDATE statement is to update the values of one or more columns in one or more rows of a table. The rows to be updated are chosen by the “WHERE CLAUSE”. The basic format of the UPDATE statement is: UPDATE SET WHERE the table (table name) column name 1 = expression 1 column name 2 = expression 2 conditions to be met (if any) If no WHERE CLAUSE is specified, all rows in the table will be updated Because updates modify the data and associated indexes during the update process, use of indexes to find the rows to be changed presents several problems. DB2 will not use certain indexes for certain types of updates and in some cases will fail to take advantage of usable indexes if the UPDATE statement is improperly worded. DB2 will not use an index to search a column being updated. For example, to change supplier S5‟s supplier number to S6, we could use this statement: EXEC SQL UPDATE S SET SN = „S6‟ WHERE SN = „S5‟ END-EXEC This statement would not take advantage of an index on SN. Using SNAME to identify the rows for updating would require this equivalent statement: EXEC SQL UPDATE S SET SN = „S6‟ WHERE SNAME = „ADAMS‟ END-EXEC This statement would take advantage of the index on SNAME. In some cases, however, updates or deletions may be done in conjunction with other processing that would require a cursor. If the rows returned with a cursor are to be updated or deleted, the cursor definition may indicate this. A FOR UPDATE OF clause alerts DB2 that the update is coming. This warning prevents DB2 from using an index on a column to be updated. If it were to use an index on a column being updated, some updates would not be applied after the initial update since the index itself would be changed by the update. For example, if an index on the STATUS column of the s table were used to retrieve rows, those rows would be returned in STATUS order. An update of status from 10 to 50 would cause the cursor position to skip to a STATUS of 50, missing 20, 30, 40. If equal predicates are used on the updated column and on all preceding columns, an index can be used to locate the row to be updated. If the program to reassign parts over a certain weight from one city to another employed a cursor, its declaration would look something like this: EXEC SQL DECLARE PCURSOR FOR SELECT PN, WEIGHT FROM P WHERE WEIGHT > :WEIGHT-LIMIT FOR UPDATE OF CITY END-EXEC. Notice that the column being ukpdated need not appear in the SELECT clause, although it often does. When a cursor is used for updates, the UPDATE statement must use a WHERE CURRENT OF CURSOR clause to indicate the rows to be processed for example: EXEC SQL UPDATE P SET CITY = :NEW-CITY WHERE CURRENT OF PCURSOR END-EXEC. The WHERE CURRENT OF clause causes the row at which the cursor points to be updated. The update does not advance the cursor. A FETCH . . . INTO statement is necessary for that. The FOR UPDATE OF cannot be specified if the SELECT has: ORDER BY, GROUP BY, UNION, DISTINCT AVG, SUM, COUNT, MIN, MAX, SUM JOIN A correlated subselect All of the above functions create temporary internal tables, that cannot be updated since the base tables may be changed independently, creating inconsistencies in the data. DB2 automatically bars the use of an index on any column referenced in the FOR UPDATE OF clause of cursor declaration. In performing the statement‟s search, regardless of whether the reference columns are actually updated. Reference only those columns that are likely to be updated. Updating an entire set of rows with an update statement is more efficient than using the SELECT and cursor operation to update a row at a time. USING INDEXES WITH OR CLAUSES: The optimizer may use indexes in searches that include „OR‟ CLAUSES. There are exceptions, however. The optimizer will not use an index if the search criteria depends on a comparison other than equality, as with the following statement: EXEC SQL SELECT SN, SNAME FROM S WHERE SN < „S2‟ OR SN > „S4‟ END-EXEC DB2 will not use an index on supplier number, but an equivalent statement using UNION ALL would: EXEC SQL SELECT SN, SNAME FROM S WHERE SN < „S2‟ UNION ALL SELECT SN, SNAME FROM S WHERE SN > „S4‟ END-EXEC NEGATIVE LOGIC CLAUSES: DB2 will usually not use indexes in performing searches with predicates that include a negation. The not-greater-than or not-less-than conditions are different. The optimizer can reverse the negative and use an index on the positive. The optimizer can evaluate the clause WHERE NOT STATUS > 20 as WHERE STATUS <= 20 AND USE AN INDEX ON status IN THE SEARCH. The optimizer will not use an index on a search based on the NOT BETWEEN operator. BETWEEN If the optimizer knows that a search covers a range it will likely choose a nonmatching index scan. Between is more efficient than using IN. The following three queries will return the same results, but the query using BETWEEN is more efficient than using IN, or a combination of >=, ,<= EXEC SQL SELECT SN, PN, JN FROM SPJ WHERE QTY BETWEEN 100 and 300 END-EXEC EXEC SQL SELECT SN, PN, JN FROM SPJ WHERE QTY IN(100,200,300) END-EXEC EXEC SQL SELECT SN, PN, JN FROM SPJ WHERE QTY >=100 and <= 300) END-EXEC RETURNS: S1 S2 S2 P1 P3 P3 J1 J2 J3 S2 S3 S4 S4 S5 S5 S5 S5 S5 S5 P5 P3 P6 P6 P2 P2 P5 P6 P1 P3 J2 J1 J3 J7 J2 J4 J7 J2 J4 J4 PROGRAMMING FOR PROGRAMS THAT ARE RE-STARTABLE When coding a DB2 program with restart capabilities a DB2 table is often used as a “driver” to determine processing sequence and an ORDER BY is often used for re-start positioning. When using a DB2 table as a driver, order the retrieved rows by the clustering index. For example, if you are using participant social security numbers from TBL_EE_PART as a processing driver you would want to select the column PART_SEARCH_NAM from TBL_EE_PART. This column contains the participants name and social security number. For an example of how to use this column as a processing driver, how to extract the social security number from it, and how to use this column for re-starting purposes look at program TR99ET06 in dataset: ADM.TR16101.TESTLIB. LOCKING When coding a batch program there are ways to improve efficiency by explicitly specifying the way your program will lock tables that you are accessing. Locking refers to the way that DB2 secures tables is only SELECTing columns from a DB2 table, you should code a LOCK TABLE statement in the initialization routine of your program as follows: EXEC SQL LOCK TABLE S IN SHARE MODE END-EXEC. Locking a table in SHARE mode will allow other plans to read rows from the table, but will not allow any updates to it. If your program is going to be updating or inserting rows on a DB2 table, you should LOCK the table in exclusive mode so that no other plans may access it. You code this LOCK table statement in the following way: EXEC SQL LOCK TABLE SPJ IN EXCLUSIVE MODE END-EXEC. Locking a table in exclusive mode will prevent any other plan from accessing this table, even SELECT access. SUMMARY OF SEARCHES THAT CAN USE AN INDEX: DB2 may use indexes to satisfy searches called for by: - Where clauses using the comparators =, <, <=, >, >=, ^> and ^<. - For example, WHERE SN = 10. Subselects using those comparators. Predicates using AND to limit the search criteria. Predicates using OR to expand the search criteria if the two clauses refer to the same column and both search on an equality. Predicates using IN or BETWEEN search criteria. Predicates seeking null values, for example WHERE SN IS NULL. Predicates seeking LIKE character values. If the search value begins with a blank or wild card - WHERE LOCATION LIKE „%BURG,‟ for example – DB2 can use a nonmatching index scan. If the search value begins with an alphanumeric value, it can use a matching index scan. Predicates that search on a value determined by a „NOT‟ operator negation of a less-than or greater-than comparator – for example, NOT SN > 50. A WHERE clause limiting a search on the first column or columns of a composite index, in which case a matching scan can be used; if the columns used in the predicate do not include the first column of the composite, a nonmatching scan may be used. Predicates requiring ORDER BY or GROUP BY operations on clustered columns. EXISTS and NOT EXISTS operators can use a matching index scan. The aggregates AVG, SUM, MIN, MAX, and COUNT – can use a matching or nonmatching index scan. - - - - - SUMMARY OF SEARCHES THAT CANNOT USE AN INDEX: A predicate using the „OR‟ operator to expand search criteria to more than one column or a predicate using the „OR‟ operator to include in a search criteria determined by the < and > operators. A correlated subselect. A subselect comparison based on the „IN‟, „ALL‟, „ANY‟ operators, though an index can be used to satisfy the subselect itself. An UPDATE statement that specifies a column specified in the WHERE condition. A join of a table to itself when aliases are not used. Predicates that determine search criteria by negating an equality WHERE NOT SN = 10, for example. A comparison using an arithmetic expression – WHERE QTY = :QTY/2, for example. - - - - INDEX USAGE IN DB2 Indexed Column is . . . . DB2 Considers Index Sort Used with: IN YES LIKE 1 BETWEEN YES >, >=, <, <=, ^<, ^> YES ^= NO NOT 2 OR 3 ORDER BY YES YES GROUP BY YES YES DISTINCT YES YES Compared to: LONGER LETERAL NO MATH. EXPRESSION DIFF. DATA TYPE USED FOR: JOINS YES YES 1 – Yes, except when comparison string is host variable or begins with ‘%’ or ‘_’ 2 – Apply NOT, then evaluate. 3 – No, except when convertible to an ‘IN’ This document has shown the format of basic SQL statements and how they may have an impact on DB2 performance, and pointed out requests that will and will not allow DB2 to use indexes. It also showed how to formulate requests so that DB2 will be most likely to determine the most efficient access paths. I have also included some guidelines for DB2 programs that are re-startable. If you have any questions or comments, please see me.