"IT Presentation Template"
Understanding Oracle Execution Plans for Oracle Developers January 24, 2007 1 Oracle Execution Plans Agenda Introduction The Oracle Optimizer Database Statistics Data Access Paths Table Join Methods Execution Plans Optimizer Hints Q&A 2 Introduction Goals Understand how Oracle processes queries Learn the various methods used by the Oracle optimizer and when they are appropriate Understand execution plans Audience: Developers and analysts who write SQL 3 Introduction How do we write high performance SQL? If you want to make your query go faster, you need to know what it is doing! 4 The Oracle Optimizer A SQL statement can be executed in many different ways: Data Access Paths Full Table Scans Index Scans Join Methods Nested Loops Joins Hash Joins Sort-Merge Joins 5 The Oracle Optimizer The Optimizer determines the most efficient way of producing the result of the query using: Internal rules (RBO) Costing methods and statistics (CBO) Optimization goals: Throughput (ALL_ROWS) Response time (FIRST_ROWS_n) 6 The Oracle Optimizer Optimizer decisions: Can the statement be rewritten to utilize more options? What is the optimization goal? How should the data be retrieved? Which table should be accessed first? How should the tables be joined? Oracle makes these decisions based on the COST 7 The Oracle Optimizer What is the cost? The cost is a relative measure of the query’s expected resource utilization Cost = expected execution time / single block read time The cost is determined by: System statistics (CPU speed, I/O rates) Selectivity / Cardinality (based on table / index statistics) Internal defaults / assumptions 8 Statistics Statistics: Data distribution and storage characteristics of tables, columns, indexes, and partitions The optimizer uses these statistics to: Calculate predicate selectivity and cardinality Estimate amount of I/O (blocks) required to get the data Estimate the cost of each operation The statistics are gathered and stored in the data dictionary. 9 Statistics Selectivity: The fraction of rows in a table that the SQL statement's predicate chooses The optimizer uses the selectivity of a predicate to estimate the cost of a particular access method and to determine the optimal join order and join method. 10 Statistics Statistics generated include: Table statistics Number of rows Number of blocks Average row length Column statistics Number of distinct values in column Number of nulls in column Optionally: data distribution (histograms) Index statistics Number of leaf blocks Levels Clustering factor System statistics I/O performance and utilization CPU performance and utilization 11 The Oracle Optimizer 12 Data Access Paths Data Access Paths: Rowid Lookup Fastest way to access a single record but not very flexible Full Table (or Partition) Scan Index Scan 13 Full Table Scans In a full table scan, Oracle: Reads all rows from the table (or partition) Examines each row to determine whether it satisfies the WHERE clause A full table scan is used when: The table has no applicable index The table is very small The query has low (poor) selectivity (i.e. more than 10% of the data will be returned) 14 Full Table Scans Pro Uses multiblock reads which can mean less total I/O Cons Large tables will be very slow Higher CPU use 15 Index Scans Index Scans: Retrieve data from an index based on the value of one or more columns in the index Each index entry contains a rowid which points to a row in the table If the query accesses only columns in the index, Oracle may be able to avoid a table visit 16 Index Scans 17 Index Overhead Indexes have a performance overhead: Using an index to access a table means at least 3 (and maybe more) blocks must be read for each row The index must be maintained by all DML (inerts, updates, and deletes) against the underlying table 18 Index Scans Several types of index scans: Index Unique Scan Index Range Scan Index Range Scan Descending Index Skip Scan Full Scan Fast Full Scan 19 Index Scans When are the various index scans used? Unique Scan All columns of a unique index are specified with equality conditions in the WHERE clause Fastest index scan Range Scan Leading column of the index is specified in conditions in the WHERE clause Can be used to sort results when leading index columns are specified in ORDER BY clause Range Scan Descending Same as Range Scan but used when index columns are specified in ORDER BY DESC clause 20 Index Scans When are the various index scans used? Skip Scan Non-leading columns of the index are specified in conditions and a FTS is not desirable Not very common Full Scan Can avoid a table visit if all columns in the query are in the index and at least one is not null Can be used to eliminate a sort operation Fast Full Scan Same as Full Scan but read using multi-block reads (very fast) Cannot eliminate a sort 21 Table Join Methods Join Methods: Nested Loops Join Hash Join Sort-Merge Join How does the optimizer choose a join method? It estimates the COST of each join method and chooses the method with the least cost. 22 Nested Loops Join Nested Loops Join The optimizer determines the driving table (or row source) and designates it as the outer table The other table is designated as the inner table For every row in the outer table, Oracle accesses all the rows in the inner table (preferably by an index) The outer loop is for every row in outer table and the inner loop is for every row in the inner table The outer loop appears before (above) the inner loop in the execution plan: 23 Nested Loops Join A nested loop join is used when: Joining a relatively small number of rows A good driving condition exists between the two tables The order of tables in the execution plan is important in a nested loop join Nested loop joins are very efficient for queries with restrictive conditions They are very inefficient for relatively large row sources 24 Hash Join Hash Join Used for joining large data sets The optimizer uses the smaller of two tables or data sources to build a hash table on the join key in memory It then scans the larger table, probing the hash table to find the joined rows A hash join is used when: Tables are joined using an equijoin Joining either a large amount of data or a large proportion of the table 25 Hash Join Hash joins can be very fast if the smaller table can fit entirely into available memory (otherwise parts will be swapped back and forth from disk) Typically used with Full Table Scans Usually slower than nested loops for small rowsets Can be a good idea when optimizing for throughput (ALL_ROWS) Usually a bad idea when optimizing for response time (FIRST_ROWS_n) 26 Sort-Merge Join Sort-Merge Join Consists of two operations: Sort: Both of the input rowsets are sorted on the join key Merge: The sorted lists are merged together A sort-merge join is used when: Joining data from two independent sources Joining either a large amount of data or a large proportion of the table with a non-equijoin (i.e. a.col1 <= b.col1) 27 Sort-Merge Join Sort-merge joins are usually slower than hash joins due to the sorting requirement. It can be faster if a sort operation does not have to be done (i.e. data is already sorted) Faster when the entire sort can fit in memory Commonly used with Full Table Scans Can be a good idea when optimizing for throughput (ALL_ROWS) Usually a bad idea when optimizing for response time (FIRST_ROWS_n) 28 Join Considerations Nested Loop Join Inefficient when a join returns a large number of rows Works very well with index lookups and high selectivity Low initial (startup) cost, does not scale well More common in OLTP applications Hash Join Usually the most efficient join for large number of rows High initial (startup) cost, scales very well More common in batch / reporting applications Sort-Merge Only choice for non-equijoin of large tables Usually slower than Hash join but can be efficient if sorting must be done anyway (or is already done) 29 Join Order Join order What is the maximum number of tables Oracle can join at once? Two! To join more than two tables, Oracle: Joins two of the tables Joins the resulting row source to the next table Repeats until all tables are joined into the result 30 Join Order The join order is one of the most critical descisions the optimizer makes It is very important for performance! Why? Consider the records clerk Principle: We want to reduce the result set as much as possible as early as possible So almost always, we want to start with the table with the most restrictive condition in the WHERE clause Less work = faster queries How does Oracle determine which table has the most restrictive condition? Statistics. 31 Execution Plans Execution plan Visual, step by step representation of how Oracle is processing your query Shows the Access Paths, Join Methods, Join Order, Cost of each operation, and other information. Read from inside out, top to bottom 32 Execution Plans select * from dept where rowid = ':x'; Execution Plan ---------------------------------------------------------- 0 SELECT STATEMENT Optimizer=CHOOSE (Cost=1 Card=1 Bytes=18) 1 0 TABLE ACCESS (BY USER ROWID) OF 'DEPT' (Cost=1 Card=1 Bytes=18) select empno, ename from emp where empno=10; Execution Plan ---------------------------------------------------------- 0 SELECT STATEMENT Optimizer=CHOOSE (Cost=1 Card=1 Bytes=19) 1 0 TABLE ACCESS (BY INDEX ROWID) OF 'EMP' (Cost=1 Card=1 Bytes=19) 2 1 INDEX (RANGE SCAN) OF 'EMP_PK' (UNIQUE) (Cost=1 Card=1) 33 Execution Plans select * from brokerdb.client; Execution Plan ---------------------------------------------------------- 0 SELECT STATEMENT Optimizer=CHOOSE (Cost=64611 Card=10781103 Bytes=4431033333) 1 0 PARTITION LIST (ALL) 2 1 TABLE ACCESS (FULL) OF 'CLIENT' (Cost=64611 Card=10781103 Bytes=4431033333) select * from brokerdb.client where cl_br_branch_num = 18 and cl_acct_num = '12345'; Execution Plan ---------------------------------------------------------- 0 SELECT STATEMENT Optimizer=CHOOSE (Cost=3 Card=1 Bytes=411) 1 0 TABLE ACCESS (BY LOCAL INDEX ROWID) OF 'CLIENT' (Cost=3 Card=1 Bytes=411) 2 1 INDEX (UNIQUE SCAN) OF 'CLIENT_PK_IDX' (UNIQUE) (Cost=2 Card=1) 34 Execution Plans select ename,dname from emp, dept where emp.deptno=dept.deptno and dept.dname in ('ACCOUNTING','RESEARCH','SALES','OPERATIONS'); Execution Plan ---------------------------------------------------------- 0 SELECT STATEMENT Optimizer=CHOOSE (Cost=3 Card=8 Bytes=248) 1 0 HASH JOIN (Cost=3 Card=8 Bytes=248) 2 1 TABLE ACCESS (FULL) OF 'DEPT' (Cost=1 Card=3 Bytes=36) 3 1 TABLE ACCESS (FULL) OF 'EMP' (Cost=1 Card=16 Bytes=304) 35 Execution Plans Generating Execution Plans SQL*Plus: Set autotrace traceonly explain; Execute the query – it will be explained SQL*Plus: EXPLAIN PLAN [SET STATEMENT_ID = 'text'] FOR statement; select * from table( dbms_xplan.display( plan_table,statement_id )); OR just: select * from table( dbms_xplan.display ); TOAD AQT 36 Optimizer Hints Optimizer Hints Hints bias Oracle to executing a query the way you want it to. There are hints for optimizer goals, access paths, join methods, and join order. ONLY USE HINTS WHEN: You know the data very well You know Oracle is not choosing the best approach You know the correct approach You have tested thoroughly 37 Closing Tips: Know your data Know your query logic – what are you trying to accomplish Try to determine the optimal execution plan Use hints with caution Tip your DBA 38 Q&A QUESTIONS? 39 Future Topics You call the shots! Contact me with suggestions for future presentations Potential Future Topics Using TOAD Advanced SQL tuning Database modeling for performance Small group Interactive SQL tuning Database Fundamentals New 10G features Advanced analytic functions Database tools 40