
					STATISTICS ON PARTITIONED OBJECTS
ABSTRACT

Optimal SQL execution plans are rooted in the quality of the statistics that describe the underlying objects and that are used by
Oracle's Cost Based Optimizer. Statistics maintenance is a particular challenge on Data Warehouse systems that hold large data
volumes and support complex and flexible user reporting requirements. Such systems are also likely to use partitioned tables and
indexes, which complicate the picture further.

This article is based on the problems encountered and the solutions used during the implementation of several large DW systems
running on Oracle 10.2, with different workload profiles and requirements, and will discuss:

CORE CHALLENGES AND STATISTICS FUNDAMENTALS
        The quality/performance trade off
        Statistic levels on partitioned objects and how they are used
        Key DBMS_STATS parameters and defaults

GLOBAL STATISTICS
        Global and Approximate Global statistics
        Statistics aggregation scenarios and pitfalls

ALTERNATIVE STRATEGIES
        Setting statistics
        Copying statistics
        Dynamic Sampling

11G IMPROVEMENTS
        Incremental Statistics


INTRODUCTION

The Oracle Cost-Based Optimiser (CBO) evaluates many possible execution plans using a combination of rules and statistics to
determine the most efficient execution plan for each SQL statement. The rules are embedded in the RDBMS code and largely
beyond your control, but the statistics are specific to your system and data and there are various options available to you when
collecting or manipulating those statistics. As the optimiser relies heavily on database object statistics to calculate the estimated
row source cardinalities that are one of the primary influences on optimal execution plans, it follows that paying close attention
to your object statistics will pay long-term dividends in system performance.

Even when dealing with the most basic example of collecting statistics on non-partitioned tables and indexes, the options are
sufficiently complex to make statistics management challenging. To give a few examples (none of which this paper will discuss):



          How often should we update statistics?
          What percentage of the data should Oracle sample?
          Should we gather Histograms on columns?

Over recent versions, Oracle has worked towards providing automatic statistics collection routines and sensible default
parameter values that try to make these decisions for you. However, even in cases where the defaults work well for
your system, it's useful to know how they work because you are likely to come across other systems where the defaults aren't
the correct solution. The fact that the options change regularly from release to release indicates that statistics management is a
problem that is far from solved.

Data Warehouses (DW) and other Very Large Databases (VLDB) would be practically impossible to manage without the
Partitioning feature introduced in Oracle 8, because of the high cost of accessing large tables and indexes in their entirety rather
than just a subset of partitions representing a fraction of the overall object. However, once you start to use Partitioned Tables and
Indexes you will encounter additional levels of complexity, which is what this paper will focus on. Consider a simple partitioned table
and index -

Listing 1 – Create Partitioned Table and Index and INSERT initial rows (INIT.SQL)

           CREATE TABLE TEST_TAB1
           (
             REPORTING_DATE            NUMBER              NOT NULL,
             SOURCE_SYSTEM             VARCHAR2(30 CHAR)   NOT NULL,
             SEQ_ID                    NUMBER              NOT NULL,
             STATUS                    VARCHAR2(1 CHAR)    NOT NULL
           )
           PARTITION BY RANGE (REPORTING_DATE)
           SUBPARTITION BY LIST (SOURCE_SYSTEM)
           SUBPARTITION TEMPLATE
             (SUBPARTITION MOSCOW VALUES ('MOSCOW') TABLESPACE TEST_DATA,
              SUBPARTITION LONDON VALUES ('LONDON') TABLESPACE TEST_DATA,
              SUBPARTITION SYDNEY VALUES ('SYDNEY') TABLESPACE TEST_DATA,
              SUBPARTITION OTHERS VALUES (DEFAULT)    TABLESPACE TEST_DATA)
           (
             PARTITION P_20110131 VALUES LESS THAN (20110201) NOLOGGING NOCOMPRESS,
             PARTITION P_20110201 VALUES LESS THAN (20110202) NOLOGGING NOCOMPRESS,
             PARTITION P_20110202 VALUES LESS THAN (20110203) NOLOGGING NOCOMPRESS,
             PARTITION P_20110203 VALUES LESS THAN (20110204) NOLOGGING NOCOMPRESS,
             PARTITION P_20110204 VALUES LESS THAN (20110205) NOLOGGING NOCOMPRESS,
             PARTITION P_20110205 VALUES LESS THAN (20110206) NOLOGGING NOCOMPRESS,
             PARTITION P_20110206 VALUES LESS THAN (20110207) NOLOGGING NOCOMPRESS,
             PARTITION P_20110207 VALUES LESS THAN (20110208) NOLOGGING NOCOMPRESS
           );

           CREATE UNIQUE INDEX TEST_TAB1_IX1 ON TEST_TAB1
           (REPORTING_DATE, SOURCE_SYSTEM, SEQ_ID)
             LOCAL NOPARALLEL COMPRESS 1;

           INSERT INTO TEST_TAB1 VALUES (20110201, 'MOSCOW', 1000, 'P');
           INSERT INTO TEST_TAB1 VALUES (20110202, 'MOSCOW', 30000, 'P');
           INSERT INTO TEST_TAB1 VALUES (20110203, 'MOSCOW', 2000, 'P');
           INSERT INTO TEST_TAB1 VALUES (20110204, 'MOSCOW', 1000, 'P');
           INSERT INTO TEST_TAB1 VALUES (20110205, 'MOSCOW', 2400, 'P');
           INSERT INTO TEST_TAB1 VALUES (20110201, 'LONDON', 500, 'P');
           INSERT INTO TEST_TAB1 VALUES (20110201, 'SYDNEY', 700, 'P');
           INSERT INTO TEST_TAB1 VALUES (20110202, 'SYDNEY', 1200, 'P');
           INSERT INTO TEST_TAB1 VALUES (20110201, 'DALLAS', 400, 'P');
           INSERT INTO TEST_TAB1 VALUES (20110206, 'DALLAS', 600, 'P');
           INSERT INTO TEST_TAB1 VALUES (20110204, 'DALLAS', 700, 'P');

           COMMIT;

There is a range partition per REPORTING_DATE which is list sub-partitioned depending on the SOURCE_SYSTEM that sent the
data, with an OTHERS sub-partition for SOURCE_SYSTEMs other than MOSCOW, LONDON and SYDNEY. There are no statistics
on this table or index so, as a first step, I'll gather statistics using the default values for all parameters other than the schema
owner and table name.

Listing 2 –Gather Statistics with Default parameter values (DEFAULT.SQL) 1

          SQL> exec dbms_stats.gather_table_stats('TESTUSER', 'TEST_TAB1');

          PL/SQL procedure successfully completed.

Those familiar with gathering statistics will expect to see some of the updated statistics in views such as USER_TABLES and
USER_INDEXES.

Listing 3 –View basic table and index statistics (DEFAULT.SQL)

          SQL> select table_name, global_stats, last_analyzed, num_rows
            2 from user_tables
            3 where table_name='TEST_TAB1';

          TABLE_NAME     GLO LAST_ANALYZED          NUM_ROWS
          -------------- --- -------------------- ----------
          TEST_TAB1      YES 05-FEB-2011 12:38:11         11

          SQL> select index_name, last_analyzed, distinct_keys, clustering_factor, global_stats
            2 from user_indexes
            3 where table_name='TEST_TAB1';

          INDEX_NAME                     LAST_ANALYZED        DISTINCT_KEYS CLUSTERING_FACTOR GLO
          ------------------------------ -------------------- ------------- ----------------- ---
          TEST_TAB1_IX1                  05-FEB-2011 12:38:11            11                11 YES

As you can see, the statistics were gathered on the 5th February and the table-level NUM_ROWS=11, which accurately reflects the
INSERT statements in Listing 1. However, you should also note the GLOBAL_STATS column, which contains 'YES' for both the
table and index. That is the first hint of the additional layer of complexity introduced with partitioned objects, because the
statistics we are looking at are no longer just Table or Index statistics but Global Statistics. We will also find that additional
statistics were gathered at other levels of the objects, because our 'table' consists of the table itself, the partitions and the
sub-partitions. For example, by looking in USER_TAB_PARTITIONS, we'll see some of those additional statistics.




1 At this stage all of the examples are executed against the default general purpose database created by DBCA as part of the
Oracle 10.2.0.4 installation process, with no additional one-off patches applied. This means that you can download Oracle
10.2.0.4 from http://www.oracle.com/technetwork/database/database10g/downloads/index.html and run the scripts in
Appendix B yourself to see the full results. In this paper, I will limit the output displayed to that required to illustrate my points.




Listing 4 –View partition statistics (DEFAULT.SQL)

           SQL>   select table_name, partition_name, global_stats, last_analyzed, num_rows
             2    from user_tab_partitions
             3    where table_name='TEST_TAB1'
             4    order by partition_name;

           TABLE_NAME            PARTITION_NAME      GLO   LAST_ANALYZED          NUM_ROWS
           --------------        ----------------    ---   -------------------- ----------
           TEST_TAB1             P_20110131          YES   05-FEB-2011 12:38:11          0
           TEST_TAB1             P_20110201          YES   05-FEB-2011 12:38:11          4
           TEST_TAB1             P_20110202          YES   05-FEB-2011 12:38:11          2
           TEST_TAB1             P_20110203          YES   05-FEB-2011 12:38:11          1
           TEST_TAB1             P_20110204          YES   05-FEB-2011 12:38:11          2
           TEST_TAB1             P_20110205          YES   05-FEB-2011 12:38:11          1
           TEST_TAB1             P_20110206          YES   05-FEB-2011 12:38:11          1
           TEST_TAB1             P_20110207          YES   05-FEB-2011 12:38:11          0

           8 rows selected.

These additional Partition statistics also appear to be Global (according to the view) and show the number of rows contained in
each partition. As we'll see later, statistics are not limited to row counts, but even this basic information about each partition
helps the CBO determine the profile of our data and arrive at an optimal execution plan.
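As a quick illustration (this query is not one of the paper's numbered listings), the partition-level statistics already include more
than row counts; block counts and average row lengths, for example:

           -- Illustrative only: some additional partition-level statistics.
           select partition_name, num_rows, blocks, avg_row_len, last_analyzed
           from   user_tab_partitions
           where  table_name = 'TEST_TAB1'
           order  by partition_name;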

           Note – at this stage I will drop the index and use table statistics to demonstrate the underlying principles that apply to
           both tables and indexes for the rest of the paper.

COLLECTION GRANULARITY AND STATISTICS LEVELS

A partitioned object consists of several logical components and each level of component has its own set of statistics.

Partitioned Objects

          Global Statistics that describe the entire object.
          Partition Statistics on each Partition.

Subpartitioned (or Composite Partitioned) Objects

          Global Statistics that describe the entire object.
          Partition Statistics that describe the entire Partition and all of its Subpartitions.
          Subpartition Statistics on each Subpartition.


The CBO can use different combinations of these statistics depending on which components the query must access and the
version of Oracle. We’ll revisit how the statistics are used shortly but first of all, how do we gather the different levels of
statistics in the first place?
When using Partitioned Tables and Indexes, you can use DBMS_STATS 2 to gather separate statistics on the various components
of the objects. The level at which statistics are gathered is governed by the DBMS_STATS GRANULARITY parameter, which
allows the following values in Oracle 10g and 11g (a few illustrative calls follow the table).




2 In fact you should always use DBMS_STATS for manipulating statistics these days but you must use DBMS_STATS to gather
Partition statistics. The ANALYZE command is for dinosaurs only and supports a fraction of the DBMS_STATS functionality.



         DBMS_STATS GRANULARITY value                    Statistics Gathered

         ALL                                             Global, Partition and Subpartition
         AUTO                                            Determines granularity based on partitioning type (This is the default)
         DEFAULT                                         Gathers global and partition-level statistics. This option is deprecated,
                                                         and while currently supported, it is included in the documentation for
                                                         legacy reasons only. You should use 'GLOBAL AND PARTITION' for this
                                                         functionality. Note that the default value is now 'AUTO'.
         GLOBAL                                          Global
         GLOBAL AND PARTITION                            Global and Partition, but does not gather Subpartition stats, even if the
                                                         object is subpartitioned
         PARTITION                                       Partition (specify PARTNAME for a specific partition. Default is all
                                                         partitions.)
         SUBPARTITION                                    Subpartition (specify PARTNAME for a specific subpartition. Default is all
                                                         subpartitions.)
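By way of illustration (these calls are sketches against the example table rather than extracts from the test scripts), the
GRANULARITY value is simply passed to DBMS_STATS.GATHER_TABLE_STATS:

           -- Illustrative sketches only, not part of the numbered listings.
           -- Gather Global, Partition and Subpartition statistics in a single call.
           exec dbms_stats.gather_table_stats('TESTUSER', 'TEST_TAB1', granularity => 'ALL');

           -- Gather Global and Partition statistics, skipping Subpartitions.
           exec dbms_stats.gather_table_stats('TESTUSER', 'TEST_TAB1', granularity => 'GLOBAL AND PARTITION');

           -- Gather statistics for a single named partition only.
           exec dbms_stats.gather_table_stats('TESTUSER', 'TEST_TAB1', granularity => 'PARTITION', partname => 'P_20110201');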


For example, in Listing 2 we gathered statistics on TEST_TAB1 with the default DBMS_STATS options which implies that we used
GRANULARITY => ‘AUTO’ and that Oracle determined the granularity based on the partitioning type. What did this mean in
practice? Well, as we saw in Listings 3 and 4, DBMS_STATS gathered

         Global Statistics on the Table (Listing 3)
         Partition Statistics on all of the Partitions (Listing 4)

How about the Subpartition Statistics? Here is a small sample of the results that shows that Subpartition Statistics were
gathered. (Note that this is different to the previous 9i default value of ‘DEFAULT’ that did not collect Subpartition Statistics)

Listing 5 –View subpartition statistics for the P_20110201 partition (DEFAULT.SQL)

          SQL>    select table_name, subpartition_name, global_stats, last_analyzed, num_rows
            2     from user_tab_subpartitions
            3     where table_name='TEST_TAB1'
            4     and partition_name='P_20110201'
            5     order by subpartition_name;

          TABLE_NAME            SUBPARTITION_NAME          GLO   LAST_ANALYZED          NUM_ROWS
          --------------        ------------------         ---   -------------------- ----------
          TEST_TAB1             P_20110201_LONDON          YES   05-FEB-2011 12:44:36          1
          TEST_TAB1             P_20110201_MOSCOW          YES   05-FEB-2011 12:44:36          1
          TEST_TAB1             P_20110201_OTHERS          YES   05-FEB-2011 12:44:36          1
          TEST_TAB1             P_20110201_SYDNEY          YES   05-FEB-2011 12:44:36          1




HOW STATISTICS LEVELS ARE USED

Why does Oracle gather statistics about the various levels of partitioned objects? Because the CBO will use different statistics in
different situations and although the precise details are likely to change frequently between versions, you can probably rely on
a few high-level points.

         If a statement accesses multiple partitions the CBO will use Global Statistics.

         If a statement is able to limit access to a single partition, then the partition statistics can be used.




          If a statement accesses a single subpartition, then subpartition statistics can be used. However, prior to 10.2.0.4,
           subpartition statistics are rarely used. The CBO will use BLOCKS in Table Subpartition statistics to calculate the cost of a
           full scan of the Subpartition but does not use any Index Subpartition Statistics. For practical purposes, it is probably
           best to assume that Subpartition Statistics are not very valuable in any version prior to 10.2.0.4.

          For most applications you will need both Global and Partition Statistics for the CBO to operate effectively, which is why
           Oracle's default DBMS_STATS options have always included Global Statistics. The sketch below illustrates the difference
           between single-partition and multi-partition access.
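As a rough sketch of that distinction (illustrative queries, not part of the original scripts), the first statement below can be
pruned to a single partition and so can use Partition Statistics, while the second spans several partitions and has to fall back
on Global Statistics:

           -- Prunes to the single partition P_20110201, so Partition Statistics can be used.
           select count(*)
           from   test_tab1
           where  reporting_date = 20110201;

           -- Spans several partitions, so the CBO falls back on Global (table-level) Statistics.
           select count(*)
           from   test_tab1
           where  reporting_date between 20110201 and 20110204;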

AGGREGATED GLOBAL STATISTICS AND THE QUALITY/PERFORMANCE TRADE-OFF

If the CBO is so dependent on a combination of Global and Partition statistics, then why not use the default GRANULARITY of
AUTO every time we re-gather statistics on updated data? Well, partitioned objects are normally used because accessing the
entire object is resource-intensive and slow, and you're aiming for SQL execution plans that access a small subset of partitions
instead. Collecting Global Statistics requires a full table scan and is therefore likely to be a prohibitively expensive operation,
particularly if you are regularly adding new partitions, which makes the Global Statistics stale even if you have gathered
up-to-date statistics on the new partitions.

You might be able to schedule a window outside of your standard operating hours to regather Global Statistics on large objects
occasionally, but this would still mean that your Global Statistics may become stale between gathers. You might also decide to
gather Global Statistics using a very small sample (a low ESTIMATE_PERCENT) to speed up the process, but as the sampling
percentage falls it will typically affect the quality of the resulting statistics negatively.
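As a sketch of the second option (the 1% sample is purely illustrative and the right value would need testing against your own
data and workload), a low-sample global gather might look like this:

           -- Illustrative only: gather Global Statistics with a deliberately small sample
           -- to reduce the cost of the full-table pass. Too low a value will degrade
           -- the quality of the resulting statistics.
           begin
             dbms_stats.gather_table_stats(
               ownname          => 'TESTUSER',
               tabname          => 'TEST_TAB1',
               granularity      => 'GLOBAL',
               estimate_percent => 1);
           end;
           /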

Oracle offers an alternative to gathering Global Statistics, which is to use the statistics on the underlying partitions as the source
to generate Aggregated Global Statistics 3 at higher levels of the same object.

3 It might help you to think of these as Estimated or Approximate Global Statistics, for reasons that will become apparent.

A valid statistics collection strategy for a composite partitioned object might look something like this:

          Gather statistics at the subpartition level for all subpartitions and allow Oracle to generate Aggregated Global Statistics
           at the Table and Partition levels

          As new subpartitions are added or the data changes sufficiently in an existing subpartition to make the statistics stale,
           only gather statistics on the relevant subpartitions and allow the aggregation process to maintain the Table and
           Partition level Aggregated Statistics.

I'll empty TEST_TAB1, delete the existing table stats and regather at the SUBPARTITION level. Note that I have not passed in a
specific subpartition with the PARTNAME parameter, so all subpartitions will have their statistics gathered.

Listing 6 –Gather subpartition statistics with aggregation and view statistics (AGGREGATION.SQL)

           SQL> exec dbms_stats.gather_table_stats('TESTUSER', 'TEST_TAB1', GRANULARITY => 'SUBPARTITION');

           PL/SQL procedure successfully completed.

           SQL> select table_name, global_stats, last_analyzed, num_rows
             2 from user_tables
             3 where table_name='TEST_TAB1';

           TABLE_NAME     GLO LAST_ANALYZED          NUM_ROWS
            -------------- --- -------------------- ----------
            TEST_TAB1      NO  05-FEB-2011 12:51:27         11

           SQL>   select table_name, partition_name, global_stats, last_analyzed, num_rows
             2    from user_tab_partitions
             3    where table_name='TEST_TAB1'
             4    order by partition_name;

           TABLE_NAME           PARTITION_NAME           GLO   LAST_ANALYZED          NUM_ROWS
           --------------       ----------------         ---   -------------------- ----------
           TEST_TAB1            P_20110131               NO    05-FEB-2011 12:51:27          0
           TEST_TAB1            P_20110201               NO    05-FEB-2011 12:51:27          4
           TEST_TAB1            P_20110202               NO    05-FEB-2011 12:51:27          2
           TEST_TAB1            P_20110203               NO    05-FEB-2011 12:51:27          1
           TEST_TAB1            P_20110204               NO    05-FEB-2011 12:51:27          2
           TEST_TAB1            P_20110205               NO    05-FEB-2011 12:51:27          1
           TEST_TAB1            P_20110206               NO    05-FEB-2011 12:51:27          1
           TEST_TAB1            P_20110207               NO    05-FEB-2011 12:51:27          0

           8 rows selected.

           SQL>   select table_name, subpartition_name, global_stats, last_analyzed, num_rows
             2    from user_tab_subpartitions
             3    where table_name='TEST_TAB1'
             4    and partition_name='P_20110206'
             5    order by subpartition_name;

           TABLE_NAME           SUBPARTITION_NAME           GLO   LAST_ANALYZED          NUM_ROWS
           --------------       ------------------          ---   -------------------- ----------
           TEST_TAB1            P_20110206_LONDON           YES   05-FEB-2011 12:51:27          0
           TEST_TAB1            P_20110206_MOSCOW           YES   05-FEB-2011 12:51:27          0
           TEST_TAB1            P_20110206_OTHERS           YES   05-FEB-2011 12:51:27          1
           TEST_TAB1            P_20110206_SYDNEY           YES   05-FEB-2011 12:51:27          0
I can see that although the Subpartition Statistics are global 4, the Table and Partition statistics are not (GLOBAL_STATS=NO), but
the number of rows looks correct, so the aggregation process appears to have been successful and we have avoided the full table
scan overhead of collecting true Global Statistics. Next I'll simulate what might happen as new data arrives – rows are inserted into
subpartition P_20110206_MOSCOW and so we want to gather fresh statistics on just that subpartition, without the overhead of
accessing the entire table.

4 Subpartition Statistics are implicitly Global because there is no structural level below subpartitions.

Listing 7 –Insert additional rows into subpartition, gather subpartition statistics and view results (AGGREGATION.SQL)

           SQL> INSERT INTO TEST_TAB1 VALUES (20110206, 'MOSCOW', 100000, 'P');

           1 row created.

           SQL> INSERT INTO TEST_TAB1 VALUES (20110206, 'MOSCOW', 3000000, 'P');

           1 row created.

           SQL> INSERT INTO TEST_TAB1 VALUES (20110206, 'MOSCOW', 200000, 'P');

           1 row created.

           SQL> INSERT INTO TEST_TAB1 VALUES (20110206, 'MOSCOW', 110000, 'P');

           1 row created.

           SQL> INSERT INTO TEST_TAB1 VALUES (20110206, 'MOSCOW', 240000, 'U');

           1 row created.

           SQL> COMMIT;




          Commit complete.

          SQL> exec dbms_stats.gather_table_stats('TESTUSER', 'TEST_TAB1',
                                GRANULARITY => 'SUBPARTITION', PARTNAME => 'P_20110206_MOSCOW');

          PL/SQL procedure successfully completed.

          SQL> select table_name, global_stats, last_analyzed, num_rows
            2 from user_tables
            3 where table_name='TEST_TAB1';

          TABLE_NAME     GLO LAST_ANALYZED          NUM_ROWS
          -------------- --- -------------------- ----------
          TEST_TAB1      NO 05-FEB-2011 12:51:27          16

          SQL>   select table_name, partition_name, global_stats, last_analyzed, num_rows
            2    from user_tab_partitions
            3    where table_name='TEST_TAB1'
            4    order by partition_name;

          TABLE_NAME          PARTITION_NAME          GLO   LAST_ANALYZED          NUM_ROWS
          --------------      ----------------        ---   -------------------- ----------
          TEST_TAB1           P_20110131              NO    05-FEB-2011 12:51:27          0
          TEST_TAB1           P_20110201              NO    05-FEB-2011 12:51:27          4
          TEST_TAB1           P_20110202              NO    05-FEB-2011 12:51:27          2
          TEST_TAB1           P_20110203              NO    05-FEB-2011 12:51:27          1
          TEST_TAB1           P_20110204              NO    05-FEB-2011 12:51:27          2
          TEST_TAB1           P_20110205              NO    05-FEB-2011 12:51:27          1
          TEST_TAB1           P_20110206              NO    05-FEB-2011 12:51:27          6
          TEST_TAB1           P_20110207              NO    05-FEB-2011 12:51:27          0

          8 rows selected.

The table-level NUM_ROWS has increased from 11 to 16 and the partition P_20110206 has increased from 1 to 6, both of which
perfectly reflect the INSERT activity, so at this stage I would argue that the aggregation process looks to be working pretty well.
Although this is only a very small table, the resource savings and reduced elapsed time from gathering statistics on a single
sub-partition and updating the aggregated statistics at higher levels are likely to be very significant on large partitioned objects
and might be the only realistic alternative to having missing or very stale Global Statistics.

However, there is a significant flaw in this approach. When I inserted the new data, I also introduced a new value of 'U' in the
STATUS column, and it's only when we begin to look at the column statistics that things start to look less healthy.

Listing 8 – View Aggregated Global column statistics 5 (AGGREGATION.SQL)

5 This script uses Greg Rahn's DISPLAY_RAW function to display Column High and Low values in a human-readable form. See
References.

          SQL>   select a.column_name, a.num_distinct,
            2      display_raw(a.low_value,b.data_type) as low_val,
            3      display_raw(a.high_value,b.data_type) as high_val, b.data_type
            4    from user_tab_col_statistics a, user_tab_cols b
            5    where a.table_name='TEST_TAB1'
            6    and   a.table_name=b.table_name
            7    and   a.column_name=b.column_name
            8    order by a.column_name;

          COLUMN_NAME                NUM_DISTINCT LOW_VAL              HIGH_VAL                             DATA_TYPE
          -------------------------- ------------ -------------------- --------------------                 -------------------
          REPORTING_DATE                        6 20110201             20110206                             NUMBER
          SEQ_ID                               14 400                  3000000                              NUMBER
          SOURCE_SYSTEM                         5 DALLAS               SYDNEY                               VARCHAR2
          STATUS                                4 P                    U                                    VARCHAR2




        SQL>
        SQL>   select a.column_name, a.partition_name, a.num_distinct,
          2      display_raw(a.low_value,b.data_type) as low_val,
          3      display_raw(a.high_value,b.data_type) as high_val,
          4      b.data_type
          5    from user_part_col_statistics a, user_tab_cols b
          6    where a.table_name='TEST_TAB1'
          7    and a.table_name=b.table_name
          8    and a.column_name=b.column_name
          9    and a.partition_name='P_20110206'
         10    order by a.column_name, a.partition_name;

         COLUMN_NAME                PARTITION_NAME   NUM_DISTINCT LOW_VAL              HIGH_VAL             DATA_TYPE
         -------------------------- ---------------- ------------ -------------------- -------------------- ---------
         REPORTING_DATE             P_20110206                  1 20110206             20110206             NUMBER
         SEQ_ID                     P_20110206                  6 600                  3000000              NUMBER
         SOURCE_SYSTEM              P_20110206                  2 DALLAS               MOSCOW               VARCHAR2
         STATUS                     P_20110206                  3 P                    U                    VARCHAR2

At first glance the table-level statistics look reasonably accurate. For example, the STATUS column has the correct High Value of
'U', which has just appeared in the last set of rows that were inserted. Based on what I've seen to date, Oracle does accurately
update the High/Low column values and row counts when generating aggregated statistics but closer investigation highlights a
problem. According to the column statistics, there are 5 distinct SOURCE_SYSTEMs and 4 distinct STATUSes in the table, but
that's not true.




Listing 9 –True number of distinct values in test data (AGGREGATION.SQL)

          SQL> select source_system, count(*) from test_tab1 group by source_system;

          SOURCE_SYSTEM                    COUNT(*)
          ------------------------------ ----------
          MOSCOW                                 10
          LONDON                                  1
          SYDNEY                                  2
          DALLAS                                  3

          SQL>
          SQL> select status, count(*) from test_tab1 group by status;

          S   COUNT(*)
          - ----------
          U          1
          P         15

As far as the CBO is concerned when looking at these statistics, there are more distinct values for each of these columns than
actually exist in the table, and the Number of Distinct Values (NDV) is one of the most important inputs to the CBO's cardinality
calculations.

For example, if I execute a query to return all of the rows with STATUS='U' there should be one row returned, but if I include a
/*+ gather_plan_statistics */ hint along with DBMS_XPLAN.DISPLAY_CURSOR to view the actual row source execution statistics, I can
see the discrepancy between the optimiser's estimated cardinalities (E-Rows) and the actual cardinalities (A-Rows).

Listing 10 –Incorrect cardinality estimate arising from inaccurate NDV (AGGREGATION.SQL)

          SQL> select /*+ gather_plan_statistics */ *
            2 from test_tab1
            3 where status='U';

          REPORTING_DATE SOURCE_SYSTEM                      SEQ_ID S
          -------------- ------------------------------ ---------- -
                20110206 MOSCOW                             240000 U

          SQL>
          SQL> select * from TABLE(dbms_xplan.display_cursor(null, null, 'ALLSTATS LAST'));

          PLAN_TABLE_OUTPUT
          -----------------------------------------------------------------------------------------------------
          -------------------------------
          SQL_ID 6fnb7kqfhx69j, child number 0
          -------------------------------------
          select /*+ gather_plan_statistics */ * from test_tab1 where status='U'

          Plan hash value: 2028438416

          -------------------------------------------------------------------------------------------
          | Id | Operation            | Name      | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
          -------------------------------------------------------------------------------------------
          |   1 | PARTITION RANGE ALL|            |      1 |      4 |      1 |00:00:00.01 |     312 |
          |   2 |   PARTITION LIST ALL|           |      8 |      4 |      1 |00:00:00.01 |     312 |
          |* 3 |     TABLE ACCESS FULL| TEST_TAB1 |     32 |      4 |      1 |00:00:00.01 |     312 |
          -------------------------------------------------------------------------------------------

          Predicate Information (identified by operation id):
          ---------------------------------------------------

              3 - filter("STATUS"='U')


          19 rows selected.




The optimiser expects there to be 4 rows returned (E-Rows=4) but only 1 row is actually returned (A-Rows=1). How did the
optimiser arrive at that estimate? Dividing the number of rows in the table (16) by the number of distinct values in the table for
STATUS (4) gives an estimated 4 rows returned. Oracle uses the table-level statistics because we are retrieving data from more
than one partition.
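If you want to reproduce the optimiser's arithmetic from the dictionary, a rough sketch (assuming no histogram on STATUS)
looks like this:

           -- Selectivity sketch for "STATUS = 'U'" with no histogram:
           --   estimated cardinality = NUM_ROWS * (1 / NUM_DISTINCT)
           select t.num_rows,
                  c.num_distinct,
                  round(t.num_rows / c.num_distinct) as estimated_rows   -- 16 / 4 = 4
           from   user_tables t, user_tab_col_statistics c
           where  t.table_name  = 'TEST_TAB1'
           and    c.table_name  = t.table_name
           and    c.column_name = 'STATUS';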

The estimated cardinality would actually have been even more inaccurate if the NDV had been the correct value of 2. Oracle would
assume that, with two distinct values of STATUS, we would be retrieving 8 rows instead of 1, but that is a data skew issue that might
more sensibly be solved with the use of a histogram. Regardless of the effectiveness of the true NDV in this example versus the
approximate NDV generated by the aggregation process, NDV inaccuracies with such a small number of rows and only a few
distinct values indicate how significant a problem this could become with larger volumes of data.

Why aren't the values accurate? Well let's compare High/Low values to Number of Distinct Values (NDV). When we gathered
statistics on the new subpartition, we had access to the previous High/Low values at the table level. Here are the column
statistics before stats were gathered on the new subpartition.

Listing 11 –Original Table Column Statistics (AGGREGATION.SQL)

          COLUMN_NAME                NUM_DISTINCT LOW_VAL              HIGH_VAL                               DATA_TYPE
          -------------------------- ------------ -------------------- --------------------                   -------------
          REPORTING_DATE                        6 20110201             20110206                               NUMBER
          SEQ_ID                                9 400                  30000                                  NUMBER
          SOURCE_SYSTEM                         6 DALLAS               SYDNEY                                 VARCHAR2
          STATUS                                1 P                    P                                      VARCHAR2

As the aggregated statistics are updated, Oracle has the current low and high values at all levels of the object, and adjusting
those based on the underlying statistics is straightforward. The highest value in the new subpartition statistics, 'U', is clearly
higher than the existing high value of 'P' at the table and partition levels.

Listing 12 –Table Column Statistics after additional aggregation (AGGREGATION.SQL)

          COLUMN_NAME                NUM_DISTINCT LOW_VAL              HIGH_VAL                               DATA_TYPE
          -------------------------- ------------ -------------------- --------------------                   -------------
          REPORTING_DATE                        6 20110201             20110206                               NUMBER
          SEQ_ID                               14 400                  3000000                                NUMBER
          SOURCE_SYSTEM                         5 DALLAS               SYDNEY                                 VARCHAR2
          STATUS                                4 P                    U                                      VARCHAR2

How does Oracle determine the new NDV? Remember, Oracle can't look at any of the data in Partitions or Subpartitions other
than the one we're gathering statistics for, in order to minimise the collection overhead. So it has to decide what the new NDV
should be based on several inputs:

     1.   The actual values in STATUS for the rows in the subpartition we can look at.
     2.   The previously-gathered and stored NDV for the other subpartitions.
     3.   The previously-aggregated NDV stored at the table and partition levels.

The problem is that Oracle knows the number of distinct values in other subpartitions (we looked at the data previously to
calculate them) but not the values themselves because the NDV calculation is a destructive process that loses information about
what the distinct values are and only preserves how many there are. Without the full original range of values, how can it say
whether the 2 distinct values (P and U) in this subpartition are distinct when compared to the values in the other subpartitions
without looking at those subpartitions again?
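To see that information loss for yourself, one possible check (a sketch, not part of the original scripts) is to compare the stored
per-subpartition NDVs with the true table-wide NDV, which can only be obtained by reading the data again:

           -- The stored NDVs for STATUS in each subpartition; the distinct values
           -- themselves are not retained, only the counts.
           select subpartition_name, num_distinct
           from   user_subpart_col_statistics
           where  table_name  = 'TEST_TAB1'
           and    column_name = 'STATUS';

           -- The true table-wide NDV requires re-reading the data, which is exactly
           -- the cost the aggregation process is trying to avoid.
           select count(distinct status) from test_tab1;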

In this case we might expect Oracle to be clever and realise that, as there was only one distinct value of P prior to our new
subpartition and the only values in the current subpartition are P and U, that there can only be 2 distinct values. Unfortunately,
that is not how global statistics maintenance works prior to Oracle 11g.

The poor NDV results of the aggregation process are probably the main reason Oracle recommends gathering both Global and
Partition statistics, and does so by default if we don't request otherwise. However, some systems might have little opportunity to
gather true Global Statistics and, although aggregated statistics are a poor alternative and can lead to sub-optimal execution
plans, they are definitely preferable to no Global Statistics at all, as we'll see shortly.

Please note, though, that if you do use the aggregation process, you must do so correctly and not fall into one of the following
traps.

AGGREGATION SCENARIOS AND PITFALLS


Putting aside the fundamental weakness of the aggregation process for a while, the reality is that many systems have used it in
the past as the only viable alternative to maintaining Global Statistics. However, there are strict requirements for the process to
work correctly, and the implications can be dramatic if you don't ensure you meet them.


MISSING PARTITION STATISTICS


To illustrate the first potential problem, I’ll continue the previous example that gathered statistics purely at the SUBPARTITION
level and allowed Oracle to aggregate those to generate the Table and Partition stats. I’ll add a new partition (and because we
have a subpartition template, the subpartitions are created too) much as we might during the first load of data on a new day,
then I'll insert some rows into the different new subpartitions.

Listing 13 –Add partition and INSERT new data for 20110208 from Dallas then gather statistics on the relevant subpartition (AGGREGATION2.SQL)

          SQL> ALTER TABLE TEST_TAB1
            2 ADD PARTITION P_20110208 VALUES LESS THAN (20110209);

          Table altered.

          SQL> INSERT INTO TEST_TAB1 VALUES (20110208, 'DALLAS', 400, 'P');

          1 row created.

          SQL> INSERT INTO TEST_TAB1 VALUES (20110208, 'DALLAS', 600, 'P');

          1 row created.

          SQL> COMMIT;

          Commit complete.

          SQL> exec dbms_stats.gather_table_stats('TESTUSER', 'TEST_TAB1',
                                granularity => 'SUBPARTITION', partname => 'P_20110208_OTHERS');

          PL/SQL procedure successfully completed.

          SQL>   select table_name, subpartition_name, global_stats, last_analyzed, num_rows
            2    from user_tab_subpartitions
            3    where table_name='TEST_TAB1'
            4    and partition_name='P_20110208'
            5    order by subpartition_name;

          TABLE_NAME           SUBPARTITION_NAME          GLO LAST_ANALYZED          NUM_ROWS
          --------------       ------------------         --- -------------------- ----------
          TEST_TAB1            P_20110208_LONDON          NO
           TEST_TAB1            P_20110208_MOSCOW          NO
           TEST_TAB1             P_20110208_OTHERS          YES 05-FEB-2011 13:11:15                     2
           TEST_TAB1             P_20110208_SYDNEY          NO

At this stage the subpartition statistics look acceptable for P_20110208_OTHERS; however, examining the Table and Partition
statistics highlights a serious issue.

Listing 14 –Table and partition statistics after subpartition statistics collection (AGGREGATION2.SQL)

           SQL> select table_name, global_stats, last_analyzed, num_rows
             2 from user_tables
             3 where table_name='TEST_TAB1';

           TABLE_NAME     GLO LAST_ANALYZED          NUM_ROWS
           -------------- --- -------------------- ----------
           TEST_TAB1      NO

           SQL>
           SQL>   select table_name, partition_name, global_stats, last_analyzed, num_rows
             2    from user_tab_partitions
             3    where table_name='TEST_TAB1'
             4    order by partition_name;

           TABLE_NAME            PARTITION_NAME           GLO   LAST_ANALYZED          NUM_ROWS
           --------------        ----------------         ---   -------------------- ----------
           TEST_TAB1             P_20110131               NO    05-FEB-2011 13:11:10          0
           TEST_TAB1             P_20110201               NO    05-FEB-2011 13:11:10          8
           TEST_TAB1             P_20110202               NO    05-FEB-2011 13:11:10          4
           TEST_TAB1             P_20110203               NO    05-FEB-2011 13:11:10          2
           TEST_TAB1             P_20110204               NO    05-FEB-2011 13:11:10          4
           TEST_TAB1             P_20110205               NO    05-FEB-2011 13:11:10          2
           TEST_TAB1             P_20110206               NO    05-FEB-2011 13:11:10          7
           TEST_TAB1             P_20110207               NO    05-FEB-2011 13:11:10          0
           TEST_TAB1             P_20110208               NO

           9 rows selected.

The Table level statistics have disappeared, as though they had been deleted or were never gathered in the first place! There
are apparently no statistics on the new P_20110208 partition either. The underlying cause is that Oracle will only aggregate
statistics when all of the underlying components have statistics that can be aggregated. At this stage, P_20110208_OTHERS is the
only subpartition of P_20110208 that I have gathered stats for, so in the absence of stats for P_20110208_LONDON,
P_20110208_MOSCOW and P_20110208_SYDNEY, no aggregated Partition statistics will be set for P_20110208 and, as those
are missing, the Table level aggregated statistics are missing too.
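A practical defensive check (again a sketch rather than one of the paper's scripts) is to look for subpartitions that have never
been analysed before relying on the aggregated higher-level statistics:

           -- Any subpartition without statistics will block aggregation of Partition
           -- and Table level statistics for the partition it belongs to.
           select partition_name, subpartition_name
           from   user_tab_subpartitions
           where  table_name    = 'TEST_TAB1'
           and    last_analyzed is null
           order  by partition_name, subpartition_name;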

Missing statistics are one of the more challenging situations for the optimiser and, prior to Oracle 9.2, could lead to terrible plans
based on built-in optimiser default values that are unlikely to reflect your data accurately. Current versions of Oracle will try to solve
the problem by generating statistics using Dynamic Sampling unless this has been explicitly disabled. For example, if I run a
query that must use the table statistics, we can see that Dynamic Sampling takes place because of the default
OPTIMIZER_DYNAMIC_SAMPLING setting of 2.




Listing 15 – Table-wide query with missing Table and Partition statistics (AGGREGATION2.SQL)

           SQL> show parameter dynamic_sampling

           optimizer_dynamic_sampling                            integer           2



          SQL> select /*+ gather_plan_statistics */ *
            2 from test_tab1
            3 where source_system='DALLAS';

                  20110201     DALLAS                                           400   P
                  20110204     DALLAS                                           700   P
                  20110206     DALLAS                                           600   P
                  20110208     DALLAS                                           400   P
                  20110208     DALLAS                                           600   P

          SQL> select * from TABLE(dbms_xplan.display_cursor(null, null, 'ALLSTATS LAST'));

          SQL_ID 544krkd2kvxat, child number 0
          -------------------------------------
          select /*+ gather_plan_statistics */ * from test_tab1 where source_system='DALLAS'

          Plan hash value: 3088684340

          ----------------------------------------------------------------------------------------------
          | Id | Operation               | Name      | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
          ----------------------------------------------------------------------------------------------
          |   1 | PARTITION RANGE ALL    |           |      1 |      9 |      5 |00:00:00.01 |     100 |
          |   2 |   PARTITION LIST SINGLE|           |      9 |      9 |      5 |00:00:00.01 |     100 |
          |* 3 |     TABLE ACCESS FULL   | TEST_TAB1 |      9 |      9 |      5 |00:00:00.01 |     100 |
          ----------------------------------------------------------------------------------------------

          Predicate Information (identified by operation id):
          ---------------------------------------------------

              3 - filter("SOURCE_SYSTEM"='DALLAS')

          Note
          -----
             - dynamic sampling used for this statement

However, although Oracle has come up with a reasonable plan by assuming that 9 rows would be returned rather than the actual 5,
dynamic sampling consumes additional resources, is difficult to spot without looking at detailed execution plans and will
not work as effectively on larger and more complex data sets. Therefore any missing statistics on components of partitioned
objects should be treated very seriously.

The only defence against missing statistics resulting from the aggregation process is to design your statistics collection very
carefully if you depend on aggregation. Let's look at the statistics once the other data feeds complete (additional rows are
INSERTed into the other partitions in aggregation2.sql) and we gather the rest of the subpartition statistics.

Listing 16 – Gather remaining subpartition statistics (AGGREGATION2.SQL)

          SQL> -- Simulate other loads finishing later
          SQL> exec dbms_stats.gather_table_stats('TESTUSER', 'TEST_TAB1', granularity => 'SUBPARTITION',
          partname => 'P_20110208_LONDON');

          PL/SQL procedure successfully completed.

          SQL> exec dbms_stats.gather_table_stats('TESTUSER', 'TEST_TAB1', granularity => 'SUBPARTITION',
          partname => 'P_20110208_SYDNEY');

          PL/SQL procedure successfully completed.

          SQL> exec dbms_stats.gather_table_stats('TESTUSER', 'TEST_TAB1', granularity => 'SUBPARTITION',
          partname => 'P_20110208_MOSCOW');

          PL/SQL procedure successfully completed.

          SQL> select table_name, global_stats, last_analyzed, num_rows
            2 from user_tables
            3 where table_name='TEST_TAB1';




          TABLE_NAME     GLO LAST_ANALYZED          NUM_ROWS
          -------------- --- -------------------- ----------
          TEST_TAB1      NO 05-FEB-2011 13:11:15          37

          SQL>   select table_name, partition_name, global_stats, last_analyzed, num_rows
            2    from user_tab_partitions
            3    where table_name='TEST_TAB1'
            4    order by partition_name;

          TABLE_NAME          PARTITION_NAME         GLO   LAST_ANALYZED          NUM_ROWS
          --------------      ----------------       ---   -------------------- ----------
          TEST_TAB1           P_20110131             NO    05-FEB-2011 13:11:10          0
          TEST_TAB1           P_20110201             NO    05-FEB-2011 13:11:10          8
          TEST_TAB1           P_20110202             NO    05-FEB-2011 13:11:10          4
          TEST_TAB1           P_20110203             NO    05-FEB-2011 13:11:10          2
          TEST_TAB1           P_20110204             NO    05-FEB-2011 13:11:10          4
          TEST_TAB1           P_20110205             NO    05-FEB-2011 13:11:10          2
          TEST_TAB1           P_20110206             NO    05-FEB-2011 13:11:10          7
          TEST_TAB1           P_20110207             NO    05-FEB-2011 13:11:10          0
          TEST_TAB1           P_20110208             NO    05-FEB-2011 13:11:15         10

          9 rows selected.

That looks much better. The best approach in this particular case is probably to gather statistics on all of the new subpartitions
as soon as they are created. Even though the subpartitions will be empty initially and will need stats gathered again
immediately after they are loaded, the initial collection will be very cheap and minimise the likelihood of missing statistics
leading to aggregation problems.
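A sketch of that approach (a hypothetical helper block, not one of the paper's scripts) is to gather statistics on each newly
created, still-empty subpartition straight after adding the partition:

           -- Gather cheap statistics on the empty subpartitions of the new partition
           -- so that missing statistics never block the aggregation process.
           begin
             for sp in (select subpartition_name
                        from   user_tab_subpartitions
                        where  table_name     = 'TEST_TAB1'
                        and    partition_name = 'P_20110208'
                        and    last_analyzed is null)
             loop
               dbms_stats.gather_table_stats(
                 ownname     => 'TESTUSER',
                 tabname     => 'TEST_TAB1',
                 partname    => sp.subpartition_name,
                 granularity => 'SUBPARTITION');
             end loop;
           end;
           /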


INABILITY TO REPLACE TRUE GLOBAL STATISTICS


To illustrate the next potential aggregation problem, I'll revert to the state of TEST_TAB1 after Listings 1, 2, 3 and 4, i.e. the
table has been recreated, populated with the initial data and statistics have been gathered using the default granularity of AUTO.
The table has Global Statistics at the Table, Partition and Subpartition levels. This is a sensible default starting point, and it's
worth noting how likely this situation is if the table has been created and the statistics have then been updated by Oracle's
automatic statistics gathering job.

Imagine next that we are about to insert more data for 20110206 from the MOSCOW source system. As we are introducing new
data, we will also regather the statistics.

Listing 17 – Insert data for 20110206/MOSCOW and update statistics 6 (AGGREGATION2.SQL)

6 In the AGGREGATION2.SQL script, I've inserted a PAUSE point to simulate the delay between the initial population and the
subsequent arrival of the new data.

          SQL> INSERT INTO TEST_TAB1 VALUES (20110206, 'MOSCOW', 100000, 'P');

          1 row created.

          SQL> INSERT INTO TEST_TAB1 VALUES (20110206, 'MOSCOW', 3000000, 'P');

          1 row created.

          SQL> INSERT INTO TEST_TAB1 VALUES (20110206, 'MOSCOW', 200000, 'P');

          1 row created.

          SQL> INSERT INTO TEST_TAB1 VALUES (20110206, 'MOSCOW', 110000, 'P');




          1 row created.

          SQL> INSERT INTO TEST_TAB1 VALUES (20110206, 'MOSCOW', 240000, 'U');

          1 row created.

          SQL> COMMIT;

          Commit complete.

          SQL> exec dbms_stats.gather_table_stats('TESTUSER', 'TEST_TAB1', GRANULARITY => 'SUBPARTITION',
          PARTNAME => 'P_20110206_MOSCOW');

          PL/SQL procedure successfully completed.

          SQL>   select table_name, subpartition_name, global_stats, last_analyzed, num_rows
            2    from user_tab_subpartitions
            3    where table_name='TEST_TAB1'
            4    and partition_name='P_20110206'
            5    order by subpartition_name;

          TEST_TAB1            P_20110206_LONDON          YES   09-FEB-2011      02:45:17       0
          TEST_TAB1            P_20110206_MOSCOW          YES   09-FEB-2011      02:45:31       5
          TEST_TAB1            P_20110206_OTHERS          YES   09-FEB-2011      02:45:17       1
          TEST_TAB1            P_20110206_SYDNEY          YES   09-FEB-2011      02:45:17       0

I can see that the statistics for the P_20110206_MOSCOW subpartition have been updated because the LAST_ANALYZED time is
a little later. How do the Table and Partition statistics look?

Listing 18 – Table and Partition Statistics after subsequent data load (AGGREGATION2.SQL)

          SQL> select table_name, global_stats, last_analyzed, num_rows
            2 from user_tables
            3 where table_name='TEST_TAB1';

          TEST_TAB1            YES 09-FEB-2011 02:45:17                         11

          SQL>
          SQL>   select table_name, partition_name, global_stats, last_analyzed, num_rows
            2    from user_tab_partitions
            3    where table_name='TEST_TAB1'
            4    order by partition_name;

          TEST_TAB1            P_20110131               YES   09-FEB-2011       02:45:17    0
          TEST_TAB1            P_20110201               YES   09-FEB-2011       02:45:17    4
          TEST_TAB1            P_20110202               YES   09-FEB-2011       02:45:17    2
          TEST_TAB1            P_20110203               YES   09-FEB-2011       02:45:17    1
          TEST_TAB1            P_20110204               YES   09-FEB-2011       02:45:17    2
          TEST_TAB1            P_20110205               YES   09-FEB-2011       02:45:17    1
          TEST_TAB1            P_20110206               YES   09-FEB-2011       02:45:17    1
          TEST_TAB1            P_20110207               YES   09-FEB-2011       02:45:17    0
          TEST_TAB1            P_20110208               YES   09-FEB-2011       02:45:17    0

          9 rows selected.

Glance at these statistics too quickly and they might appear to be correct, but there are a couple of possible concerns:

     1.   The LAST_ANALYZED time hasn’t changed and reflects the initial data population time.

     2.   The Table and P_20110206 NUM_ROWS don’t appear to have changed.

The reason is that DBMS_STATS will not replace true Global Statistics (GLOBAL_STATS=YES) with the results of the aggregation
process. Perhaps this is sensible, because it will not replace carefully gathered Global Statistics with approximate, aggregated
statistics, and it allows you to combine different collection granularities as part of an overall strategy. For example:

     1.   Initial GRANULARITY=> AUTO to ensure Global Statistics are good at all levels.



     2.   GRANULARITY=>SUBPARTITION and individual PARTNAME to ensure fresh statistics on subpartitions.

     3.   Subsequent gathers with GRANULARITY=>AUTO, when there is sufficient resource availability and the impact on
          application users is minimal, to refresh Global Statistics that have become stale without introducing inaccurate
          aggregated NDVs (a sketch of the complete sequence follows below).
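Put together, a sketch of that three-step sequence for the test table might look like this (the subpartition name in step 2 is just
an example):

           -- 1. Infrequent gather at the default AUTO granularity to establish
           --    true Global Statistics at all levels.
           exec dbms_stats.gather_table_stats('TESTUSER', 'TEST_TAB1', granularity => 'AUTO');

           -- 2. Routine, cheap refresh after each load: only the loaded subpartition.
           exec dbms_stats.gather_table_stats('TESTUSER', 'TEST_TAB1', granularity => 'SUBPARTITION', partname => 'P_20110206_MOSCOW');

           -- 3. Periodic re-gather at AUTO granularity in a quiet window to refresh
           --    the true Global Statistics before they become badly stale.
           exec dbms_stats.gather_table_stats('TESTUSER', 'TEST_TAB1', granularity => 'AUTO');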

However, for this to work you need to have decided that this is the strategy you will use and to apply it consistently. I've seen a
number of cases recently that indicate a lack of understanding of the way the aggregation process works, leading to Global
Statistics that are months or years out of date because step 3 of the above process has been neglected. Worse still are the
systems I've seen where this has caused problems that were solved temporarily by gathering statistics with a manual one-off
execution at AUTO or DEFAULT granularity, only for the problems to gradually reappear as the Global Statistics are effectively
frozen and become more and more stale over time.


_MINIMAL_STATS_AGGREGATION

The examples so far have been based on INSERTing new rows into subpartitions and, although that's an approach that might be
used on some systems, the likelihood is that on modern DW systems that are constantly receiving new data feeds you will be
using partition exchange at some point. I'd say that there are two distinct approaches you are likely to use.

1) Create a temporary load table, load it with data, gather statistics on it and then exchange it with the relevant subpartition in
the target table. The advantage here is that the primary statistics gathering activity can take place during the load phase and
when the load table is subsequently exchanged with the correct subpartition of the target table, the new statistics will appear.

2) Create a temporary load table, load it with data, exchange it with the relevant subpartition and then gather statistics on the
subpartition. The potential problem here is that if it takes some time to gather the subpartition statistics then there will be a
time-lag between the data appearing and the relevant statistics.

I think that 1) appears to be the most sensible approach but can lead to difficulties caused by the hidden
_minimal_stats_aggregation parameter. This was introduced in Oracle 8i, not long after the initial introduction of Partitioning.
The default setting is a potentially confusing TRUE, which means that Oracle minimises automatic statistics aggregation activity
by disabling some activities. Changing this to FALSE re-enables those activities.

First of all I'll recreate TEST_TAB1 as it was at the end of Listing 1 (i.e. before any statistics have been gathered). Then I’ll add a
new partition (and, by implication, the related subpartitions) and create a separate new table that I'll load the data into.

Listing 19 – Add new partition and create load table (PARTEX_MSA.SQL)

          SQL> ALTER TABLE TEST_TAB1
            2 ADD PARTITION P_20110209 VALUES LESS THAN (20110210);

          Table altered.

          SQL> DROP TABLE LOAD_TAB1;

          Table dropped.

          SQL> CREATE TABLE LOAD_TAB1
            2 AS SELECT * FROM TEST_TAB1 WHERE 1=0;

          Table created.

          SQL> CREATE UNIQUE INDEX LOAD_TAB1_IX1 ON LOAD_TAB1
            2 (REPORTING_DATE, SOURCE_SYSTEM, SEQ_ID)
            3 NOPARALLEL COMPRESS 1;



           Index created.


For each of the different source systems, I will INSERT data to LOAD_TAB1, gather statistics on it and then exchange it with the
relevant subpartition of TEST_TAB1. (The following listing only shows the activities for the MOSCOW subpartition, but these are
repeated for the others, using the same LOAD_TAB1.)

Listing 20 – Load data and gather stats on load table and then exchange partition for MOSCOW (PARTEX_MSA.SQL)

           SQL> TRUNCATE TABLE LOAD_TAB1;

           Table truncated.

           SQL> INSERT INTO LOAD_TAB1 VALUES (20110209, 'MOSCOW', 400, 'P');

           1 row created.

           SQL> INSERT INTO LOAD_TAB1 VALUES (20110209, 'MOSCOW', 600, 'P');

           1 row created.

           SQL> INSERT INTO LOAD_TAB1 VALUES (20110209, 'MOSCOW', 900, 'Z');

           1 row created.

           SQL> exec dbms_stats.gather_table_stats('TESTUSER', 'LOAD_TAB1',
                  granularity => 'GLOBAL', method_opt => 'FOR ALL COLUMNS SIZE 1', estimate_percent => 1);

           PL/SQL procedure successfully completed.

           SQL> ALTER TABLE test_tab1 EXCHANGE SUBPARTITION P_20110209_MOSCOW WITH TABLE load_tab1;

           Table altered.

           SQL> ALTER TABLE test_tab1 MODIFY SUBPARTITION P_20110209_MOSCOW REBUILD UNUSABLE LOCAL INDEXES;

           Table altered.

Having repeated the process for the different subpartitions, all of the P_20110209 subpartitions have statistics that were
swapped in as part of the partition exchange operation.

Listing 21 – List subpartition statistics (PARTEX_MSA.SQL)

           SQL>   select table_name, subpartition_name, global_stats, last_analyzed, num_rows
             2    from user_tab_subpartitions
             3    where table_name='TEST_TAB1'
             4    and partition_name='P_20110209'
             5    order by subpartition_name;

           TABLE_NAME            SUBPARTITION_NAME           GLO    LAST_ANALYZED          NUM_ROWS
           --------------        ------------------          ---    -------------------- ----------
           TEST_TAB1             P_20110209_LONDON           NO     09-FEB-2011 17:59:17          3
           TEST_TAB1             P_20110209_MOSCOW           NO     09-FEB-2011 17:59:16          3
           TEST_TAB1             P_20110209_OTHERS           NO     09-FEB-2011 17:59:17          3
           TEST_TAB1             P_20110209_SYDNEY           NO     09-FEB-2011 17:59:17          3

The Subpartition statistics look good but we also need to check the Table and Partition statistics.

Listing 22 – List table and partition statistics (PARTEX_MSA.SQL)

           SQL> select table_name, global_stats, last_analyzed, num_rows
             2 from user_tables
             3 where table_name='TEST_TAB1';

          TABLE_NAME     GLO LAST_ANALYZED          NUM_ROWS
          -------------- --- -------------------- ----------
          TEST_TAB1      YES 09-FEB-2011 17:58:51         11

          SQL>   select table_name, partition_name, global_stats, last_analyzed, num_rows
            2    from user_tab_partitions
            3    where table_name='TEST_TAB1'
            4    order by partition_name;

          TABLE_NAME            PARTITION_NAME          GLO   LAST_ANALYZED          NUM_ROWS
          --------------        ----------------        ---   -------------------- ----------
          TEST_TAB1             P_20110131              YES   09-FEB-2011 17:58:51          0
          TEST_TAB1             P_20110201              YES   09-FEB-2011 17:58:51          4
          TEST_TAB1             P_20110202              YES   09-FEB-2011 17:58:51          2
          TEST_TAB1             P_20110203              YES   09-FEB-2011 17:58:51          1
          TEST_TAB1             P_20110204              YES   09-FEB-2011 17:58:51          2
          TEST_TAB1             P_20110205              YES   09-FEB-2011 17:58:51          1
          TEST_TAB1             P_20110206              YES   09-FEB-2011 17:58:51          1
          TEST_TAB1             P_20110207              YES   09-FEB-2011 17:58:51          0
          TEST_TAB1             P_20110209              NO

          9 rows selected.

The original global statistics on the Table are still present but don't appear to have been updated to reflect the new data. That
makes sense because, as true global statistics, they wouldn't be replaced by the aggregation process. More puzzling is the absence of
statistics on P_20110209. Global statistics were not gathered on this new partition, but all of the partition's subpartitions have
valid statistics, so I would expect the aggregation process to roll statistics up to the Partition level.

Aggregation hasn’t occurred because it has been disabled by the default setting of TRUE for _minimal_stats_aggregation. That
instructs Oracle to minimise aggregation operations, and one of the ways it does so is not to aggregate statistics as a result of a
partition exchange operation, leaving you to do that manually by gathering stats on the target table partition. If we were to
modify the parameter to a non-default value [7], we would see different behaviour. I ran the same script, but with this small
addition that changes the parameter setting at the session level.

Listing 23 – Modify _minimal_stats_aggregation (PARTEX_NON_MSA.SQL)

          SQL> -- Note - this is a non-default parameter value
          SQL> alter session set "_minimal_stats_aggregation"=FALSE;

          Session altered.

Which will change the end result to :-

Listing 24 – List table and partition statistics (PARTEX_NON_MSA.SQL)

          SQL> select table_name, global_stats, last_analyzed, num_rows
            2 from user_tables
            3 where table_name='TEST_TAB1';

          TABLE_NAME     GLO LAST_ANALYZED          NUM_ROWS
          -------------- --- -------------------- ----------
          TEST_TAB1      YES 09-FEB-2011 17:58:51         11

          SQL>
          SQL>   select table_name, partition_name, global_stats, last_analyzed, num_rows
            2    from user_tab_partitions
            3    where table_name='TEST_TAB1'
            4    order by partition_name;




          TABLE_NAME        PARTITION_NAME       GLO   LAST_ANALYZED          NUM_ROWS
          --------------    ----------------     ---   -------------------- ----------
          TEST_TAB1         P_20110131           YES   09-FEB-2011 17:58:51          0
          TEST_TAB1         P_20110201           YES   09-FEB-2011 17:58:51          4
          TEST_TAB1         P_20110202           YES   09-FEB-2011 17:58:51          2
          TEST_TAB1         P_20110203           YES   09-FEB-2011 17:58:51          1
          TEST_TAB1         P_20110204           YES   09-FEB-2011 17:58:51          2
          TEST_TAB1         P_20110205           YES   09-FEB-2011 17:58:51          1
          TEST_TAB1         P_20110206           YES   09-FEB-2011 17:58:51          1
          TEST_TAB1         P_20110207           YES   09-FEB-2011 17:58:51          0
          TEST_TAB1         P_20110209           NO    09-FEB-2011 18:11:42         12

          9 rows selected.

[7] Because _minimal_stats_aggregation is a hidden parameter, it should only really be set under the guidance of Oracle Support or at your own risk.

Note that the Table-level global statistics have still not been updated by the aggregation process because they are true global
statistics, but there are now aggregated statistics on the P_20110209 partition because all of the relevant subpartitions have
statistics. Unfortunately we have introduced a combination of aggregated and true global statistics, which is likely to invite new
problems if we don't gather true global statistics on the P_20110209 partition later.
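
A minimal sketch of that later gather, assuming the example objects (the granularity and partition name simply illustrate the point):

          -- Replace the aggregated partition statistics with properly gathered ones
          exec dbms_stats.gather_table_stats(ownname => 'TESTUSER', tabname => 'TEST_TAB1', partname => 'P_20110209', granularity => 'PARTITION');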


MISSING GLOBAL STATISTICS

Another problem was encountered in the past year when migrating one of our main databases to version 11.2.0.1 and
restructuring a large partitioned table at the same time. Our approach was to gather the statistics at the Partition level, one
partition at a time, so that when all partitions had statistics they would be aggregated up to the table level as approximate
global statistics [8]. DBMS_STATS was taking approximately three minutes to gather statistics on each partition where we'd
normally expect it to take a few seconds on the previous system. The two main symptoms were:

    1.   Many executions of the following statement.

                  SELECT /*+ rule */ bucket_cnt, row_cnt, cache_cnt, null_cnt, timestamp#, sample_size,
                         MINIMUM, maximum, distcnt, lowval, hival, density, col#, spare1, spare2, avgcln
                  FROM hist_head$
                  WHERE obj# = :1 AND intcol# = :2

    2.   DML row locks on WRI$_OPTSTAT_HISTHEAD_HISTORY.

Suspecting that the problem was caused by the table having global statistics on the original database but not on the new one, we
gathered Global Statistics using a very small sample size (less than 1%) to ensure the job would complete. Once the global
statistics were in place, partition statistics gathering took the few seconds we expected. We still had to regather Global Statistics
once all the Partition stats were in place, and then implement something to keep the Global Statistics up to date on a regular basis.
A sketch of that initial small-sample gather follows.
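
A rough sketch of that one-off, small-sample gather of Global Statistics; the table name and sample size here are illustrative
stand-ins, not the real objects involved:

          -- One-off gather of table-level (Global) statistics using a tiny sample
          exec dbms_stats.gather_table_stats(ownname => 'TESTUSER', tabname => 'BIG_PART_TAB', granularity => 'GLOBAL', estimate_percent => 0.5);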

ALTERNATIVE STRATEGIES

It should be clear by now that those responsible for maintaining statistics on VLDB are faced with a constant battle to ensure
that statistics are of a high enough quality to result in reasonable execution plans but can be gathered at a bearable cost. The
solutions described so far suffer from either being unworkable or generating low-quality, inaccurate statistics. This section looks
at alternative strategies to try to solve both problems.



[8] At this point you might wonder why we didn't just use 11g new features and gather statistics at the default granularity and sample size. We did try that several times, but gave up after many hours of running.



DYNAMIC SAMPLING

As a solution to the pervasive problem of missing statistics caused by inadequate statistics collection processes, Oracle 9.2
introduced the capability to gather statistics dynamically at the SQL statement optimisation stage. With the default
dynamic_sampling setting of 2, Oracle will perform dynamic sampling when one or more tables accessed in the SQL statement
have no statistics, including missing partition statistics. We have seen this process occur already (See ‘Missing Partition
Statistics’ above.) If Oracle is able to perform dynamic sampling in order to generate statistics for the objects being accessed,
then why bother gathering statistics at all, with all the hard work that entails?

Although, on the face of it, Dynamic Sampling might seem the ideal solution, it has disadvantages:

    1) Collection cost is now paid each time a query is optimized. Although this might appear to be a low cost relative to
       overall resource consumption of typical data warehouse queries, it is being paid repeatedly and generates a recurring
       load on the system.

    2) Because of the recurring cost, the collection needs to be fairly basic, so the quality of the resulting statistics is likely to
       be lower. For example, the number of blocks sampled at dynamic_sampling=2 is a mere 64. This can be increased (the
       statement-level hint is sketched below), which will in turn increase the collection cost.

Dynamic Sampling should be considered a last resort for situations where correct statistics can't be collected or have mistakenly
not been collected.
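
For completeness, a minimal sketch of raising the sampling level for a single statement via the DYNAMIC_SAMPLING hint (the query
and level shown are illustrative assumptions):

          -- Request level 4 dynamic sampling for this statement only, so more blocks
          -- are sampled at optimisation time than the default level of 2 would allow
          SELECT /*+ dynamic_sampling(t1 4) */ COUNT(*)
          FROM   test_tab1 t1
          WHERE  reporting_date = 20110209;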

SETTING STATISTICS

A common approach taken by some sites supporting very large databases is to avoid the time and resources spent gathering
object statistics by setting statistics directly. It is worth reconsidering why we are gathering statistics in the first place.

        To describe database objects and data to the optimiser by setting values in the data dictionary (e.g. Number of Rows,
         Number of Distinct Values, etc.)
        To help the optimiser identify the optimal execution plans

If we know the precise content of specific objects at all times (an admittedly unlikely situation), why trawl through the tables and
indexes examining the content when we could just set the relevant values in the data dictionary manually? DBMS_STATS has
several suitable procedures (e.g. GET_COLUMN_STATS, SET_COLUMN_STATS and SET_TABLE_STATS).

We could set the statistics for specific tables, indexes or partitions manually, lock the statistics and choose not to gather
statistics again for those objects if their content is unlikely to change. This is fast and consumes minimal resources.
Knowledge of your system and data is an essential requirement, although I would suggest that it's less important that the
statistics are absolutely precise than that they describe the data accurately enough to lead to optimal plans.
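
A minimal sketch of the idea, assuming the example table and one of its partitions; the NUMROWS, NUMBLKS and AVGRLEN values are
illustrative assumptions about the data the partition will eventually hold:

          -- Set partition-level statistics directly rather than gathering them
          -- (the statistics could then be locked so scheduled gathers don't overwrite them)
          exec dbms_stats.set_table_stats(ownname => 'TESTUSER', tabname => 'TEST_TAB1', partname => 'P_20110209', numrows => 1000000, numblks => 12500, avgrlen => 100);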

There are a couple of cases where this might prove particularly useful.

        We add a new partition each month and so the partition starts out empty and fills up gradually over the course of the
         month. However, this is likely to lead to fluctuating execution plans as the contents of the partition change day by day.
         The CBO is designed to calculate new plans based on changing data distribution but that also implies plan instability
         which can lead to painful, unpredictable system performance. By setting statistics manually we can 'pretend' that we
         start off the month with a full partition and have the optimiser evaluate plans based on that fixed assumption.




          We add new partitions very frequently and have very little time to gather statistics before users are likely to report
           against the data. It might prove useful to set some approximate synthetic stats quickly to get reasonable plans and
           then possibly gather more detailed statistics later when time allows. However, the change from the synthetic statistics
           to properly gathered statistics may also lead to plan instability.


SETTING STATISTICS SUMMARY

Positives

          Very fast and low resource usage method for setting statistics on new partitions.

          Potential improvements to plan stability when accessing time-period partitions that start off empty and are filled over
           time.

Negatives

          Does not eliminate the difficulty in maintaining accurate Global Statistics although these could be set directly.

          Detailed knowledge of system and data profile is essential and choosing the most appropriate values for new partitions
           may prove particularly difficult on some systems.

          Significant code needs to be developed, particularly when working with composite partitioned objects, which increases
           the possibility of human error.

COPYING STATISTICS

An extension of the concept of setting statistics for new partitions directly was introduced in Oracle 10.2.0.4 [9].
DBMS_STATS.COPY_TABLE_STATS is designed to achieve a similar effect to setting statistics manually, at a similarly low cost but
with a little more intelligence. For example, when adding a new partition for each new REPORTING_DATE on our example table,
maybe it's reasonable to assume that the data volumes and distribution of values are similar to the previous REPORTING_DATE?
If so, why not avoid setting fixed statistics based on fixed assumptions about the contents of a partition and instead copy the
statistics from the most recent partition to the new partition, or from the previous Monday or 15th of the month if the data follows
a periodic pattern? The SCALE_FACTOR parameter could also help with growing data volumes.
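
As a minimal sketch (the partition names are illustrative and the 1.1 value simply assumes roughly 10% growth between periods):

          -- Copy statistics from the previous period's partition, scaling row and
          -- block counts up for expected growth
          exec dbms_stats.copy_table_stats(ownname => 'TESTUSER', tabname => 'TEST_TAB1', srcpartname => 'P_20110210', dstpartname => 'P_20110211', scale_factor => 1.1);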

Here is an example of how this might look, based on my example table. After the initial creation of the table (listings 1 and 2),
the table, partition and subpartition statistics look like this.

Listing 25 – List table and partition statistics (DEFAULT.SQL)

           SQL> select table_name, global_stats, last_analyzed, num_rows
             2 from user_tables
             3 where table_name='TEST_TAB1';

           TABLE_NAME     GLO LAST_ANALYZED          NUM_ROWS
           -------------- --- -------------------- ----------
           TEST_TAB1      YES 11-FEB-2011 22:22:20         11

           SQL>   select table_name, partition_name, global_stats, last_analyzed, num_rows
             2    from user_tab_partitions
             3    where table_name='TEST_TAB1'
             4    order by partition_name;

           TABLE_NAME          PARTITION_NAME         GLO   LAST_ANALYZED          NUM_ROWS
           --------------      ----------------       ---   -------------------- ----------
           TEST_TAB1           P_20110131             YES   11-FEB-2011 22:22:20          0
           TEST_TAB1           P_20110201             YES   11-FEB-2011 22:22:20          4
           TEST_TAB1           P_20110202             YES   11-FEB-2011 22:22:20          2
           TEST_TAB1           P_20110203             YES   11-FEB-2011 22:22:20          1
           TEST_TAB1           P_20110204             YES   11-FEB-2011 22:22:20          2
           TEST_TAB1           P_20110205             YES   11-FEB-2011 22:22:20          1
           TEST_TAB1           P_20110206             YES   11-FEB-2011 22:22:20          1
           TEST_TAB1           P_20110207             YES   11-FEB-2011 22:22:20          0

           8 rows selected.

[9] Although there is evidence of COPY_TABLE_STATS being available in previous versions, it is not supported until 10.2.0.4.

Next I add a new partition (and the related subpartitions using the subpartition template) to contain rows with a
REPORTING_DATE of 20110211, then empty LOAD_TAB1 and insert some appropriate rows into it before using partition
exchange to load the data into the target TEST_TAB1.

Listing 26 – Add partition and load data for MOSCOW and 20110211 (COPY_STATS.SQL)

          SQL> ALTER TABLE TEST_TAB1
            2 ADD PARTITION P_20110211 VALUES LESS THAN (20110212);

          Table altered.

          SQL> TRUNCATE TABLE LOAD_TAB1;

          Table truncated.

          SQL> exec dbms_stats.delete_table_stats('TESTUSER', 'LOAD_TAB1');

          PL/SQL procedure successfully completed.

          SQL> INSERT INTO LOAD_TAB1 VALUES (20110211, 'MOSCOW', 1000, 'P');

          1 row created.

          SQL> INSERT INTO LOAD_TAB1 VALUES (20110211, 'MOSCOW', 30000, 'P');

          1 row created.

          SQL> INSERT INTO LOAD_TAB1 VALUES (20110211, 'MOSCOW', 2000, 'P');

          1 row created.

          SQL> INSERT INTO LOAD_TAB1 VALUES (20110211, 'MOSCOW', 10000, 'N');

          1 row created.

          SQL> INSERT INTO LOAD_TAB1 VALUES (20110211, 'MOSCOW', 2400, 'P');

          1 row created.

          SQL> INSERT INTO LOAD_TAB1 VALUES (20110211, 'MOSCOW', 500, 'P');

          1 row created.

          SQL> INSERT INTO LOAD_TAB1 VALUES (20110211, 'MOSCOW', 1200, 'P');

          1 row created.

          SQL> INSERT INTO LOAD_TAB1 VALUES (20110211, 'MOSCOW', 400, 'P');

          1 row created.

          SQL> INSERT INTO LOAD_TAB1 VALUES (20110211, 'MOSCOW', 600, 'P');




          1 row created.

          SQL> INSERT INTO LOAD_TAB1 VALUES (20110211, 'MOSCOW', 700, 'Z');

          1 row created.

          SQL> COMMIT;

          Commit complete.

          SQL> ALTER TABLE test_tab1 EXCHANGE SUBPARTITION P_20110211_MOSCOW WITH TABLE load_tab1;

          Table altered.

           SQL> ALTER TABLE test_tab1 MODIFY SUBPARTITION P_20110211_MOSCOW REBUILD UNUSABLE LOCAL INDEXES;

          Table altered.

There are 10 rows in the new P_20110211_MOSCOW subpartition and no data in the other P_20110211 subpartitions. Next I
copy the statistics from the P_20110201 partition into the new partition, i.e. I am performing a Partition-level copy.

Listing 27 – Copy Statistics from P_20110201 partition to P_20110211 and check results (COPY_STATS.SQL)

          SQL> exec dbms_stats.copy_table_stats(ownname => 'TESTUSER', tabname => 'TEST_TAB1',
                                srcpartname => 'P_20110201', dstpartname => 'P_20110211');

          PL/SQL procedure successfully completed.

          SQL>
          SQL> select table_name, global_stats, last_analyzed, num_rows
            2 from user_tables
            3 where table_name='TEST_TAB1';

          TABLE_NAME     GLO LAST_ANALYZED          NUM_ROWS
          -------------- --- -------------------- ----------
          TEST_TAB1      YES 12-FEB-2011 10:16:54         11

          SQL>
          SQL>   select table_name, partition_name, global_stats, last_analyzed, num_rows
            2    from user_tab_partitions
            3    where table_name='TEST_TAB1'
            4    order by partition_name;

          TABLE_NAME           PARTITION_NAME          GLO   LAST_ANALYZED          NUM_ROWS
          --------------       ----------------        ---   -------------------- ----------
          TEST_TAB1            P_20110131              YES   12-FEB-2011 10:16:54          0
          TEST_TAB1            P_20110201              YES   12-FEB-2011 10:16:54          4
          TEST_TAB1            P_20110202              YES   12-FEB-2011 10:16:54          2
          TEST_TAB1            P_20110203              YES   12-FEB-2011 10:16:54          1
          TEST_TAB1            P_20110204              YES   12-FEB-2011 10:16:54          2
          TEST_TAB1            P_20110205              YES   12-FEB-2011 10:16:54          1
          TEST_TAB1            P_20110206              YES   12-FEB-2011 10:16:54          1
          TEST_TAB1            P_20110207              YES   12-FEB-2011 10:16:54          0
          TEST_TAB1            P_20110211              YES   12-FEB-2011 10:16:54          4

          9 rows selected.

          SQL>   select table_name, subpartition_name, global_stats, last_analyzed, num_rows
            2    from user_tab_subpartitions
            3    where table_name='TEST_TAB1'
            4    and partition_name IN ('P_20110201', 'P_20110211')
            5    order by subpartition_name;

          TABLE_NAME           SUBPARTITION_NAME          GLO   LAST_ANALYZED          NUM_ROWS
          --------------       ------------------         ---   -------------------- ----------
          TEST_TAB1            P_20110201_LONDON          YES   12-FEB-2011 10:16:54          1
          TEST_TAB1            P_20110201_MOSCOW          YES   12-FEB-2011 10:16:54          1
          TEST_TAB1            P_20110201_OTHERS          YES   12-FEB-2011 10:16:54          1
          TEST_TAB1            P_20110201_SYDNEY          YES   12-FEB-2011 10:16:54          1



          TEST_TAB1          P_20110211_LONDON       NO
          TEST_TAB1          P_20110211_MOSCOW       NO
          TEST_TAB1          P_20110211_OTHERS       NO
          TEST_TAB1          P_20110211_SYDNEY       NO

          8 rows selected.

By performing a Partition-level copy of statistics from P_20110201 to P_20110211 I have new statistics on P_20110211, but no
statistics on the new Subpartitions yet. However, that is easily solved by performing four additional copies at the Subpartition
level, one for each Subpartition. Also, note that the global statistics at the Table level have not been updated, so copying stats at
the Partition and Subpartition levels does not eliminate the requirement to maintain global statistics. That is just the beginning of
the flaws in the COPY_TABLE_STATS process.


COPY_TABLE_STATS BUGS AND PATCHES

One of the barriers to more rapid uptake of COPY_TABLE_STATS has been the number of related bugs and one-off patches
required. Based on my own experience and discussions with others, this is true of DBMS_STATS in general over recent releases,
as much work has gone into improving statistics collection options, but it seems particularly relevant to COPY_TABLE_STATS. I
will assume a minimum of the 10.2.0.4 patchset with no one-off patches applied, as this is the first reasonable version to use and
the version used for the examples so far.

Subpartition issues

When I executed the partition-level copy in Listing 27 I actually cheated a little and executed the script against a 10.2.0.5
database. Had I executed it on an unpatched 10.2.0.4 database, the output would have looked like this.

Listing 28 – COPY_STATS bug number 8318020 (COPY_STATS.SQL)

          SQL> exec dbms_stats.copy_table_stats(ownname => 'TESTUSER', tabname => 'TEST_TAB1', srcpartname =>
          'P_20110201', dstpartname => 'P_20110211');
          BEGIN dbms_stats.copy_table_stats(ownname => 'TESTUSER', tabname => 'TEST_TAB1', srcpartname =>
          'P_20110201', dstpartname => 'P_20110211'); END;

          *
          ERROR at line 1:
          ORA-06533: Subscript beyond count
          ORA-06512: at "SYS.DBMS_STATS", line 17408
          ORA-06512: at line 1

This is caused by bug number 8318020 [10] and only applies to composite partitioned objects (i.e. those with subpartitions) when
copying statistics at the Partition level. To fix it, I'd recommend one of the following three options. For my examples, I used
option 2.

     1.   Apply Merge Label Request (MLR) 8866627 on top of 10.2.0.4, which also includes other related bug fixes

     2.   Apply patch set 10.2.0.5



     3.    Upgrade to 11.2.0.2

[10] The information on My Oracle Support is confusing as there are a number of apparently unrelated issues which are described as being due to bug number 8318020. Regardless, there are so many issues that it's prudent to apply suitable patches in advance rather than uncovering more!

A workaround is to copy statistics only at the Subpartition level, as in the next example, but then we would need to maintain
statistics at the Partition level using the traditional gathering process (sketched below).
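
A minimal sketch of that complementary partition-level gather, assuming the example objects (the partition name is illustrative):

          -- Maintain partition-level statistics conventionally when only subpartition
          -- statistics are being copied
          exec dbms_stats.gather_table_stats(ownname => 'TESTUSER', tabname => 'TEST_TAB1', partname => 'P_20110212', granularity => 'PARTITION');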

High/Low Partition Key Values

This is probably the best known of the statistics copying bugs in Oracle 10.2.0.4. I will add a new P_20110212 partition to the
existing table and copy statistics from the P_20110201 partition. This time I perform the copy for each of the subpartitions at
the Subpartition level to avoid the previous bug.

Listing 29 – List table and partition statistics (COPY_STATS.SQL)

           SQL> ALTER TABLE TEST_TAB1
             2 ADD PARTITION P_20110212 VALUES LESS THAN (20110213);

           Table altered.

           SQL> TRUNCATE TABLE LOAD_TAB1;

           Table truncated.

           SQL> exec dbms_stats.delete_table_stats('TESTUSER', 'LOAD_TAB1');

           PL/SQL procedure successfully completed.

           SQL> INSERT INTO LOAD_TAB1 VALUES (20110212, 'MOSCOW', 1000, 'P');

           1 row created.

           SQL> INSERT INTO LOAD_TAB1 VALUES (20110212, 'MOSCOW', 30000, 'P');

           1 row created.

           SQL> INSERT INTO LOAD_TAB1 VALUES (20110212, 'MOSCOW', 2000, 'P');

           1 row created.

           SQL> INSERT INTO LOAD_TAB1 VALUES (20110212, 'MOSCOW', 10000, 'N');

           1 row created.

           SQL> INSERT INTO LOAD_TAB1 VALUES (20110212, 'MOSCOW', 2400, 'P');

           1 row created.

           SQL> INSERT INTO LOAD_TAB1 VALUES (20110212, 'MOSCOW', 500, 'P');

           1 row created.

           SQL> INSERT INTO LOAD_TAB1 VALUES (20110212, 'MOSCOW', 1200, 'P');

           1 row created.

           SQL> INSERT INTO LOAD_TAB1 VALUES (20110212, 'MOSCOW', 400, 'P');

           1 row created.

           SQL> INSERT INTO LOAD_TAB1 VALUES (20110212, 'MOSCOW', 600, 'P');

           1 row created.

           SQL> INSERT INTO LOAD_TAB1 VALUES (20110212, 'MOSCOW', 700, 'Z');

           1 row created.




           SQL> COMMIT;

           Commit complete.

           SQL> ALTER TABLE test_tab1 EXCHANGE SUBPARTITION P_20110212_MOSCOW WITH TABLE load_tab1;

           Table altered.

           SQL> ALTER TABLE test_tab1 MODIFY SUBPARTITION P_20110212_MOSCOW REBUILD UNUSABLE LOCAL INDEXES;

           Table altered.

           SQL> exec dbms_stats.copy_table_stats(ownname => 'TESTUSER', tabname => 'TEST_TAB1', srcpartname =>
           'P_20110201_OTHERS', dstpartname => 'P_20110212_OTHERS');

           PL/SQL procedure successfully completed.

           SQL> exec dbms_stats.copy_table_stats(ownname => 'TESTUSER', tabname => 'TEST_TAB1', srcpartname =>
           'P_20110201_MOSCOW', dstpartname => 'P_20110212_MOSCOW');

           PL/SQL procedure successfully completed.

           SQL>
           SQL> exec dbms_stats.copy_table_stats(ownname => 'TESTUSER', tabname => 'TEST_TAB1', srcpartname =>
           'P_20110201_LONDON', dstpartname => 'P_20110212_LONDON');

           PL/SQL procedure successfully completed.

           SQL> exec dbms_stats.copy_table_stats(ownname => 'TESTUSER', tabname => 'TEST_TAB1', srcpartname =>
           'P_20110201_SYDNEY', dstpartname => 'P_20110212_SYDNEY');

           PL/SQL procedure successfully completed.

How do the resulting partition and subpartition statistics look?

Listing 30 – List table and partition statistics (COPY_STATS.SQL)

           SQL>   select table_name, partition_name, global_stats, last_analyzed, num_rows
             2    from user_tab_partitions
             3    where table_name='TEST_TAB1'
             4    order by partition_name;

           TABLE_NAME            PARTITION_NAME           GLO   LAST_ANAL   NUM_ROWS
           --------------        ----------------         ---   --------- ----------
           TEST_TAB1             P_20110131               YES   12-FEB-11          0
           TEST_TAB1             P_20110201               YES   12-FEB-11          4
           TEST_TAB1             P_20110202               YES   12-FEB-11          2
           TEST_TAB1             P_20110203               YES   12-FEB-11          1
           TEST_TAB1             P_20110204               YES   12-FEB-11          2
           TEST_TAB1             P_20110205               YES   12-FEB-11          1
           TEST_TAB1             P_20110206               YES   12-FEB-11          1
           TEST_TAB1             P_20110207               YES   12-FEB-11          0
           TEST_TAB1             P_20110211               YES   12-FEB-11          4
           TEST_TAB1             P_20110212               NO

           10 rows selected.

           SQL>   select table_name, subpartition_name, global_stats, last_analyzed, num_rows
             2    from user_tab_subpartitions
             3    where table_name='TEST_TAB1'
             4    and partition_name IN ('P_20110201', 'P_20110212')
             5    order by subpartition_name;

           TABLE_NAME            SUBPARTITION_NAME           GLO    LAST_ANAL   NUM_ROWS
           --------------        ------------------          ---    --------- ----------
           TEST_TAB1             P_20110201_LONDON           YES    12-FEB-11          1
           TEST_TAB1             P_20110201_MOSCOW           YES    12-FEB-11          1
           TEST_TAB1             P_20110201_OTHERS           YES    12-FEB-11          1
           TEST_TAB1             P_20110201_SYDNEY           YES    12-FEB-11          1



          TEST_TAB1            P_20110212_LONDON          YES   12-FEB-11                 1
          TEST_TAB1            P_20110212_MOSCOW          YES   12-FEB-11                 1
          TEST_TAB1            P_20110212_OTHERS          YES   12-FEB-11                 1
          TEST_TAB1            P_20110212_SYDNEY          YES   12-FEB-11                 1

          8 rows selected.

There are no statistics on the new P_20110212 partition because we didn't copy statistics at the partition level, but the
subpartition statistics appear to have been copied correctly. However, when we drill down into the column statistics for the
new subpartitions there is a clear problem.

Listing 31 – List column statistics for P_20110201 and P_20110212 subpartitions (COPY_STATS.SQL)

          SQL>   select a.column_name, a.subpartition_name, a.num_distinct,
            2      display_raw(a.low_value,b.data_type) as low_val,
            3      display_raw(a.high_value,b.data_type) as high_val
            4    from dba_subpart_col_statistics a, dba_tab_cols b
            5    where a.table_name='TEST_TAB1'
            6    and a.table_name=b.table_name
            7    and a.column_name=b.column_name
            8    and (a.subpartition_name LIKE 'P_20110201%'
            9    or a.subpartition_name LIKE 'P_20110212%')
           10    order by a.column_name, a.subpartition_name;

          COLUMN_NAME                           SUBPARTITION_NAME NUM_DISTINCT LOW_VAL               HIGH_VAL
          --------------------------            ------------------ ------------ -------------------- --------------------
          REPORTING_DATE                        P_20110201_LONDON             1 20110201             20110201
          REPORTING_DATE                        P_20110201_MOSCOW             1 20110201             20110201
          REPORTING_DATE                        P_20110201_OTHERS             1 20110201             20110201
          REPORTING_DATE                        P_20110201_SYDNEY             1 20110201             20110201
          REPORTING_DATE                        P_20110212_LONDON             1 20110201             20110201
          REPORTING_DATE                        P_20110212_MOSCOW             1 20110201             20110201
          REPORTING_DATE                        P_20110212_OTHERS             1 20110201             20110201
          REPORTING_DATE                        P_20110212_SYDNEY             1 20110201             20110201
          SEQ_ID                                P_20110201_LONDON             1 500                  500
          SEQ_ID                                P_20110201_MOSCOW             1 1000                 1000
          SEQ_ID                                P_20110201_OTHERS             1 400                  400
          SEQ_ID                                P_20110201_SYDNEY             1 700                  700
          SEQ_ID                                P_20110212_LONDON             1 500                  500
          SEQ_ID                                P_20110212_MOSCOW             1 1000                 1000
          SEQ_ID                                P_20110212_OTHERS             1 400                  400
          SEQ_ID                                P_20110212_SYDNEY             1 700                  700
          SOURCE_SYSTEM                         P_20110201_LONDON             1 LONDON               LONDON
          SOURCE_SYSTEM                         P_20110201_MOSCOW             1 MOSCOW               MOSCOW
          SOURCE_SYSTEM                         P_20110201_OTHERS             1 DALLAS               DALLAS
          SOURCE_SYSTEM                         P_20110201_SYDNEY             1 SYDNEY               SYDNEY
          SOURCE_SYSTEM                         P_20110212_LONDON             1 LONDON               LONDON
          SOURCE_SYSTEM                         P_20110212_MOSCOW             1 MOSCOW               MOSCOW
          SOURCE_SYSTEM                         P_20110212_OTHERS             1 DALLAS               DALLAS
          SOURCE_SYSTEM                         P_20110212_SYDNEY             1 SYDNEY               SYDNEY
          STATUS                                P_20110201_LONDON             1 P                    P
          STATUS                                P_20110201_MOSCOW             1 P                    P
          STATUS                                P_20110201_OTHERS             1 P                    P
          STATUS                                P_20110201_SYDNEY             1 P                    P
          STATUS                                P_20110212_LONDON             1 P                    P
          STATUS                                P_20110212_MOSCOW             1 P                    P
          STATUS                                P_20110212_OTHERS             1 P                    P
          STATUS                                P_20110212_SYDNEY             1 P                    P

          32 rows selected.

As we might expect, Oracle has merely copied the high and low values for each column from the P_20110201 subpartitions to
the equivalent P_20110212 subpartitions. How effective this is will be system- and data-dependent, and it is an important
consideration when choosing whether copying statistics is the best solution. For example, the copy works well for
SOURCE_SYSTEM, but perhaps not for the OTHERS list subpartition, which can store multiple SOURCE_SYSTEMs, if the range of
SOURCE_SYSTEM values is not static.



However, there is one area where we could reasonably expect the copy process to show a bit more intelligence: the
REPORTING_DATE rows above. Because the partition definition places a hard limit on the values of REPORTING_DATE that can be
stored in these subpartitions, Oracle has more than enough information to determine the Low and High Values of the partition
key column for the new subpartitions, yet it has simply copied those values from the P_20110201 partition.

This issue is also covered by bug 8318020 so can be addressed by any of the three methods described in the previous section.
For example, if I run the previous example on a 10.2.0.5 database, the partition key high and low values are updated
intelligently.

Listing 32 – COPY_STATS bug number 8318020 (COPY_STATS.SQL)

          SQL>   select a.column_name, a.partition_name, a.num_distinct,
            2      display_raw(a.low_value,b.data_type) as low_val,
            3      display_raw(a.high_value,b.data_type) as high_val
            4    from user_part_col_statistics a, user_tab_cols b
            5    where a.table_name='TEST_TAB1'
            6    and a.table_name=b.table_name
            7    and a.column_name=b.column_name
            8    and a.partition_name IN ('P_20110201', 'P_20110211')
            9    order by a.column_name, a.partition_name;

          COLUMN_NAME                            PARTITION_NAME   NUM_DISTINCT LOW_VAL              HIGH_VAL
          --------------------------             ---------------- ------------ -------------------- --------------------
          REPORTING_DATE                         P_20110201                  1 20110201             20110201
          REPORTING_DATE                         P_20110211                  1 20110208             20110212
          SEQ_ID                                 P_20110201                  4 400                  1000
          SEQ_ID                                 P_20110211                  4 400                  1000
          SOURCE_SYSTEM                          P_20110201                  4 DALLAS               SYDNEY
          SOURCE_SYSTEM                          P_20110211                  4 DALLAS               SYDNEY
          STATUS                                 P_20110201                  1 P                    P
          STATUS                                 P_20110211                  1 P                    P

          8 rows selected.

The observant might have noticed, however, that I have listed partition statistics here and not the subpartition statistics we’ve
been working with so far in this example. The simple reason is that, even in 10.2.0.5 and 11.2.0.1, copying subpartition statistics
does not update the high and low value for the range partition key that is implicitly defined by the parent partition. Here is a
sample of the results after copying subpartition statistics.

Listing 33 – Sample of subpartition column statistics after Subpartition-level copy (COPY_STATS.SQL)

          SQL>   select a.column_name, a.subpartition_name, a.num_distinct,
            2      display_raw(a.low_value,b.data_type) as low_val,
            3      display_raw(a.high_value,b.data_type) as high_val
            4    from dba_subpart_col_statistics a, dba_tab_cols b
            5    where a.table_name='TEST_TAB1'
            6    and a.table_name=b.table_name
            7    and a.column_name=b.column_name
            8    and (a.subpartition_name LIKE 'P_20110201%'
            9    or a.subpartition_name LIKE 'P_20110212%')
           10    order by a.column_name, a.subpartition_name;

          COLUMN_NAME                            SUBPARTITION_NAME NUM_DISTINCT LOW_VAL               HIGH_VAL
          --------------------------             ------------------ ------------ -------------------- --------------------
          REPORTING_DATE                         P_20110201_LONDON             1 20110201             20110201
          REPORTING_DATE                         P_20110201_MOSCOW             1 20110201             20110201
          REPORTING_DATE                         P_20110212_LONDON             1 20110201             20110201
          REPORTING_DATE                         P_20110212_MOSCOW             1 20110201             20110201


ORA-07445 / ORA-03113 while copying List partition statistics




The next bug occurs when copying statistics between List partitions (or, in our case, subpartitions). If I run the script from Listing
29 on an Oracle 10.2.0.5 or 11.2.0.1 database, I see different results when I copy the statistics at the Subpartition level.

Listing 34 – Copy statistics at the Subpartition level and hit ORA-03113 (COPY_STATS.SQL)

          SQL> exec dbms_stats.copy_table_stats(ownname => 'TESTUSER', tabname => 'TEST_TAB1',
                         srcpartname => 'P_20110201_MOSCOW', dstpartname => 'P_20110212_MOSCOW');

          PL/SQL procedure successfully completed.

          SQL> exec dbms_stats.copy_table_stats(ownname => 'TESTUSER', tabname => 'TEST_TAB1',
                         srcpartname => 'P_20110201_LONDON', dstpartname => 'P_20110212_LONDON');

          PL/SQL procedure successfully completed.

          SQL> exec dbms_stats.copy_table_stats(ownname => 'TESTUSER', tabname => 'TEST_TAB1',
                         srcpartname => 'P_20110201_SYDNEY', dstpartname => 'P_20110212_SYDNEY');

          PL/SQL procedure successfully completed.

          SQL> exec dbms_stats.copy_table_stats(ownname => 'TESTUSER', tabname => 'TEST_TAB1',
                         srcpartname => 'P_20110201_OTHERS', dstpartname => 'P_20110212_OTHERS');

          BEGIN dbms_stats.copy_table_stats(ownname => 'TESTUSER', tabname => 'TEST_TAB1', srcpartname =>
          'P_20110201_OTHERS', dstpartname => 'P_20110212_OTHERS'); END;

          *
          ERROR at line 1:
          ORA-03113: end-of-file on communication channel


My session is disconnected when I copy the final P_20110212 subpartition statistics, or at least that is how it appears. If I trace
the session, I can see the point of failure and that the underlying error is an ORA-07445 in qospMinMaxPartCol.

Listing 35 – COPY_STATS bug number 10268597 (COPY_STATS.SQL)

          PARSING IN CURSOR #44 len=38 dep=1 uid=0 oct=7 lid=0 tim=1242214568228618 hv=624276838 ad='a03acac0'
          delete from ind_online$ where obj#= :1
          END OF STMT
          PARSE #44:c=0,e=43,p=0,cr=0,cu=0,mis=0,r=0,dep=1,og=4,tim=1242214568228606
          EXEC #44:c=0,e=141,p=0,cr=3,cu=0,mis=0,r=0,dep=1,og=4,tim=1242214568228961
          STAT #44 id=1 cnt=0 pid=0 pos=1 obj=0 op='DELETE IND_ONLINE$ (cr=3 pr=0 pw=0 time=101 us)'
          STAT #44 id=2 cnt=0 pid=1 pos=1 obj=731 op='TABLE ACCESS FULL IND_ONLINE$ (cr=3 pr=0 pw=0 time=66
          us)'
          EXEC #7:c=0,e=42,p=0,cr=0,cu=0,mis=0,r=0,dep=1,og=1,tim=1242214568229627
          FETCH #7:c=0,e=158,p=0,cr=12,cu=0,mis=0,r=1,dep=1,og=1,tim=1242214568229851
          EXEC #40:c=0,e=167,p=0,cr=0,cu=0,mis=0,r=0,dep=1,og=1,tim=1242214568230144
          FETCH #40:c=0,e=851,p=0,cr=13,cu=0,mis=0,r=1,dep=1,og=1,tim=1242214568231118
          EXEC #38:c=0,e=43,p=0,cr=0,cu=0,mis=0,r=0,dep=1,og=1,tim=1242214568231322
          FETCH #38:c=0,e=99,p=0,cr=12,cu=0,mis=0,r=1,dep=1,og=1,tim=1242214568231540
          Exception signal: 11 (SIGSEGV), code: 1 (Address not mapped to object), addr: 0x0,
                                          PC: [0x1a1baf7, qospMinMaxPartCol()+3135]
          *** 2010-04-23 14:01:57.888
          ksedmp: internal or fatal error
          ORA-07445: exception encountered: core dump [qospMinMaxPartCol()+3135]
                                            [SIGSEGV] [Address not mapped to object] [0x000000000] [] []
          Current SQL statement for this session:
          BEGIN dbms_stats.copy_table_stats(ownname => 'TESTUSER', tabname => 'TEST_TAB1',
                                srcpartname => 'P_20100209_OTHERS', dstpartname => 'P_20100211_OTHERS'); END;
          ----- PL/SQL Call Stack -----
            object      line object
            handle    number name
          0xa0625b88       830 package body SYS.DBMS_STATS
          0xa0625b88     18292 package body SYS.DBMS_STATS
          0x97cbbe60         1 anonymous block
          ----- Call Stack Trace -----
          calling              call     entry                argument values in hex
          location             type     point                (? means dubious value)
          -------------------- -------- -------------------- ----------------------------



         ksedst()+31                call       ksedst1()                000000000 ? 000000001 ?
                                                                        2B3BC0FACD50 ? 2B3BC0FACDB0       ?
                                                                        2B3BC0FACCF0 ? 000000000 ?
         ksedmp()+610               call       ksedst()                 000000000 ? 000000001 ?
                                                                        2B3BC0FACD50 ? 2B3BC0FACDB0       ?
                                                                        2B3BC0FACCF0 ? 000000000 ?
         ssexhd()+629               call       ksedmp()                 000000003 ? 000000001 ?
                                                                        2B3BC0FACD50 ? 2B3BC0FACDB0       ?
                                                                        2B3BC0FACCF0 ? 000000000 ?
         __restore_rt()+0           call       ssexhd()                 00000000B ? 2B3BC0FADD70 ?
                                                                        2B3BC0FADC40 ? 2B3BC0FACDB0       ?
                                                                        2B3BC0FACCF0 ? 000000000 ?
         qospMinMaxPartCol()        signal     __restore_rt()           0068986E0 ? 000000000 ?
         +3135                                                          000000001 ? 000000000 ?
                                                                        000000000 ? 0000000C2 ?
         spefcmpa()+687             call       qospMinMaxPartCol()      7FFF31FFEDA0 ? 2B3BC13AABB0       ?
                                                                        000000000 ? 2B3BC13AAB70 ?
                                                                        000000000 ? 2B3BC13AAB30 ?
         spefmccallstd()+218        call       spefcmpa()               2B3BC13AAA90 ? 2B3BC13AAAE0       ?
                                                                        2B3BC129E500 ? 2B3BC13AAB70       ?
                                                                        000000000 ? 2B3BC13AAB30 ?
         pextproc()+41              call       spefmccallstd()          7FFF31FFEE00 ? 7FFF31FFEA80       ?


When I investigated whether this was still an error in 11.2 so that I might file a Service Request with Oracle Support, I soon
discovered that the real reason for this is not an error in a background automatic aggregation process (as I initially suspected,
because it occurs when the final subpartition statistics are copied) but the fact that the P_20110212_OTHERS subpartition is a
DEFAULT list subpartition. Recall our Subpartition Template:

         SUBPARTITION TEMPLATE
           (SUBPARTITION MOSCOW       VALUES   ('MOSCOW')   TABLESPACE   TEST_DATA,
            SUBPARTITION LONDON       VALUES   ('LONDON')   TABLESPACE   TEST_DATA,
            SUBPARTITION SYDNEY       VALUES   ('SYDNEY')   TABLESPACE   TEST_DATA,
            SUBPARTITION OTHERS       VALUES   (DEFAULT)    TABLESPACE   TEST_DATA)

This is bug number 10268597 - ORA-7445 (qospMinMaxPartCol) [ID 1288540.1] in 10.2.0.5 and 11.2.0.2. It is marked as fixed in
11.2.0.3 and 12.1.0.0, so all current installations will encounter this bug unless Oracle provides a backport one-off patch.


COPY_TABLE_STATS SUMMARY

After my initial excitement about the introduction of COPY_TABLE_STATS as a method to simplify the copying of statistics to
future partitions and add some intelligence to the process, I find myself underwhelmed.

Positives

           Very fast and low resource method for setting statistics on new partitions

        Potential improvements to plan stability when accessing time-period partitions that are filled over time.

Negatives

        Many bugs and related patches although this can be largely resolved by using 10.2.0.5 or 11.2

        Does not eliminate the difficulty in maintaining accurate Global Statistics. Whilst copying statistics is not designed to
         solve that problem, collecting partition and subpartition statistics is a much smaller part of the overall statistics
         maintenance challenge because the volume of data accessed to maintain statistics on individual partitions is much
         smaller and does not tend to grow over time as much as full table scans to gather global statistics.




         Does not work well with composite partitioned tables. Regardless of whether Oracle feel people should use
          subpartition statistics or not, why support copying statistics between subpartitions whilst

                o    Not setting the High and Low Values of the parent Range partition key columns.

                o    Not copying subpartition statistics when the parent partition statistics are copied or at least offering the
                     option

         Does not work in current releases with List Partitioning where there is a DEFAULT partition

APPROX_GLOBAL AND PARTITION

Although COPY_TABLE_STATS does not help with Global Statistics maintenance, Oracle introduced another new feature in 10.2
that might. The new GRANULARITY option is ‘APPROX_GLOBAL AND PARTITION’, which is similar to ‘GLOBAL AND PARTITION’
but uses the aggregation process to generate Approximate Global Statistics and therefore does not need to perform a full table
scan. Note, however, that it does perform full scans of any indexes to maintain true global statistics on them. There are several
interesting notes in the DBMS_STATS package specification.

--        'APPROX_GLOBAL AND PARTITION' - This option is similar to
--           'GLOBAL AND PARTITION'. But the global statistics are aggregated
--            from partition level statistics. It will aggregate all statistics
--            except number of distinct values for columns and number of distinct
--            keys of indexes.
--            The existing histograms of the columns at the table level
--            are also aggregated.The global statistics are gathered
--            (i.e., going back to GLOBAL AND PARTITION behaviour)
--            if partname argument is null or if the aggregation cannot be done
--            e.g., statistics of one of the partitions is missing.
--            This option is useful when you collect statistics for a new
--            partition added into a range partitioned table (for example,
--            a table partitioned by month).
--            The new data in the partition makes the global statistics stale
--            (especially the min/max values of the partitioning column).
--            This stale global statistics may cause suboptimal plans.
--            In this scenario, users can collect statistics for the newly added
--            partition with 'APPROX_GLOBAL AND PARTITION' option so that the
--            global statistics will reflect the newly added range.
--            This option will take less time than 'GLOBAL AND PARTITION' option
--            since the global statistics are aggregated from underlying
--            partition level statistics.
--            Note that, if you are using APPROX_GLOBAL AND PARTITION,
--            you still need to collect global statistics (with granularity =
--            'GLOBAL' option)
--            when there is substantial amount of change at the table level.
--            For example you added 10% more data to the table.
--            This is needed to get the correct number of distinct values/keys
--            statistic at table level.

I’ll be running the rest of the examples against an Oracle 11.2.0.1 database, where APPROX_GLOBAL AND PARTITION is a valid
option although, for reasons that will become apparent, this option is only an appropriate solution for Oracle 10.2 databases.

First the table is truncated, rows inserted for reporting_date 20110201 and statistics gathered using the default GRANULARITY
=> ‘ALL’ which will result in true Global Statistics on all levels of TEST_TAB1.

Listing 36 – List table and partition statistics (APPROX_GLOBAL2.SQL)

          SQL> TRUNCATE TABLE TEST_TAB1;

          Table truncated.

          SQL> exec dbms_stats.delete_table_stats('TESTUSER', 'TEST_TAB1')



       PL/SQL procedure successfully completed.

       SQL> INSERT INTO TEST_TAB1 VALUES (20110201, 'MOSCOW', 1000, 'P');

       1 row created.

       SQL> INSERT INTO TEST_TAB1 VALUES (20110201, 'MOSCOW', 2000, 'U');

       1 row created.

       SQL> INSERT INTO TEST_TAB1 VALUES (20110201, 'MOSCOW', 4000, 'P');

       1 row created.

       SQL> INSERT INTO TEST_TAB1 VALUES (20110201, 'MOSCOW', 60000, 'P');

       1 row created.

       SQL> INSERT INTO TEST_TAB1 VALUES (20110201, 'MOSCOW', 100, 'P');

       1 row created.

       SQL> INSERT INTO TEST_TAB1 VALUES (20110201, 'MOSCOW', 10, 'P');

       1 row created.

       SQL> INSERT INTO TEST_TAB1 VALUES (20110201, 'MOSCOW', 300, 'P');

       1 row created.

       SQL> INSERT INTO TEST_TAB1 VALUES (20110201, 'LONDON', 5000, 'U');

       1 row created.

       SQL> INSERT INTO TEST_TAB1 VALUES (20110201, 'DALLAS', 6000, 'P');

       1 row created.

       SQL> INSERT INTO TEST_TAB1 VALUES (20110201, 'SYDNEY', 7000, 'P');

       1 row created.

       SQL> INSERT INTO TEST_TAB1 VALUES (20110201, 'MOSCOW', 8000, 'P');

       1 row created.

       SQL> COMMIT;

       Commit complete.

       SQL> exec dbms_stats.gather_table_stats(ownname => 'TESTUSER', tabname => 'TEST_TAB1');

       PL/SQL procedure successfully completed.

       SQL> select table_name, global_stats, last_analyzed, num_rows
         2 from user_tables
         3 where table_name='TEST_TAB1';

       TABLE_NAME     GLO LAST_ANALYZED          NUM_ROWS
       -------------- --- -------------------- ----------
       TEST_TAB1      YES 06-FEB-2011 19:26:50         11

       SQL>    select table_name, partition_name, global_stats, last_analyzed, num_rows
         2     from user_tab_partitions
         3     where table_name='TEST_TAB1'
         4     order by partition_name;

       TABLE_NAME     PARTITION_NAME   GLO LAST_ANALYZED          NUM_ROWS
       -------------- ---------------- --- -------------------- ----------
       TEST_TAB1      P_20110131       YES 06-FEB-2011 19:26:50          0



          TEST_TAB1             P_20110201               YES   06-FEB-2011   19:26:50      11
          TEST_TAB1             P_20110202               YES   06-FEB-2011   19:26:50       0
          TEST_TAB1             P_20110203               YES   06-FEB-2011   19:26:50       0
          TEST_TAB1             P_20110204               YES   06-FEB-2011   19:26:50       0
          TEST_TAB1             P_20110205               YES   06-FEB-2011   19:26:50       0
          TEST_TAB1             P_20110206               YES   06-FEB-2011   19:26:50       0
          TEST_TAB1             P_20110207               YES   06-FEB-2011   19:26:50       0
          TEST_TAB1             P_20110209               YES   06-FEB-2011   19:26:50       0

          9 rows selected.

          SQL>    select table_name, subpartition_name, global_stats, last_analyzed, num_rows
            2     from user_tab_subpartitions
            3     where table_name='TEST_TAB1'
            4     and partition_name='P_20110201'
            5     order by subpartition_name;

          TABLE_NAME            SUBPARTITION_NAME          GLO   LAST_ANALYZED          NUM_ROWS
          --------------        ------------------         ---   -------------------- ----------
          TEST_TAB1             P_20110201_LONDON          YES   06-FEB-2011 19:26:49          1
          TEST_TAB1             P_20110201_MOSCOW          YES   06-FEB-2011 19:26:49          8
          TEST_TAB1             P_20110201_OTHERS          YES   06-FEB-2011 19:26:50          1
          TEST_TAB1             P_20110201_SYDNEY          YES   06-FEB-2011 19:26:50          1

          SQL>    select a.column_name, a.num_distinct,
            2       display_raw(a.low_value,b.data_type) as low_val,
            3       display_raw(a.high_value,b.data_type) as high_val, b.data_type
            4     from user_tab_col_statistics a, user_tab_cols b
            5     where a.table_name='TEST_TAB1'
            6     and   a.table_name=b.table_name
            7     and   a.column_name=b.column_name
            8     order by a.column_name;

          COLUMN_NAME                NUM_DISTINCT LOW_VAL              HIGH_VAL                               DATA_TYPE
          -------------------------- ------------ -------------------- --------------------                   -------------
          REPORTING_DATE                        1 20110201             20110201                               NUMBER
          SEQ_ID                               11 10                   60000                                  NUMBER
          SOURCE_SYSTEM                         4 DALLAS               SYDNEY                                 VARCHAR2
          STATUS                                2 P                    U                                      VARCHAR2

The statistics all look as we might expect so next I’ll insert rows for the new reporting_date 20110202 and gather statistics using
the new GRANULARITY => ‘APPROX_GLOBAL AND PARTITION’ for the single P_20110202 partition.

Listing 37 – List table and partition statistics (APPROX_GLOBAL2.SQL)

          SQL> INSERT INTO TEST_TAB1 VALUES (20110202, 'CHICAGO', 1000, 'P');

          1 row created.

          SQL> INSERT INTO TEST_TAB1 VALUES (20110202, 'CHICAGO', 2000, 'U');

          1 row created.

          SQL> INSERT INTO TEST_TAB1 VALUES (20110202, 'CHICAGO', 4000, 'P');

          1 row created.

          SQL> INSERT INTO TEST_TAB1 VALUES (20110202, 'CHICAGO', 60000, 'N');

          1 row created.

          SQL> INSERT INTO TEST_TAB1 VALUES (20110202, 'CHICAGO', 100, 'P');

          1 row created.

          SQL> INSERT INTO TEST_TAB1 VALUES (20110202, 'CHICAGO', 10, 'N');

          1 row created.

          SQL> INSERT INTO TEST_TAB1 VALUES (20110202, 'CHICAGO', 300, 'N');



          1 row created.

          SQL> INSERT INTO TEST_TAB1 VALUES (20110202, 'CHICAGO', 5000, 'U');

          1 row created.

          SQL> INSERT INTO TEST_TAB1 VALUES (20110202, 'CHICAGO', 6000, 'P');

          1 row created.

          SQL> INSERT INTO TEST_TAB1 VALUES (20110202, 'CHICAGO', 7000, 'P');

          1 row created.

          SQL> INSERT INTO TEST_TAB1 VALUES (20110202, 'CHICAGO', 8000, 'P');

          1 row created.

          SQL> COMMIT;

          Commit complete.

          SQL> exec dbms_stats.gather_table_stats(ownname => 'TESTUSER', tabname => 'TEST_TAB1',
                         granularity => 'APPROX_GLOBAL AND PARTITION', partname => 'P_20110202');

          PL/SQL procedure successfully completed.


Looking at the updated statistics below, it would appear that the table-level row count has been updated accurately and the
statistics on the P_20110202 partition look correct.

Listing 38 – List table and partition statistics (APPROX_GLOBAL2.SQL)

          SQL> select table_name, global_stats, last_analyzed, num_rows
            2 from user_tables
            3 where table_name='TEST_TAB1';

          TABLE_NAME     GLO LAST_ANALYZED          NUM_ROWS
          -------------- --- -------------------- ----------
          TEST_TAB1      YES 06-FEB-2011 19:26:50         22

          SQL>    select table_name, partition_name, global_stats, last_analyzed, num_rows
            2     from user_tab_partitions
            3     where table_name='TEST_TAB1'
            4     order by partition_name;

          TABLE_NAME            PARTITION_NAME           GLO     LAST_ANALYZED          NUM_ROWS
          --------------        ----------------         ---     -------------------- ----------
          TEST_TAB1             P_20110131               YES     06-FEB-2011 19:26:50          0
          TEST_TAB1             P_20110201               YES     06-FEB-2011 19:26:50         11
          TEST_TAB1             P_20110202               YES     06-FEB-2011 19:26:50         11
          TEST_TAB1             P_20110203               YES     06-FEB-2011 19:26:50          0
          TEST_TAB1             P_20110204               YES     06-FEB-2011 19:26:50          0
          TEST_TAB1             P_20110205               YES     06-FEB-2011 19:26:50          0
          TEST_TAB1             P_20110206               YES     06-FEB-2011 19:26:50          0
          TEST_TAB1             P_20110207               YES     06-FEB-2011 19:26:50          0
          TEST_TAB1             P_20110209               YES     06-FEB-2011 19:26:50          0

          9 rows selected.


However, as might be expected from the parameter value passed, subpartition statistics have not been gathered, just as they
would not have been with GRANULARITY=>’GLOBAL AND PARTITION’ used to gather partition statistics and true global
statistics.

Listing 39 – List subpartition statistics (APPROX_GLOBAL2.SQL)




          SQL>   select table_name, subpartition_name, global_stats, last_analyzed, num_rows
            2    from user_tab_subpartitions
            3    where table_name='TEST_TAB1'
            4    and partition_name='P_20110202'
            5    order by subpartition_name;

          TABLE_NAME           SUBPARTITION_NAME         GLO   LAST_ANALYZED          NUM_ROWS
          --------------       ------------------        ---   -------------------- ----------
          TEST_TAB1            P_20110202_LONDON         YES   06-FEB-2011 19:26:50          0
          TEST_TAB1            P_20110202_MOSCOW         YES   06-FEB-2011 19:26:50          0
          TEST_TAB1            P_20110202_OTHERS         YES   06-FEB-2011 19:26:50          0
          TEST_TAB1            P_20110202_SYDNEY         YES   06-FEB-2011 19:26:50          0


The statistics on the subpartitions are those gathered by the initial GRANULARITY=>’ALL’ default and show zero rows. If we are
working with composite partitioned objects and have decided that subpartition statistics are valuable (or, as in this case, Oracle
has gathered them via GRANULARITY => ‘ALL’), then we need to make additional calls to DBMS_STATS with GRANULARITY =>
‘SUBPARTITION’ for each of the underlying subpartitions we’ve just loaded with data. This seems like an acceptable overhead to
maintain the global statistics at the higher levels of the object. A sketch of such a subpartition-level gather follows and then, as
always, we check the column statistics to verify the quality of those statistics.
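
This is only a minimal sketch, assuming the TESTUSER.TEST_TAB1 example objects and that the subpartition name (here
P_20110202_OTHERS, where the CHICAGO rows land) can be passed in the PARTNAME parameter while GRANULARITY is set to
‘SUBPARTITION’:

          BEGIN
            -- Gather statistics for one newly-loaded subpartition only
            dbms_stats.gather_table_stats(ownname     => 'TESTUSER',
                                          tabname     => 'TEST_TAB1',
                                          partname    => 'P_20110202_OTHERS',
                                          granularity => 'SUBPARTITION');
          END;
          /

One such call per loaded subpartition keeps the subpartition-level statistics current without touching the rest of the table.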

Listing 40 – List approximate global column statistics (APPROX_GLOBAL2.SQL)

          SQL>   select a.column_name, a.num_distinct,
            2      display_raw(a.low_value,b.data_type) as low_val,
            3      display_raw(a.high_value,b.data_type) as high_val, b.data_type
            4    from user_tab_col_statistics a, user_tab_cols b
            5    where a.table_name='TEST_TAB1'
            6    and   a.table_name=b.table_name
            7    and   a.column_name=b.column_name
            8    order by a.column_name;

          COLUMN_NAME                NUM_DISTINCT LOW_VAL              HIGH_VAL                            DATA_TYPE
          -------------------------- ------------ -------------------- --------------------                -------------
          REPORTING_DATE                        2 20110201             20110202                            NUMBER
          SEQ_ID                               11 10                   60000                               NUMBER
          SOURCE_SYSTEM                         4 CHICAGO              SYDNEY                              VARCHAR2
          STATUS                                2 N                    U                                   VARCHAR2

Although the column High and Low values have been updated correctly, the NDVs are incorrect again. SOURCE_SYSTEM should
show NUM_DISTINCT of 5 and STATUS now has three distinct values. These Approximate Global Statistics suffer precisely the
same issue as those generated by the aggregation process: row counts and High/Low values are correct, but Oracle simply can’t
derive global NDVs accurately from the NDVs of the individual partitions.


APPROX_GLOBAL AND PARTITION SUMMARY

What, then, does APPROX_GLOBAL AND PARTITION actually deliver? Very little on the face of it, although it does combine
better with an occasional refresh of stale global statistics because it avoids the aggregation issues, highlighted earlier, that arise
when true global statistics are combined with aggregated statistics. It should be clear that APPROX_GLOBAL AND PARTITION
used in isolation does not solve the challenge of maintaining accurate global statistics.

11G IMPROVEMENTS

INCREMENTAL STATISTICS




Recognising that maintaining Global Statistics is the most formidable challenge for those managing large partitioned objects,
Oracle has implemented a more efficient solution in Oracle 11g – Incremental Statistics. The difficulty in maintaining accurate
global statistics is that the default process is

      i.         Perform full table scan to gather global statistics.
     ii.         Gather statistics on modified partitions.

As tables grow to hundreds of millions or billions of rows, step i. becomes so expensive as to be unworkable and, as we have seen,
the aggregation process to generate Approximate Global Statistics yields very inaccurate NDVs. Oracle 11g resolves these issues
by storing synopses in the data dictionary describing the distinct column values contained in the various levels of the table (e.g.
Table and Partition). Synopses can be merged to generate aggregated synopses at the higher levels of the object so that the
process for regathering statistics for new partitions becomes a much more efficient

       i.        Gather statistics and create synopses for the changed partitions only
      ii.        Oracle automatically merges partition level synopses into a global synopsis
     iii.        The global statistics are automatically derived from the partition level statistics and global synopses

Storing and processing of synopses is controlled via the INCREMENTAL table preference. This is set to ‘FALSE’ by default so the
database uses the historic approach of performing a full table scan to maintain global statistics. To use incremental statistics, all
of the following have to be true (a quick way of checking the current values is sketched after this list).

            1.   The INCREMENTAL value for the partitioned table is TRUE.
            2.   The PUBLISH value for the partitioned table is the default TRUE.
            3.   The user specifies AUTO_SAMPLE_SIZE for ESTIMATE_PERCENT and AUTO for GRANULARITY when gathering statistics
                 on the table. (Note that these are the default values.)
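
As a minimal sketch, assuming the TESTUSER.TEST_TAB1 example objects and the 11g DBMS_STATS.GET_PREFS function, the
relevant preference values can be checked with a query like the following; if any of them differ from the requirements above,
the gather falls back to the non-incremental behaviour.

          SELECT dbms_stats.get_prefs('INCREMENTAL',      'TESTUSER', 'TEST_TAB1') AS incremental,
                 dbms_stats.get_prefs('PUBLISH',          'TESTUSER', 'TEST_TAB1') AS publish,
                 dbms_stats.get_prefs('ESTIMATE_PERCENT', 'TESTUSER', 'TEST_TAB1') AS estimate_percent,
                 dbms_stats.get_prefs('GRANULARITY',      'TESTUSER', 'TEST_TAB1') AS granularity
          FROM   dual;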

Requirement 3 is likely to strike fear into the experienced Oracle heart because Oracle’s earlier attempts at AUTO_SAMPLE_SIZE
have typically been time-consuming, resource-intensive and difficult to predict. However, 11g uses one-pass distinct sampling,
which is described in great detail in Amit Poddar’s excellent paper (see Appendix A) and which dramatically improves the
sampling process so that much higher percentages can be sampled at a much lower cost. It has become sensible to leave
ESTIMATE_PERCENT at its default setting of AUTO_SAMPLE_SIZE unless faced with extreme performance problems gathering
statistics.

First I will set the INCREMENTAL table preference for TEST_TAB1, repopulate it with some initial data and gather statistics using
default values for all other options.

Listing 41 – Modify TEST_TAB1 to use incremental statistics, repopulate and gather statistics (INCREMENTAL.SQL)

                 SQL> TRUNCATE TABLE TEST_TAB1;

                 Table truncated.

                 SQL> exec dbms_stats.delete_table_stats('TESTUSER', 'TEST_TAB1')

                 PL/SQL procedure successfully completed.

                 SQL> exec dbms_stats.set_table_prefs('TESTUSER','TEST_TAB1','INCREMENTAL','TRUE')

                 PL/SQL procedure successfully completed.

                 SQL> INSERT INTO TEST_TAB1 VALUES (20110201, 'MOSCOW', 1000, 'P');




       1 row created.

       SQL> INSERT INTO TEST_TAB1 VALUES (20110201, 'MOSCOW', 2000, 'U');

       1 row created.

       SQL> INSERT INTO TEST_TAB1 VALUES (20110201, 'MOSCOW', 4000, 'P');

       1 row created.

       SQL> INSERT INTO TEST_TAB1 VALUES (20110201, 'MOSCOW', 60000, 'P');

       1 row created.

       SQL> INSERT INTO TEST_TAB1 VALUES (20110201, 'MOSCOW', 100, 'P');

       1 row created.

       SQL> INSERT INTO TEST_TAB1 VALUES (20110201, 'MOSCOW', 10, 'P');

       1 row created.

       SQL> INSERT INTO TEST_TAB1 VALUES (20110201, 'MOSCOW', 300, 'P');

       1 row created.

       SQL> INSERT INTO TEST_TAB1 VALUES (20110201, 'LONDON', 5000, 'U');

       1 row created.

       SQL> INSERT INTO TEST_TAB1 VALUES (20110201, 'DALLAS', 6000, 'P');

       1 row created.

       SQL> INSERT INTO TEST_TAB1 VALUES (20110201, 'SYDNEY', 7000, 'P');

       1 row created.

       SQL> INSERT INTO TEST_TAB1 VALUES (20110201, 'MOSCOW', 8000, 'P');

       1 row created.

       SQL> COMMIT;

       Commit complete.

       SQL> exec dbms_stats.gather_table_stats(ownname => 'TESTUSER', tabname => 'TEST_TAB1');

       PL/SQL procedure successfully completed.

       SQL> select table_name, global_stats, last_analyzed, num_rows
         2 from user_tables
         3 where table_name='TEST_TAB1';

       TABLE_NAME     GLO LAST_ANALYZED          NUM_ROWS
       -------------- --- -------------------- ----------
       TEST_TAB1      YES 13-FEB-2011 11:31:06         11

       SQL>    select a.column_name, a.num_distinct,
         2       display_raw(a.low_value,b.data_type) as low_val,
         3       display_raw(a.high_value,b.data_type) as high_val, b.data_type
         4     from user_tab_col_statistics a, user_tab_cols b
         5     where a.table_name='TEST_TAB1'
         6     and   a.table_name=b.table_name
         7     and   a.column_name=b.column_name
         8     order by a.column_name;

       COLUMN_NAME                NUM_DISTINCT LOW_VAL              HIGH_VAL                DATA_TYPE
       -------------------------- ------------ -------------------- --------------------    -------------------
       REPORTING_DATE                        1 20110201             20110201                NUMBER
       SEQ_ID                               11 10                   60000                   NUMBER



        SOURCE_SYSTEM                         4 DALLAS               SYDNEY                  VARCHAR2
        STATUS                                2 P                    U                       VARCHAR2

As expected, the statistics are global and the column statistics look reasonable but the real challenge arrives as we start to add
data to the P_20110202 partition.

Listing 42 – Add 20110202 data, regather statistics and list Table-level statistics (INCREMENTAL.SQL)

          SQL> INSERT INTO TEST_TAB1 VALUES (20110202, 'CHICAGO', 1000, 'P');

          1 row created.

          SQL> INSERT INTO TEST_TAB1 VALUES (20110202, 'CHICAGO', 2000, 'U');

          1 row created.

          SQL> INSERT INTO TEST_TAB1 VALUES (20110202, 'CHICAGO', 4000, 'P');

          1 row created.

          SQL> INSERT INTO TEST_TAB1 VALUES (20110202, 'CHICAGO', 60000, 'N');

          1 row created.

          SQL> INSERT INTO TEST_TAB1 VALUES (20110202, 'CHICAGO', 100, 'P');

          1 row created.

          SQL> INSERT INTO TEST_TAB1 VALUES (20110202, 'CHICAGO', 10, 'N');

          1 row created.

          SQL> INSERT INTO TEST_TAB1 VALUES (20110202, 'CHICAGO', 300, 'N');

          1 row created.

          SQL> INSERT INTO TEST_TAB1 VALUES (20110202, 'CHICAGO', 5000, 'U');

          1 row created.

          SQL> INSERT INTO TEST_TAB1 VALUES (20110202, 'CHICAGO', 6000, 'P');

          1 row created.

          SQL> INSERT INTO TEST_TAB1 VALUES (20110202, 'CHICAGO', 7000, 'P');

          1 row created.

          SQL> INSERT INTO TEST_TAB1 VALUES (20110202, 'CHICAGO', 8000, 'P');

          1 row created.

          SQL> COMMIT;

          Commit complete.

          SQL> exec dbms_stats.gather_table_stats(ownname => 'TESTUSER', tabname => 'TEST_TAB1');

          PL/SQL procedure successfully completed.

          SQL>    select table_name, global_stats, last_analyzed, num_rows
            2     from dba_tables
            3     where table_name='TEST_TAB1'
            4     and owner='TESTUSER'
            5     order by 1, 2, 4 desc nulls last;

          TABLE_NAME     GLO LAST_ANALYZED          NUM_ROWS
          -------------- --- -------------------- ----------
          TEST_TAB1      YES 13-FEB-2011 11:31:07         22




         SQL>   select a.column_name, a.num_distinct,
           2      display_raw(a.low_value,b.data_type) as low_val,
           3      display_raw(a.high_value,b.data_type) as high_val, b.data_type
           4    from user_tab_col_statistics a, user_tab_cols b
           5    where a.table_name='TEST_TAB1'
           6    and   a.table_name=b.table_name
           7    and   a.column_name=b.column_name
           8    order by a.column_name;

          COLUMN_NAME                NUM_DISTINCT LOW_VAL              HIGH_VAL             DATA_TYPE
          -------------------------- ------------ -------------------- -------------------- --------------------
          REPORTING_DATE                        2 20110201             20110202             NUMBER
          SEQ_ID                               11 10                   60000                NUMBER
          SOURCE_SYSTEM                         5 CHICAGO              SYDNEY               VARCHAR2
          STATUS                                3 N                    U                    VARCHAR2

Pleasingly, the NDVs finally look accurate and repeating the process for further partitions illustrates that all of the Global
Statistics are being maintained accurately at a very low cost (for a nice illustration of just how effectively the costs are reduced,
see Greg Rahn’s blog post on Incremental Statistics at
http://structureddata.org/2008/07/16/oracle-11g-incremental-global-statistics-on-partitioned-tables/).




Listing 43 – Global Statistics with P_20110203 and P_20110204 data versus true contents (INCREMENTAL.SQL)

          SQL> select table_name, global_stats, last_analyzed, num_rows
            2 from user_tables
            3 where table_name='TEST_TAB1';

          TABLE_NAME     GLO LAST_ANALYZED          NUM_ROWS
          -------------- --- -------------------- ----------
          TEST_TAB1      YES 13-FEB-2011 11:31:08         44

          SQL>   select a.column_name, a.num_distinct,
            2      display_raw(a.low_value,b.data_type) as low_val,
            3      display_raw(a.high_value,b.data_type) as high_val, b.data_type
            4    from user_tab_col_statistics a, user_tab_cols b
            5    where a.table_name='TEST_TAB1'
            6    and   a.table_name=b.table_name
            7    and   a.column_name=b.column_name
            8    order by a.column_name;

          COLUMN_NAME                NUM_DISTINCT LOW_VAL              HIGH_VAL             DATA_TYPE
          -------------------------- ------------ -------------------- -------------------- --------------------
          REPORTING_DATE                        4 20110201             20110204             NUMBER
          SEQ_ID                               14 1                    80000                NUMBER
          SOURCE_SYSTEM                         5 CHICAGO              SYDNEY               VARCHAR2
          STATUS                                4 N                    Z                    VARCHAR2

          SQL> select reporting_date, count(*) from test_tab1 group by reporting_date;

          REPORTING_DATE   COUNT(*)
          -------------- ----------
                20110201         11
                20110202         11
                20110203         11
                20110204         11

          4 rows selected.

          SQL> select source_system, count(*) from test_tab1 group by source_system;

          SOURCE_SYSTEM                    COUNT(*)
          ------------------------------ ----------
          MOSCOW                                  8
          LONDON                                  1
          SYDNEY                                  1
          CHICAGO                                22
          DALLAS                                 12

          5 rows selected.

          SQL> select status, count(*) from test_tab1 group by status;

          S   COUNT(*)
          - ----------
          U          8
          P         27
          N          7
          Z          2

          4 rows selected.

          SQL> select seq_id, count(*) from test_tab1 group by seq_id;

              SEQ_ID   COUNT(*)
          ---------- ----------
                 100          4
                1000          4
                4000          4
                5000          4
                   1          1
                 400          1



               60000            3
                 300            4
                2000            4
               80000            1
                8000            3
                6000            4
                  10            4
                7000            3

        14 rows selected.

The combination of Incremental Statistics and One-pass Distinct Sampling appears to deliver on the promise of a solution to
low-cost global statistics maintenance for large partitioned objects that receive regular updates. However, there are a number
of issues to be aware of before using Incremental Statistics.


INCREMENTAL STATISTICS BUGS, ISSUES AND USAGE

SYSAUX tablespace usage

The documentation states clearly that the synopsis information will consume space in the SYSAUX tablespace but this could
mean very large data volumes. For example, bug 8411774 was filed for a customer who was concerned that 6GB of their 12GB
SYSAUX Tablespace was being consumed by synopses (note that this is not the same as the issue of optimiser statistics history
consuming space in the SYSAUX tablespace, which can be resolved by reducing the default statistics history retention period of
31 days). Synopses are stored in two main tables - WRI$_OPTSTAT_SYNOPSIS_HEAD$ and WRI$_OPTSTAT_SYNOPSIS$. Oracle’s
response to the bug is ‘This is not a bug in case INCREMENTAL is set to TRUE on partitioned tables. It works as designed’.

We can investigate the contents of these tables to see what is being stored using a query courtesy of Robin Moffat and his blog
post describing a significant Incremental Statistics performance issue at
http://rnm1978.wordpress.com/2010/12/31/data-warehousing-and-statistics-in-oracle-11g-incremental-global-statistics/




Listing 44 – Viewing Synopses

          SQL> SELECT u.NAME                                     "owner",
                 o.NAME                                          "table_name",
                 p.subname                                       "partition_name",
                 c.NAME                                          "column_name",
                 to_char(h.analyzetime, 'YYYY-MM-DD-HH24:MI:SS') "analyse_Time",
                 COUNT(*)                                        "hash entries"
          FROM   sys.wri$_optstat_synopsis_head$ h left join sys.wri$_optstat_synopsis$ s
                                        ON h.synopsis# = s.synopsis#
                 left join sys.obj$ o   ON h.bo# = o.obj#
                 left join sys.user$ u  ON o.owner# = u.user#
                 left join sys.col$ c   ON h.bo# = c.obj# AND h.intcol# = c.intcol#
                 left join (SELECT bo#, obj# FROM   sys.tabpart$
                            UNION ALL
                            SELECT bo#, obj# FROM   sys.tabcompart$) tp
                                        ON h.bo# = tp.bo# AND h.group# = tp.obj# * 2
                 left join sys.obj$ p   ON tp.obj# = p.obj#
          GROUP BY u.NAME, o.NAME, p.subname, c.NAME, h.analyzetime
          ORDER BY u.NAME, o.NAME, c.NAME;

          owner           table_name       partition_name     column_name        analyse_Time        hash entries
          ----------      --------------   ----------------   ----------------   ------------------- ------------
          TESTUSER        TEST_TAB1        P_20110131         REPORTING_DATE                                    1
          TESTUSER        TEST_TAB1        P_20110201         REPORTING_DATE     2011-02-13-13:37:27            1
          TESTUSER        TEST_TAB1        P_20110202         REPORTING_DATE     2011-02-13-13:37:29            1
          TESTUSER        TEST_TAB1        P_20110203         REPORTING_DATE     2011-02-13-13:37:52            1
          TESTUSER        TEST_TAB1        P_20110204         REPORTING_DATE     2011-02-13-13:38:54            1
          TESTUSER        TEST_TAB1        P_20110205         REPORTING_DATE                                    1
          TESTUSER        TEST_TAB1        P_20110206         REPORTING_DATE                                    1
          TESTUSER        TEST_TAB1        P_20110207         REPORTING_DATE                                    1
          TESTUSER        TEST_TAB1        P_20110211         REPORTING_DATE                                    1
          TESTUSER        TEST_TAB1        P_20110212         REPORTING_DATE                                    1
          TESTUSER        TEST_TAB1        P_20110131         SEQ_ID                                            1
          TESTUSER        TEST_TAB1        P_20110201         SEQ_ID             2011-02-13-13:37:27           11
          TESTUSER        TEST_TAB1        P_20110202         SEQ_ID             2011-02-13-13:37:29           11
          TESTUSER        TEST_TAB1        P_20110203         SEQ_ID             2011-02-13-13:37:52           11
          TESTUSER        TEST_TAB1        P_20110204         SEQ_ID             2011-02-13-13:38:54           11
          TESTUSER        TEST_TAB1        P_20110205         SEQ_ID                                            1
          TESTUSER        TEST_TAB1        P_20110206         SEQ_ID                                            1
          TESTUSER        TEST_TAB1        P_20110207         SEQ_ID                                            1
          TESTUSER        TEST_TAB1        P_20110211         SEQ_ID                                            1
          TESTUSER        TEST_TAB1        P_20110212         SEQ_ID                                            1
          TESTUSER        TEST_TAB1        P_20110131         SOURCE_SYSTEM                                     1
          TESTUSER        TEST_TAB1        P_20110201         SOURCE_SYSTEM      2011-02-13-13:37:27            4
          TESTUSER        TEST_TAB1        P_20110202         SOURCE_SYSTEM      2011-02-13-13:37:29            1
          TESTUSER        TEST_TAB1        P_20110203         SOURCE_SYSTEM      2011-02-13-13:37:52            1
          TESTUSER        TEST_TAB1        P_20110204         SOURCE_SYSTEM      2011-02-13-13:38:54            1
          TESTUSER        TEST_TAB1        P_20110205         SOURCE_SYSTEM                                     1
          TESTUSER        TEST_TAB1        P_20110206         SOURCE_SYSTEM                                     1
          TESTUSER        TEST_TAB1        P_20110207         SOURCE_SYSTEM                                     1
          TESTUSER        TEST_TAB1        P_20110211         SOURCE_SYSTEM                                     1
          TESTUSER        TEST_TAB1        P_20110212         SOURCE_SYSTEM                                     1
          TESTUSER        TEST_TAB1        P_20110131         STATUS                                            1
          TESTUSER        TEST_TAB1        P_20110201         STATUS             2011-02-13-13:37:27            2
          TESTUSER        TEST_TAB1        P_20110202         STATUS             2011-02-13-13:37:29            3
          TESTUSER        TEST_TAB1        P_20110203         STATUS             2011-02-13-13:37:52            3
          TESTUSER        TEST_TAB1        P_20110204         STATUS             2011-02-13-13:38:54            4
          TESTUSER        TEST_TAB1        P_20110205         STATUS                                            1
          TESTUSER        TEST_TAB1        P_20110206         STATUS                                            1
          TESTUSER        TEST_TAB1        P_20110207         STATUS                                            1
          TESTUSER        TEST_TAB1        P_20110211         STATUS                                            1
          TESTUSER        TEST_TAB1        P_20110212         STATUS                                            1
                                                                                 2011-02-13-12:10:01            7

          41 rows selected.

As Robin says “Looking at the data on WRI$_OPTSTAT_SYNOPSIS$, it can be determined that for every table, partition, and
column, there is an entry on the header table WRI$_OPTSTAT_SYNOPSIS_HEAD$. There is a unique synopsis number given to



each combination, which has one or many entries on the synopsis hash table WRI$_OPTSTAT_SYNOPSIS$. There seems to be one
hash for every distinct value in the table/partition/column”



Armed with this knowledge, you should be able to calculate the synopsis storage requirements based on the number of tables,
partitions and columns you want to gather Incremental Statistics for and how many distinct values are likely to be contained in
your data. I’d argue that the space requirements are acceptable for the significant benefits on what is already a large database,
were it not for the next issue.
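
As a minimal sketch, one way of seeing how much SYSAUX space the synopsis tables are currently consuming is to query
DBA_SEGMENTS for the two tables named above (run as a suitably privileged user; the exact figures will obviously vary):

          SELECT segment_name, ROUND(SUM(bytes)/1024/1024) AS mb
          FROM   dba_segments
          WHERE  segment_name IN ('WRI$_OPTSTAT_SYNOPSIS$', 'WRI$_OPTSTAT_SYNOPSIS_HEAD$')
          GROUP  BY segment_name;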

Performance issues with large synopses

As Robin describes in his post, the main issue is the impact that large synopsis tables have on the performance of statistics
collection. He shows an example of a very small table (a couple of partitions and a couple of rows) for which gathering statistics
should take less than a second, but which takes over three minutes when executed on a database that already has a large
number of rows in the synopsis tables for other, unrelated tables. Most of the three minutes is spent on the following delete
statement.

         DELETE /*+ use_nl(h s) leading(h s)index(s i_wri$_optstat_synopsis) */
         FROM WRI$_OPTSTAT_SYNOPSIS$ S
         WHERE S.SYNOPSIS# IN (SELECT /*+ unnest */ H.SYNOPSIS#
                               FROM SYS.WRI$_OPTSTAT_SYNOPSIS_HEAD$ H
                               WHERE H.BO# = :B2 AND H.GROUP# = :B1 * 2 )

As Greg Rahn points out in his comments on the post, this is likely to be a bug in Oracle 11.1.0.7 and, although the issue still
exists in 11.2.0.1, I believe from discussion with Robin that it could be a result of the WRI$_OPTSTAT_SYNOPSIS$ table not being
partitioned, which it is in 11.2.0.2. I have not tested whether this resolves the performance problem.

Partition Exchange

As discussed earlier, there are a couple of options when using partition exchange to load data.

    1.   Gather statistics on the load table before partition exchange

    2.   Gather statistics on the partition after partition exchange

When using Incremental Statistics you must use approach 2, as sketched below, because Oracle can’t use the values in the load
table to merge synopses and so update the synopses at the higher levels.
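
A minimal sketch of the exchange-then-gather approach, assuming a hypothetical simple range-partitioned table RANGE_TAB1
and a non-partitioned load table LOAD_TAB1 of matching structure:

          ALTER TABLE range_tab1
            EXCHANGE PARTITION p_20110202 WITH TABLE load_tab1
            INCLUDING INDEXES WITHOUT VALIDATION;

          BEGIN
            -- Gather on the exchanged partition; with INCREMENTAL=TRUE and the
            -- default GRANULARITY/ESTIMATE_PERCENT the partition synopsis is
            -- refreshed and the global statistics derived without a full table scan.
            dbms_stats.gather_table_stats(ownname  => 'TESTUSER',
                                          tabname  => 'RANGE_TAB1',
                                          partname => 'P_20110202');
          END;
          /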

Global Index Statistics

Although Incremental Statistics means that we no longer need to perform a full table scan to update global table statistics, as
Oracle is able to merge the partition-level synopses to update the table-level synopsis, index statistics are handled differently:
Oracle will still perform a full scan of each global index in order to maintain its global statistics.




INCREMENTAL STATISTICS SUMMARY




I feel confident saying that the introduction of Incremental Statistics alongside One-pass Distinct Sampling is the solution to
maintaining statistics on large partitioned objects that users have been looking for. Once any lingering performance issues are
ironed out I expect it to become the recognised default approach for partitioned tables.

Positives

        Very low resource usage and elapsed time for maintaining global statistics.

        Almost accurate NDVs.

Negatives

        Requirement to modify any existing gather-and-exchange statistics collection code to exchange-then-gather.

        Implementing for existing large tables will require an initial gather of global statistics that involves a full table scan and
         synopsis generation (see the sketch after this list). This is a price worth paying, but it needs to be planned.
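
A minimal sketch of that initial switch-over for an existing table, assuming a hypothetical partitioned table BIG_TAB owned by
TESTUSER:

          BEGIN
            dbms_stats.set_table_prefs('TESTUSER', 'BIG_TAB', 'INCREMENTAL', 'TRUE');
            -- The first gather after enabling INCREMENTAL scans the whole table and
            -- builds the partition-level synopses; subsequent gathers only need to
            -- touch the partitions that have changed.
            dbms_stats.gather_table_stats(ownname => 'TESTUSER', tabname => 'BIG_TAB');
          END;
          /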




CONCLUSIONS

Maintaining accurate statistics on partitioned objects is a significant challenge, as many designers, developers and architects of
large Oracle databases will testify. The fundamental trade-off is between the time and resources spent collecting statistics and
the quality of the results. Statistics quality has a profound impact on the cost calculations of the optimiser, so poor statistics
often lead to bad execution plans and extremely poor performance.

The variety of improvements Oracle has developed to address the trade-off complicates the picture and, without a clear strategy,
it is extremely easy to create new problems while trying to address existing ones. The best strategy for one system could be
completely different to that for another, so this section offers guidelines and directions rather than prescriptive solutions.

Summary of Issues

        The Number of Distinct Values calculated by the aggregation process is very poor and NDVs are a key input to the CBO
         calculations. Only properly gathered global statistics are likely to result in consistently good execution plans.

        DBMS_STATS will only aggregate statistics when all of the underlying components have statistics. If there are any
         missing statistics, the existing aggregated statistics will also go missing, leading to dynamic sampling.

        DBMS_STATS will never overwrite gathered global statistics with aggregated global statistics unless you are using
         ‘APPROX_GLOBAL AND PARTITION’. You must maintain any true global statistics that you have gathered or they will
         become stale rapidly.

        The default setting of _minimal_stats_aggregation means that, unless you explicitly gather statistics on the partitions
         you've just exchanged, aggregation will not take place. Using aggregation with a gather-then-exchange approach
         therefore requires a non-default _minimal_stats_aggregation setting, which is something you should discuss with Oracle
         Support first.

        You must use an exchange-then-gather approach if you are using Incremental Statistics because Oracle won’t maintain
         synopses on the load table.




       There are a number of COPY_TABLE_STATS bugs that are fixed in later releases and one or two that remain unfixed,
        particularly for composite or list partitioned tables.

       APPROX_GLOBAL statistics still suffer from poor aggregated NDVs.

Guidelines and tips

        Checking that statistics have been gathered correctly includes looking at the detailed results. It is far too easy to look at
         NUM_ROWS and LAST_ANALYZED and satisfy yourself that statistics are in place. To check the quality of those
         statistics, it’s essential to look at the High and Low Values and NDVs at all levels of the partitioned object.

       USER_TAB_STATS_HISTORY can be useful in determining which levels statistics were gathered on and when.

       Pay particular attention to the GLOBAL_STATS column that indicates whether the statistics are true or aggregated
        global statistics.

       DBMS_STATS is constantly evolving. It is therefore important to be on the latest versions of Oracle where possible and
        to be aware of the various patches that should be applied where not.

       If you have developed your own wrapper code for your statistics gathering processes you should always gather
        statistics using the same code. Manual statistics gathering exercises are likely to introduce inappropriate DBMS_STATS
        parameter settings due to human error.

        Once you have set your statistics appropriately you should lock them to prevent them being regathered in error,
         whether by manual intervention or by the auto stats gather job (see the sketch after this list). Your statistics
         management code can unlock any statistics that need to change subsequently.

       Because bad or missing statistics can cause such serious performance problems, it may be worth implementing
        scheduled jobs to check that statistics are being gathered correctly according to your overall approach and are in place.

       Statistics collection takes time and resources but setting the DEGREE parameter to use parallel execution can make run
        times more bearable on modern servers with spare capacity.
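
As a minimal sketch of the locking approach, and of using DEGREE to parallelise a gather, assuming the TESTUSER.TEST_TAB1
example objects:

          -- Lock the statistics once they are known to be good...
          exec dbms_stats.lock_table_stats('TESTUSER', 'TEST_TAB1')

          -- ...and have the regular maintenance code unlock, regather in parallel and relock
          BEGIN
            dbms_stats.unlock_table_stats('TESTUSER', 'TEST_TAB1');
            dbms_stats.gather_table_stats(ownname  => 'TESTUSER',
                                          tabname  => 'TEST_TAB1',
                                          partname => 'P_20110202',
                                          degree   => 4);
            dbms_stats.lock_table_stats('TESTUSER', 'TEST_TAB1');
          END;
          /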

Recommendations

        Use Oracle’s default settings wherever possible. Despite my own bad experiences over the years with default settings
         causing performance problems, I think there is too strong a tendency to implement site-specific or individual solutions
         based on past experiences, and DBMS_STATS is constantly evolving. However, if default settings do cause performance
         issues, I’d have no hesitation in filing a Service Request and then implementing an alternative.

        Decide on an overall statistics collection strategy, develop any required PL/SQL code and scheduled jobs and stick to it.
         Redesign it later if necessary but don’t chop and change between strategies on a whim or through emergency manual
         intervention. The root cause of most of the worst statistics-related performance problems I’ve seen has been a mixture
         of approaches applied by people unfamiliar with the system: for example, manually gathering statistics on an object
         using DBMS_STATS options that are incompatible with the regular statistics collection approach, or inadvertently having
         both the auto stats gather job and site-specific processes enabled.

       Collecting accurate global statistics can be an expensive proposition but is essential to achieve optimal execution plans.




              o   If using Oracle 11g you should definitely try Incremental Statistics, although ensure that you have all of the
                  relevant patches applied; the best option is likely to be the 11.2.0.2 patchset.

              o   If using Oracle 10g or below then you should gather at all levels with GRANULARITY=> ‘AUTO’ if at all possible

              o   If that is not possible due to time or resource constraints, a combination of slightly stale Global statistics and
                  up-to-date Partition statistics is probably preferable to Aggregated Global statistics. This implies gathering
                  statistics as new partitions appear (GRANULARITY=>’PARTITION’ with a specific PARTNAME) along with an
                  occasional gather at GRANULARITY=>’GLOBAL’ to refresh the global statistics; see the sketch after this list.

              o   As the poorest option, use the aggregation process in the knowledge that the global NDVs will be inaccurate
                  and that it’s very easy to misuse the aggregation process.

        Picking the right sampling percentage

              o   If time allows, use AUTO_SAMPLE_SIZE for ESTIMATE_PERCENT in 11g because it will deliver very high quality
                  statistics at a reasonably low cost using one-pass distinct sampling.

              o   Use AUTO_SAMPLE_SIZE where time allows on previous versions.

              o   Use lower ESTIMATE_PERCENT values if essential to get some statistics on an object but be aware that as you
                  reduce the sampling percentage, the quality of the statistics will degrade.

        Don’t use Dynamic Sampling as the core component of your statistics collection strategy.

        Maintaining statistics on Composite Partitioned objects is slightly trickier and DBMS_STATS support can be a little
         patchy. I wouldn’t suggest designing your physical data model based on this but consider whether subpartitions are
         delivering any benefit and whether the statistics on them are actually being used.

        Although part of me can understand the resistance to applying new patch-sets because of the additional testing,
         change management and outages required, I think it's asking for trouble to be on older patch-sets if you're using newer
         features such as COPY_TABLE_STATS or Incremental Statistics. For example, as far as I'm aware COPY_TABLE_STATS is
         not truly supported in 10.2.0.3 but I've seen a number of people using it.
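
A minimal sketch of the partition-plus-occasional-global approach mentioned above, assuming the TESTUSER.TEST_TAB1
example objects and a hypothetical newly loaded partition P_20110210:

          BEGIN
            -- After each load, gather statistics for the new partition only
            dbms_stats.gather_table_stats(ownname     => 'TESTUSER',
                                          tabname     => 'TEST_TAB1',
                                          partname    => 'P_20110210',
                                          granularity => 'PARTITION');
          END;
          /

          BEGIN
            -- Periodically (e.g. at the weekend) refresh the true global statistics
            dbms_stats.gather_table_stats(ownname     => 'TESTUSER',
                                          tabname     => 'TEST_TAB1',
                                          granularity => 'GLOBAL');
          END;
          /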

Developing an effective statistics maintenance strategy can be a very time-consuming process that requires careful design and
testing across large data volumes and a variety of queries. Working within modern development cycles it can be difficult to
justify the time required when no obvious business functionality is being delivered, but I would argue that nothing is as likely to
deliver the business’s performance expectations across a wide range of queries, including those yet to be developed, while
reducing the manual tuning effort required. Fortunately, Oracle is starting to introduce features that make the technical
challenges less formidable.

ACKNOWLEDGEMENTS

I’d like to thank the following individuals

        Fellow travelers on the long and winding road to stats nirvana – Jari Kuhanen, Vadim Gotovsky and Steven Holder (all
         UBS), Randolf Geist, Martin Widlake and Jonathan Lewis.
        Robin Moffat for agreeing to let me share the information and queries from his Incremental Statistics performance
         blog post (see References).
        Greg Rahn (Oracle), for the handy display_raw function and consistently good stats-related posts.


       Maria Colgan (Oracle), for sanity-checking several of the blog posts that spawned the idea for this paper.


ABOUT THE AUTHOR

Doug Burns is an independent consultant who has over 20 years experience working with Oracle in a range of industries and
applications and has worked as a course instructor and technical editor for both Oracle UK and Learning Tree International. He
can be contacted at dougburns@yahoo.com and this document, other articles and his blog can be found at
http://oracledoug.com. He is a member of the Oak Table Network and an Oracle ACE Director.




APPENDIX A - ADDITIONAL RESOURCES




DOCUMENTATION AND SUPPORT NOTES

Managing Optimizer Statistics (11.2)
http://download.oracle.com/docs/cd/E11882_01/server.112/e16638/stats.htm



There are also a number of useful resources on My Oracle Support
        236935.1 – Global Statistics – An Explanation
        44961.1 - Statistics Gathering; Frequency and Strategy



BLOGS

Some of the best information about DBMS_STATS and optimiser behaviour can be found in blog postings because these are
likely to delve into specific narrow issues in more detail.

http://blogs.oracle.com/optimizer - The home of the Optimizer Development Group so posts all tend to be focused on statistics
or optimiser behaviour and the quality is uniformly excellent.

http://structureddata.org – The home of Greg Rahn, a performance engineer working in Oracle’s Real World Performance
Group. Although the range of subjects is wider than the optimiser, there are many useful related posts there, including the
following post that includes the DISPLAY_RAW function I have used throughout the examples.

         http://structureddata.org/2008/03/26/choosing-an-optimal-stats-gathering-strategy/

http://oracledoug.com/serendipity - The author’s blog, including the statistics-related posts that originally prompted this paper.

Individual Posts

         Performance issue with missing Global Statistics
         http://mwidlake.wordpress.com/2010/02/16/stats-need-stats-to-gather-stats/
         http://optimizermagic.blogspot.com/2009/02/maintaining-statistics-on-large.html
         http://blogs.oracle.com/optimizer/2010/08/dynamic_sampling_and_its_impact_on_the_optimizer.html
         http://askdba.org/weblog/2009/06/dbms_stats-copy_table_stats-does-not-alter-lowhigh-value/
         http://antognini.ch/2008/10/granularity-approx_global-and-partition/
         http://martincarstenbach.wordpress.com/2010/06/07/incrementally-gathering-statistics-in-10g/
         http://optimizermagic.blogspot.com/2009/08/understanding-dbmsstatssetprefs.html
         http://blogs.oracle.com/optimizer/2011/01/gathering_optimizer_statistics_is_one.html




WHITE PAPERS

One Pass Distinct Sampling by Amit Poddar
http://www.oraclegeek.net/downloads/OnePassDistinctSampling.pdf

Upgrading from Oracle Database 10g to 11g: What to expect from the optimizer by Maria Colgan
http://www.oracle.com/technetwork/database/features/bi-datawarehousing/twp-upgrading-10g-to-11g-what-to-ex-
133707.pdf


BUGS AND PATCHES

        Patch 8866627: MERGE LABEL REQUEST ON TOP OF 10.2.0.4.0 FOR BUGS 8318020 7475493

                This is the most important patch to apply if you are using any of the newer features. It contains fixes for
                around a dozen bugs and I recommend you apply this unless you are already running 10.2.0.5 or 11.2.0.x.
                Bugs included: 7475493, 7116357, 7463138, 6718212, 7441785, 6526370, 8318020

        Bug 8719831 - Incomplete statistics on partition level using COPY_TABLE_STATS – fixed in 11.2.0.2
                Incorrect density on columns after copying stats

        Bug 8318020 - Slow DBMS_STATS / Inaccurate or Missing Statistics - fixed in MLR 8866627, 10.2.0.5 and 11.2.0.1
                A range of issues, some apparently unrelated

        Bug 10234419 - Extend dbms_stats.copy_table_stats to all range partitioning key columns – fixed in 12.1
                Copy stats does not adjust values for partition key columns other than the first column




APPENDIX B – LISTINGS




This appendix contains the full versions of the scripts executed to test the various statistics collection features. The listings in
the main body of the paper use small excerpts of output when these scripts were executed.


INIT.SQL
@@defs  -- Contains default column width settings etc
spool init.txt

PROMPT ****************************************************
PROMPT First build the table and index and INSERT some data
PROMPT ****************************************************

DROP TABLE TEST_TAB1;

CREATE TABLE TEST_TAB1
(
  REPORTING_DATE            NUMBER              NOT NULL,
  SOURCE_SYSTEM             VARCHAR2(30 CHAR)   NOT NULL,
  SEQ_ID                    NUMBER              NOT NULL,
  STATUS                    VARCHAR2(1 CHAR)    NOT NULL
)
PARTITION BY RANGE (REPORTING_DATE)
SUBPARTITION BY LIST (SOURCE_SYSTEM)
SUBPARTITION TEMPLATE
  (SUBPARTITION MOSCOW VALUES ('MOSCOW') TABLESPACE TEST_DATA,
   SUBPARTITION LONDON VALUES ('LONDON') TABLESPACE TEST_DATA,
   SUBPARTITION SYDNEY VALUES ('SYDNEY') TABLESPACE TEST_DATA,
   SUBPARTITION OTHERS VALUES (DEFAULT)    TABLESPACE TEST_DATA)
(
  PARTITION P_20110131 VALUES LESS THAN (20110201) NOLOGGING NOCOMPRESS,
  PARTITION P_20110201 VALUES LESS THAN (20110202) NOLOGGING NOCOMPRESS,
  PARTITION P_20110202 VALUES LESS THAN (20110203) NOLOGGING NOCOMPRESS,
  PARTITION P_20110203 VALUES LESS THAN (20110204) NOLOGGING NOCOMPRESS,
  PARTITION P_20110204 VALUES LESS THAN (20110205) NOLOGGING NOCOMPRESS,
  PARTITION P_20110205 VALUES LESS THAN (20110206) NOLOGGING NOCOMPRESS,
  PARTITION P_20110206 VALUES LESS THAN (20110207) NOLOGGING NOCOMPRESS,
  PARTITION P_20110207 VALUES LESS THAN (20110208) NOLOGGING NOCOMPRESS
);

CREATE UNIQUE INDEX TEST_TAB1_IX1 ON TEST_TAB1
(REPORTING_DATE, SOURCE_SYSTEM, SEQ_ID)
  LOCAL NOPARALLEL COMPRESS 1;

INSERT   INTO   TEST_TAB1   VALUES   (20110201,   'MOSCOW',   1000, 'P');
INSERT   INTO   TEST_TAB1   VALUES   (20110202,   'MOSCOW',   30000, 'P');
INSERT   INTO   TEST_TAB1   VALUES   (20110203,   'MOSCOW',   2000, 'P');
INSERT   INTO   TEST_TAB1   VALUES   (20110204,   'MOSCOW',   1000, 'P');
INSERT   INTO   TEST_TAB1   VALUES   (20110205,   'MOSCOW',   2400, 'P');
INSERT   INTO   TEST_TAB1   VALUES   (20110201,   'LONDON',   500, 'P');
INSERT   INTO   TEST_TAB1   VALUES   (20110201,   'SYDNEY',   700, 'P');
INSERT   INTO   TEST_TAB1   VALUES   (20110202,   'SYDNEY',   1200, 'P');
INSERT   INTO   TEST_TAB1   VALUES   (20110201,   'DALLAS',   400, 'P');
INSERT   INTO   TEST_TAB1   VALUES   (20110206,   'DALLAS',   600, 'P');
INSERT   INTO   TEST_TAB1   VALUES   (20110204,   'DALLAS',   700, 'P');

COMMIT;
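
The scripts from DEFAULT.SQL onwards also call a helper function, display_raw, to decode the RAW low_value and high_value
columns into readable values. The version used for these tests is not included in the listings; a minimal sketch covering the
NUMBER and VARCHAR2 columns of TEST_TAB1 might look like this (based on the commonly published helper, so treat it as an
assumption rather than the exact function used here).

CREATE OR REPLACE FUNCTION display_raw (rawval RAW, type VARCHAR2)
RETURN VARCHAR2
IS
  cn NUMBER;
  cv VARCHAR2(64);
BEGIN
  -- DBMS_STATS.CONVERT_RAW_VALUE decodes the internal RAW format used in
  -- low_value/high_value into the corresponding datatype
  IF type = 'NUMBER' THEN
    dbms_stats.convert_raw_value(rawval, cn);
    RETURN to_char(cn);
  ELSIF type = 'VARCHAR2' THEN
    dbms_stats.convert_raw_value(rawval, cv);
    RETURN cv;
  ELSE
    RETURN 'UNSUPPORTED TYPE: '||type;  -- extend with DATE etc. if required
  END IF;
END;
/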




DEFAULT.SQL
@@defs
@@init

spool default.txt

exec dbms_stats.gather_table_stats('TESTUSER', 'TEST_TAB1');

select table_name, global_stats, last_analyzed, num_rows
from user_tables
where table_name='TEST_TAB1';

select index_name, last_analyzed, distinct_keys, clustering_factor, global_stats
from user_indexes
where table_name='TEST_TAB1';

drop index test_tab1_ix1;

select table_name, partition_name, global_stats, last_analyzed, num_rows
from user_tab_partitions
where table_name='TEST_TAB1'
order by partition_name;

select table_name, subpartition_name, global_stats, last_analyzed, num_rows
from user_tab_subpartitions
where table_name='TEST_TAB1'
and partition_name='P_20110201'
order by subpartition_name;

select a.column_name, a.num_distinct,
   display_raw(a.low_value,b.data_type) as low_val,
   display_raw(a.high_value,b.data_type) as high_val, b.data_type
from user_tab_col_statistics a, user_tab_cols b
where a.table_name='TEST_TAB1'
and   a.table_name=b.table_name
and   a.column_name=b.column_name
order by a.column_name;

select a.column_name, a.partition_name, a.num_distinct,
   display_raw(a.low_value,b.data_type) as low_val,
   display_raw(a.high_value,b.data_type) as high_val,
   b.data_type
from user_part_col_statistics a, user_tab_cols b
where a.table_name='TEST_TAB1'
and a.table_name=b.table_name
and a.column_name=b.column_name
order by a.column_name, a.partition_name;

select a.column_name, a.subpartition_name, a.num_distinct,
   display_raw(a.low_value,b.data_type) as low_val,
   display_raw(a.high_value,b.data_type) as high_val,
   b.data_type
from dba_subpart_col_statistics a, dba_tab_cols b
where a.table_name='TEST_TAB1'
and a.table_name=b.table_name
and a.column_name=b.column_name
and a.subpartition_name LIKE 'P_20110201%'
order by a.column_name, a.subpartition_name;




AGGREGATION.SQL
@defs
spool aggregation.txt

TRUNCATE TABLE TEST_TAB1;

exec dbms_stats.delete_table_stats('TESTUSER', 'TEST_TAB1')

INSERT   INTO   TEST_TAB1   VALUES   (20110201,   'MOSCOW',   1000, 'P');
INSERT   INTO   TEST_TAB1   VALUES   (20110202,   'MOSCOW',   30000, 'P');
INSERT   INTO   TEST_TAB1   VALUES   (20110203,   'MOSCOW',   2000, 'P');
INSERT   INTO   TEST_TAB1   VALUES   (20110204,   'MOSCOW',   1000, 'P');
INSERT   INTO   TEST_TAB1   VALUES   (20110205,   'MOSCOW',   2400, 'P');
INSERT   INTO   TEST_TAB1   VALUES   (20110201,   'LONDON',   500, 'P');
INSERT   INTO   TEST_TAB1   VALUES   (20110201,   'SYDNEY',   700, 'P');
INSERT   INTO   TEST_TAB1   VALUES   (20110202,   'SYDNEY',   1200, 'P');
INSERT   INTO   TEST_TAB1   VALUES   (20110201,   'DALLAS',   400, 'P');
INSERT   INTO   TEST_TAB1   VALUES   (20110206,   'DALLAS',   600, 'P');
INSERT   INTO   TEST_TAB1   VALUES   (20110204,   'DALLAS',   700, 'P');

COMMIT;

exec dbms_stats.gather_table_stats('TESTUSER', 'TEST_TAB1', GRANULARITY => 'SUBPARTITION');

select table_name, global_stats, last_analyzed, num_rows
from user_tables
where table_name='TEST_TAB1';

select table_name, partition_name, global_stats, last_analyzed, num_rows
from user_tab_partitions
where table_name='TEST_TAB1'
order by partition_name;

select table_name, subpartition_name, global_stats, last_analyzed, num_rows
from user_tab_subpartitions
where table_name='TEST_TAB1'
and partition_name='P_20110206'
order by subpartition_name;

select a.column_name, a.num_distinct,
   display_raw(a.low_value,b.data_type) as low_val,
   display_raw(a.high_value,b.data_type) as high_val, b.data_type
from user_tab_col_statistics a, user_tab_cols b
where a.table_name='TEST_TAB1'
and   a.table_name=b.table_name
and   a.column_name=b.column_name
order by a.column_name;

select a.column_name, a.partition_name, a.num_distinct,
   display_raw(a.low_value,b.data_type) as low_val,
   display_raw(a.high_value,b.data_type) as high_val,
   b.data_type
from user_part_col_statistics a, user_tab_cols b
where a.table_name='TEST_TAB1'
and a.table_name=b.table_name
and a.column_name=b.column_name
order by a.column_name, a.partition_name;

select a.column_name, a.subpartition_name, a.num_distinct,
   display_raw(a.low_value,b.data_type) as low_val,
   display_raw(a.high_value,b.data_type) as high_val,
   b.data_type
from dba_subpart_col_statistics a, dba_tab_cols b
where a.table_name='TEST_TAB1'
and a.table_name=b.table_name
and a.column_name=b.column_name
and a.subpartition_name LIKE 'P_20110206%'
order by a.column_name, a.subpartition_name;



INSERT   INTO   TEST_TAB1   VALUES   (20110206,   'MOSCOW',   100000, 'P');
INSERT   INTO   TEST_TAB1   VALUES   (20110206,   'MOSCOW',   3000000, 'P');
INSERT   INTO   TEST_TAB1   VALUES   (20110206,   'MOSCOW',   200000, 'P');
INSERT   INTO   TEST_TAB1   VALUES   (20110206,   'MOSCOW',   110000, 'P');
INSERT   INTO   TEST_TAB1   VALUES   (20110206,   'MOSCOW',   240000, 'U');

COMMIT;

exec dbms_stats.gather_table_stats('TESTUSER', 'TEST_TAB1', GRANULARITY => 'SUBPARTITION', PARTNAME => 'P_20110206_MOSCOW');

select table_name, global_stats, last_analyzed, num_rows
from user_tables
where table_name='TEST_TAB1';

select table_name, partition_name, global_stats, last_analyzed, num_rows
from user_tab_partitions
where table_name='TEST_TAB1'
order by partition_name;

select table_name, subpartition_name, global_stats, last_analyzed, num_rows
from user_tab_subpartitions
where table_name='TEST_TAB1'
and partition_name='P_20110206'
order by subpartition_name;

select a.column_name, a.num_distinct,
   display_raw(a.low_value,b.data_type) as low_val,
   display_raw(a.high_value,b.data_type) as high_val, b.data_type
from user_tab_col_statistics a, user_tab_cols b
where a.table_name='TEST_TAB1'
and   a.table_name=b.table_name
and   a.column_name=b.column_name
order by a.column_name;

select a.column_name, a.partition_name, a.num_distinct,
   display_raw(a.low_value,b.data_type) as low_val,
   display_raw(a.high_value,b.data_type) as high_val,
   b.data_type
from user_part_col_statistics a, user_tab_cols b
where a.table_name='TEST_TAB1'
and a.table_name=b.table_name
and a.column_name=b.column_name
and a.partition_name='P_20110206'
order by a.column_name, a.partition_name;

select a.column_name, a.subpartition_name, a.num_distinct,
   display_raw(a.low_value,b.data_type) as low_val,
   display_raw(a.high_value,b.data_type) as high_val,
   b.data_type
from dba_subpart_col_statistics a, dba_tab_cols b
where a.table_name='TEST_TAB1'
and a.table_name=b.table_name
and a.column_name=b.column_name
and a.subpartition_name LIKE 'P_20110206%'
order by a.column_name, a.subpartition_name;

select reporting_date, count(*) from test_tab1 group by reporting_date;

select source_system, count(*) from test_tab1 group by source_system;

select status, count(*) from test_tab1 group by status;

select /*+ gather_plan_statistics */ *
from test_tab1
where status='U';

select * from TABLE(dbms_xplan.display_cursor(null, null, 'ALLSTATS LAST'));

INSERT INTO TEST_TAB1 VALUES (20110201, 'MOSCOW', 10000, 'P');



INSERT   INTO   TEST_TAB1   VALUES   (20110202,   'MOSCOW', 300000, 'P');
INSERT   INTO   TEST_TAB1   VALUES   (20110203,   'MOSCOW', 20000, 'P');
INSERT   INTO   TEST_TAB1   VALUES   (20110204,   'MOSCOW', 10000, 'P');
INSERT   INTO   TEST_TAB1   VALUES   (20110205,   'MOSCOW', 24000, 'U');
INSERT   INTO   TEST_TAB1   VALUES   (20110201,   'LONDON', 5000, 'P');
INSERT   INTO   TEST_TAB1   VALUES   (20110201,   'SYDNEY', 7000, 'P');
INSERT   INTO   TEST_TAB1   VALUES   (20110202,   'SYDNEY', 12000, 'P');
INSERT   INTO   TEST_TAB1   VALUES   (20110201,   'CHICAGO', 4000, 'P');
INSERT   INTO   TEST_TAB1   VALUES   (20110206,   'CHICAGO', 6000, 'P');
INSERT   INTO   TEST_TAB1   VALUES   (20110204,   'CHICAGO', 7000, 'P');

COMMIT;

exec dbms_stats.gather_table_stats('TESTUSER', 'TEST_TAB1', GRANULARITY => 'SUBPARTITION');

select table_name, global_stats, last_analyzed, num_rows
from user_tables
where table_name='TEST_TAB1';

select table_name, partition_name, global_stats, last_analyzed, num_rows
from user_tab_partitions
where table_name='TEST_TAB1'
order by partition_name;

select table_name, subpartition_name, global_stats, last_analyzed, num_rows
from user_tab_subpartitions
where table_name='TEST_TAB1'
and partition_name='P_20110206'
order by subpartition_name;

select a.column_name, a.num_distinct,
   display_raw(a.low_value,b.data_type) as low_val,
   display_raw(a.high_value,b.data_type) as high_val, b.data_type
from user_tab_col_statistics a, user_tab_cols b
where a.table_name='TEST_TAB1'
and   a.table_name=b.table_name
and   a.column_name=b.column_name
order by a.column_name;

select a.column_name, a.partition_name, a.num_distinct,
   display_raw(a.low_value,b.data_type) as low_val,
   display_raw(a.high_value,b.data_type) as high_val,
   b.data_type
from user_part_col_statistics a, user_tab_cols b
where a.table_name='TEST_TAB1'
and a.table_name=b.table_name
and a.column_name=b.column_name
and a.partition_name='P_20110206'
order by a.column_name, a.partition_name;

select a.column_name, a.subpartition_name, a.num_distinct,
   display_raw(a.low_value,b.data_type) as low_val,
   display_raw(a.high_value,b.data_type) as high_val,
   b.data_type
from dba_subpart_col_statistics a, dba_tab_cols b
where a.table_name='TEST_TAB1'
and a.table_name=b.table_name
and a.column_name=b.column_name
and a.subpartition_name LIKE 'P_20110206%'
order by a.column_name, a.subpartition_name;

spool off




AGGREGATION2.SQL
@default
@aggregation
spool aggregation2.txt

ALTER TABLE TEST_TAB1
ADD PARTITION P_20110208 VALUES LESS THAN (20110209);

select table_name, global_stats, last_analyzed, num_rows
from user_tables
where table_name='TEST_TAB1';

select table_name, partition_name, global_stats, last_analyzed, num_rows
from user_tab_partitions
where table_name='TEST_TAB1'
order by partition_name;

select table_name, subpartition_name, global_stats, last_analyzed, num_rows
from user_tab_subpartitions
where table_name='TEST_TAB1'
and partition_name='P_20110208'
order by subpartition_name;

select a.column_name, a.num_distinct,
   display_raw(a.low_value,b.data_type) as low_val,
   display_raw(a.high_value,b.data_type) as high_val, b.data_type
from user_tab_col_statistics a, user_tab_cols b
where a.table_name='TEST_TAB1'
and   a.table_name=b.table_name
and   a.column_name=b.column_name
order by a.column_name;

select a.column_name, a.partition_name, a.num_distinct,
   display_raw(a.low_value,b.data_type) as low_val,
   display_raw(a.high_value,b.data_type) as high_val,
   b.data_type
from user_part_col_statistics a, user_tab_cols b
where a.table_name='TEST_TAB1'
and a.table_name=b.table_name
and a.column_name=b.column_name
and a.partition_name='P_20110208'
order by a.column_name, a.partition_name;

select a.column_name, a.subpartition_name, a.num_distinct,
   display_raw(a.low_value,b.data_type) as low_val,
   display_raw(a.high_value,b.data_type) as high_val,
   b.data_type
from dba_subpart_col_statistics a, dba_tab_cols b
where a.table_name='TEST_TAB1'
and a.table_name=b.table_name
and a.column_name=b.column_name
and a.subpartition_name LIKE 'P_20110208%'
order by a.column_name, a.subpartition_name;

-- Simulate one load finishing

INSERT INTO TEST_TAB1 VALUES (20110208, 'DALLAS', 400, 'P');
INSERT INTO TEST_TAB1 VALUES (20110208, 'DALLAS', 600, 'P');
COMMIT;

exec dbms_stats.gather_table_stats('TESTUSER', 'TEST_TAB1', granularity => 'SUBPARTITION', partname => 'P_20110208_OTHERS');

select table_name, global_stats, last_analyzed, num_rows
from user_tables
where table_name='TEST_TAB1';

select   table_name, partition_name, global_stats, last_analyzed, num_rows
from user_tab_partitions
where table_name='TEST_TAB1'
order by partition_name;

select table_name, subpartition_name, global_stats, last_analyzed, num_rows
from user_tab_subpartitions
where table_name='TEST_TAB1'
and partition_name='P_20110208'
order by subpartition_name;

select a.column_name, a.num_distinct,
   display_raw(a.low_value,b.data_type) as low_val,
   display_raw(a.high_value,b.data_type) as high_val, b.data_type
from user_tab_col_statistics a, user_tab_cols b
where a.table_name='TEST_TAB1'
and   a.table_name=b.table_name
and   a.column_name=b.column_name
order by a.column_name;

select a.column_name, a.partition_name, a.num_distinct,
   display_raw(a.low_value,b.data_type) as low_val,
   display_raw(a.high_value,b.data_type) as high_val,
   b.data_type
from user_part_col_statistics a, user_tab_cols b
where a.table_name='TEST_TAB1'
and a.table_name=b.table_name
and a.column_name=b.column_name
and a.partition_name='P_20110208'
order by a.column_name, a.partition_name;

select a.column_name, a.subpartition_name, a.num_distinct,
   display_raw(a.low_value,b.data_type) as low_val,
   display_raw(a.high_value,b.data_type) as high_val,
   b.data_type
from dba_subpart_col_statistics a, dba_tab_cols b
where a.table_name='TEST_TAB1'
and a.table_name=b.table_name
and a.column_name=b.column_name
and a.subpartition_name LIKE 'P_20110208%'
order by a.column_name, a.subpartition_name;

show parameter dynamic_sampling

select /*+ gather_plan_statistics */ *
from test_tab1
where source_system='DALLAS';

select * from TABLE(dbms_xplan.display_cursor(null, null, 'ALLSTATS LAST'));

-- Simulate other loads finishing later

INSERT   INTO   TEST_TAB1   VALUES   (20110208,   'MOSCOW',   1000, 'P');
INSERT   INTO   TEST_TAB1   VALUES   (20110208,   'MOSCOW',   30000, 'P');
INSERT   INTO   TEST_TAB1   VALUES   (20110208,   'MOSCOW',   2000, 'P');
INSERT   INTO   TEST_TAB1   VALUES   (20110208,   'MOSCOW',   10000, 'Z');
INSERT   INTO   TEST_TAB1   VALUES   (20110208,   'MOSCOW',   2400, 'P');
INSERT   INTO   TEST_TAB1   VALUES   (20110208,   'SYDNEY',   500, 'P');
INSERT   INTO   TEST_TAB1   VALUES   (20110208,   'SYDNEY',   700, 'P');
INSERT   INTO   TEST_TAB1   VALUES   (20110208,   'LONDON',   1200, 'P');

exec dbms_stats.gather_table_stats('TESTUSER', 'TEST_TAB1', granularity => 'SUBPARTITION', partname => 'P_20110208_LONDON');
exec dbms_stats.gather_table_stats('TESTUSER', 'TEST_TAB1', granularity => 'SUBPARTITION', partname => 'P_20110208_SYDNEY');
exec dbms_stats.gather_table_stats('TESTUSER', 'TEST_TAB1', granularity => 'SUBPARTITION', partname => 'P_20110208_MOSCOW');

select table_name, global_stats, last_analyzed, num_rows
from user_tables
where table_name='TEST_TAB1';




select table_name, partition_name, global_stats, last_analyzed, num_rows
from user_tab_partitions
where table_name='TEST_TAB1'
order by partition_name;

select table_name, subpartition_name, global_stats, last_analyzed, num_rows
from user_tab_subpartitions
where table_name='TEST_TAB1'
and partition_name='P_20110208'
order by subpartition_name;

select a.column_name, a.num_distinct,
   display_raw(a.low_value,b.data_type) as low_val,
   display_raw(a.high_value,b.data_type) as high_val, b.data_type
from user_tab_col_statistics a, user_tab_cols b
where a.table_name='TEST_TAB1'
and   a.table_name=b.table_name
and   a.column_name=b.column_name
order by a.column_name;

select a.column_name, a.partition_name, a.num_distinct,
   display_raw(a.low_value,b.data_type) as low_val,
   display_raw(a.high_value,b.data_type) as high_val,
   b.data_type
from user_part_col_statistics a, user_tab_cols b
where a.table_name='TEST_TAB1'
and a.table_name=b.table_name
and a.column_name=b.column_name
and a.partition_name='P_20110208'
order by a.column_name, a.partition_name;

select a.column_name, a.subpartition_name, a.num_distinct,
   display_raw(a.low_value,b.data_type) as low_val,
   display_raw(a.high_value,b.data_type) as high_val,
   b.data_type
from dba_subpart_col_statistics a, dba_tab_cols b
where a.table_name='TEST_TAB1'
and a.table_name=b.table_name
and a.column_name=b.column_name
and a.subpartition_name LIKE 'P_20110208%'
order by a.column_name, a.subpartition_name;

spool off




AGGREGATION3.SQL
@defs
@default
@aggregation
@aggregation2
spool aggregation3.txt

TRUNCATE TABLE TEST_TAB1;

exec dbms_stats.delete_table_stats('TESTUSER', 'TEST_TAB1')

INSERT   INTO   TEST_TAB1   VALUES   (20110201,   'MOSCOW',   1000, 'P');
INSERT   INTO   TEST_TAB1   VALUES   (20110202,   'MOSCOW',   30000, 'P');
INSERT   INTO   TEST_TAB1   VALUES   (20110203,   'MOSCOW',   2000, 'P');
INSERT   INTO   TEST_TAB1   VALUES   (20110204,   'MOSCOW',   1000, 'P');
INSERT   INTO   TEST_TAB1   VALUES   (20110205,   'MOSCOW',   2400, 'P');
INSERT   INTO   TEST_TAB1   VALUES   (20110201,   'LONDON',   500, 'P');
INSERT   INTO   TEST_TAB1   VALUES   (20110201,   'SYDNEY',   700, 'P');
INSERT   INTO   TEST_TAB1   VALUES   (20110202,   'SYDNEY',   1200, 'P');
INSERT   INTO   TEST_TAB1   VALUES   (20110201,   'DALLAS',   400, 'P');
INSERT   INTO   TEST_TAB1   VALUES   (20110206,   'DALLAS',   600, 'P');
INSERT   INTO   TEST_TAB1   VALUES   (20110204,   'DALLAS',   700, 'P');

COMMIT;

exec dbms_stats.gather_table_stats('TESTUSER', 'TEST_TAB1')

select table_name, global_stats, last_analyzed, num_rows
from user_tables
where table_name='TEST_TAB1';

select table_name, partition_name, global_stats, last_analyzed, num_rows
from user_tab_partitions
where table_name='TEST_TAB1'
order by partition_name;

select table_name, subpartition_name, global_stats, last_analyzed, num_rows
from user_tab_subpartitions
where table_name='TEST_TAB1'
and partition_name='P_20110206'
order by subpartition_name;

PAUSE

INSERT   INTO   TEST_TAB1   VALUES   (20110206,   'MOSCOW',   100000, 'P');
INSERT   INTO   TEST_TAB1   VALUES   (20110206,   'MOSCOW',   3000000, 'P');
INSERT   INTO   TEST_TAB1   VALUES   (20110206,   'MOSCOW',   200000, 'P');
INSERT   INTO   TEST_TAB1   VALUES   (20110206,   'MOSCOW',   110000, 'P');
INSERT   INTO   TEST_TAB1   VALUES   (20110206,   'MOSCOW',   240000, 'U');

COMMIT;

exec dbms_stats.gather_table_stats('TESTUSER', 'TEST_TAB1', GRANULARITY => 'SUBPARTITION', PARTNAME => 'P_20110206_MOSCOW');

select table_name, subpartition_name, global_stats, last_analyzed, num_rows
from user_tab_subpartitions
where table_name='TEST_TAB1'
and partition_name='P_20110206'
order by subpartition_name;

select table_name, global_stats, last_analyzed, num_rows
from user_tables
where table_name='TEST_TAB1';

select table_name, partition_name, global_stats, last_analyzed, num_rows
from user_tab_partitions
where table_name='TEST_TAB1'
order by partition_name;

select a.column_name, a.num_distinct,
   display_raw(a.low_value,b.data_type) as low_val,
   display_raw(a.high_value,b.data_type) as high_val, b.data_type
from user_tab_col_statistics a, user_tab_cols b
where a.table_name='TEST_TAB1'
and   a.table_name=b.table_name
and   a.column_name=b.column_name
order by a.column_name;

select a.column_name, a.partition_name, a.num_distinct,
   display_raw(a.low_value,b.data_type) as low_val,
   display_raw(a.high_value,b.data_type) as high_val,
   b.data_type
from user_part_col_statistics a, user_tab_cols b
where a.table_name='TEST_TAB1'
and a.table_name=b.table_name
and a.column_name=b.column_name
and a.partition_name='P_20110206'
order by a.column_name, a.partition_name;

select a.column_name, a.subpartition_name, a.num_distinct,
   display_raw(a.low_value,b.data_type) as low_val,
   display_raw(a.high_value,b.data_type) as high_val,
   b.data_type
from dba_subpart_col_statistics a, dba_tab_cols b
where a.table_name='TEST_TAB1'
and a.table_name=b.table_name
and a.column_name=b.column_name
and a.subpartition_name LIKE 'P_20110206%'
order by a.column_name, a.subpartition_name;

spool off




PARTEX_MSA.SQL
@defs
@init
@default
pause
spool partex_msa.txt

ALTER TABLE TEST_TAB1
ADD PARTITION P_20110209 VALUES LESS THAN (20110210);

DROP TABLE LOAD_TAB1;

CREATE TABLE LOAD_TAB1
AS SELECT * FROM TEST_TAB1 WHERE 1=0;

CREATE UNIQUE INDEX LOAD_TAB1_IX1 ON LOAD_TAB1
(REPORTING_DATE, SOURCE_SYSTEM, SEQ_ID)
NOPARALLEL COMPRESS 1;

-- Note - this is the default parameter value
alter session set "_minimal_stats_aggregation"=TRUE;

INSERT INTO LOAD_TAB1 VALUES (20110209, 'MOSCOW', 400, 'P');
INSERT INTO LOAD_TAB1 VALUES (20110209, 'MOSCOW', 600, 'P');
INSERT INTO LOAD_TAB1 VALUES (20110209, 'MOSCOW', 900, 'Z');

exec dbms_stats.gather_table_stats('TESTUSER', 'LOAD_TAB1', granularity => 'GLOBAL', method_opt => 'FOR ALL COLUMNS SIZE 1', estimate_percent => 1);

ALTER TABLE test_tab1 EXCHANGE SUBPARTITION P_20110209_MOSCOW WITH TABLE load_tab1;
ALTER TABLE test_tab1 MODIFY SUBPARTITION P_20110209_MOSCOW REBUILD UNUSABLE LOCAL INDEXES;

TRUNCATE TABLE LOAD_TAB1;
exec dbms_stats.delete_table_stats('TESTUSER', 'TEST_TAB1')

INSERT INTO LOAD_TAB1 VALUES (20110209, 'LONDON', 400, 'P');
INSERT INTO LOAD_TAB1 VALUES (20110209, 'LONDON', 600, 'P');
INSERT INTO LOAD_TAB1 VALUES (20110209, 'LONDON', 900, 'Z');

exec dbms_stats.gather_table_stats('TESTUSER', 'LOAD_TAB1', granularity => 'GLOBAL', method_opt => 'FOR ALL COLUMNS SIZE 1', estimate_percent => 1);

ALTER TABLE test_tab1 EXCHANGE SUBPARTITION P_20110209_LONDON WITH TABLE load_tab1;
ALTER TABLE test_tab1 MODIFY SUBPARTITION P_20110209_LONDON REBUILD UNUSABLE LOCAL INDEXES;

TRUNCATE TABLE LOAD_TAB1;
exec dbms_stats.delete_table_stats('TESTUSER', 'TEST_TAB1')

INSERT INTO LOAD_TAB1 VALUES (20110209, 'SYDNEY', 400, 'P');
INSERT INTO LOAD_TAB1 VALUES (20110209, 'SYDNEY', 600, 'P');
INSERT INTO LOAD_TAB1 VALUES (20110209, 'SYDNEY', 900, 'Z');

exec dbms_stats.gather_table_stats('TESTUSER', 'LOAD_TAB1', granularity => 'GLOBAL', method_opt => 'FOR ALL COLUMNS SIZE 1', estimate_percent => 1);

ALTER TABLE test_tab1 EXCHANGE SUBPARTITION P_20110209_SYDNEY WITH TABLE load_tab1;
ALTER TABLE test_tab1 MODIFY SUBPARTITION P_20110209_SYDNEY REBUILD UNUSABLE LOCAL INDEXES;

TRUNCATE TABLE LOAD_TAB1;
exec dbms_stats.delete_table_stats('TESTUSER', 'TEST_TAB1')

INSERT INTO LOAD_TAB1 VALUES (20110209, 'CHICAGO', 400, 'P');
INSERT INTO LOAD_TAB1 VALUES (20110209, 'CHICAGO', 600, 'P');
INSERT INTO LOAD_TAB1 VALUES (20110209, 'CHICAGO', 900, 'Z');

exec dbms_stats.gather_table_stats('TESTUSER', 'LOAD_TAB1', granularity => 'GLOBAL', method_opt => 'FOR ALL COLUMNS SIZE 1', estimate_percent => 1);




ALTER TABLE test_tab1 EXCHANGE SUBPARTITION P_20110209_OTHERS WITH TABLE load_tab1;
ALTER TABLE test_tab1 MODIFY SUBPARTITION P_20110209_OTHERS REBUILD UNUSABLE LOCAL INDEXES;

select table_name, global_stats, last_analyzed, num_rows
from user_tables
where table_name='TEST_TAB1';

select table_name, partition_name, global_stats, last_analyzed, num_rows
from user_tab_partitions
where table_name='TEST_TAB1'
order by partition_name;

select table_name, subpartition_name, global_stats, last_analyzed, num_rows
from user_tab_subpartitions
where table_name='TEST_TAB1'
and partition_name='P_20110209'
order by subpartition_name;

select a.column_name, a.num_distinct,
   display_raw(a.low_value,b.data_type) as low_val,
   display_raw(a.high_value,b.data_type) as high_val, b.data_type
from user_tab_col_statistics a, user_tab_cols b
where a.table_name='TEST_TAB1'
and   a.table_name=b.table_name
and   a.column_name=b.column_name
order by a.column_name;

select a.column_name, a.partition_name, a.num_distinct,
   display_raw(a.low_value,b.data_type) as low_val,
   display_raw(a.high_value,b.data_type) as high_val,
   b.data_type
from user_part_col_statistics a, user_tab_cols b
where a.table_name='TEST_TAB1'
and a.table_name=b.table_name
and a.column_name=b.column_name
and a.partition_name='P_20110209'
order by a.column_name, a.partition_name;

select a.column_name, a.subpartition_name, a.num_distinct,
   display_raw(a.low_value,b.data_type) as low_val,
   display_raw(a.high_value,b.data_type) as high_val,
   b.data_type
from dba_subpart_col_statistics a, dba_tab_cols b
where a.table_name='TEST_TAB1'
and a.table_name=b.table_name
and a.column_name=b.column_name
and a.subpartition_name LIKE 'P_20110209%'
order by a.column_name, a.subpartition_name;

spool off




PARTEX_NON_MSA.SQL
@defs
@init
@default
spool partex_non_msa.txt

ALTER TABLE TEST_TAB1
ADD PARTITION P_20110209 VALUES LESS THAN (20110210);

DROP TABLE LOAD_TAB1;

CREATE TABLE LOAD_TAB1
AS SELECT * FROM TEST_TAB1 WHERE 1=0;

CREATE UNIQUE INDEX LOAD_TAB1_IX1 ON LOAD_TAB1
(REPORTING_DATE, SOURCE_SYSTEM, SEQ_ID)
NOPARALLEL COMPRESS 1;

-- Note - this is a non-default parameter value
alter session set "_minimal_stats_aggregation"=FALSE;

INSERT INTO LOAD_TAB1 VALUES (20110209, 'MOSCOW', 400, 'P');
INSERT INTO LOAD_TAB1 VALUES (20110209, 'MOSCOW', 600, 'P');
INSERT INTO LOAD_TAB1 VALUES (20110209, 'MOSCOW', 900, 'Z');

exec dbms_stats.gather_table_stats('TESTUSER', 'LOAD_TAB1', granularity => 'GLOBAL', method_opt => 'FOR ALL COLUMNS SIZE 1', estimate_percent => 1);

ALTER TABLE test_tab1 EXCHANGE SUBPARTITION P_20110209_MOSCOW WITH TABLE load_tab1;
ALTER TABLE test_tab1 MODIFY SUBPARTITION P_20110209_MOSCOW REBUILD UNUSABLE LOCAL INDEXES;

TRUNCATE TABLE LOAD_TAB1;
exec dbms_stats.delete_table_stats('TESTUSER', 'TEST_TAB1')

INSERT INTO LOAD_TAB1 VALUES (20110209, 'LONDON', 400, 'P');
INSERT INTO LOAD_TAB1 VALUES (20110209, 'LONDON', 600, 'P');
INSERT INTO LOAD_TAB1 VALUES (20110209, 'LONDON', 900, 'Z');

exec dbms_stats.gather_table_stats('TESTUSER', 'LOAD_TAB1', granularity => 'GLOBAL', method_opt => 'FOR ALL COLUMNS SIZE 1', estimate_percent => 1);

ALTER TABLE test_tab1 EXCHANGE SUBPARTITION P_20110209_LONDON WITH TABLE load_tab1;
ALTER TABLE test_tab1 MODIFY SUBPARTITION P_20110209_LONDON REBUILD UNUSABLE LOCAL INDEXES;

TRUNCATE TABLE LOAD_TAB1;
exec dbms_stats.delete_table_stats('TESTUSER', 'TEST_TAB1')

INSERT INTO LOAD_TAB1 VALUES (20110209, 'SYDNEY', 400, 'P');
INSERT INTO LOAD_TAB1 VALUES (20110209, 'SYDNEY', 600, 'P');
INSERT INTO LOAD_TAB1 VALUES (20110209, 'SYDNEY', 900, 'Z');

exec dbms_stats.gather_table_stats('TESTUSER', 'LOAD_TAB1', granularity => 'GLOBAL', method_opt => 'FOR ALL COLUMNS SIZE 1', estimate_percent => 1);

ALTER TABLE test_tab1 EXCHANGE SUBPARTITION P_20110209_SYDNEY WITH TABLE load_tab1;
ALTER TABLE test_tab1 MODIFY SUBPARTITION P_20110209_SYDNEY REBUILD UNUSABLE LOCAL INDEXES;

TRUNCATE TABLE LOAD_TAB1;
exec dbms_stats.delete_table_stats('TESTUSER', 'TEST_TAB1')

INSERT INTO LOAD_TAB1 VALUES (20110209, 'CHICAGO', 400, 'P');
INSERT INTO LOAD_TAB1 VALUES (20110209, 'CHICAGO', 600, 'P');
INSERT INTO LOAD_TAB1 VALUES (20110209, 'CHICAGO', 900, 'Z');

exec dbms_stats.gather_table_stats('TESTUSER', 'LOAD_TAB1', granularity => 'GLOBAL', method_opt => 'FOR ALL COLUMNS SIZE 1', estimate_percent => 1);

ALTER TABLE test_tab1 EXCHANGE SUBPARTITION P_20110209_OTHERS WITH TABLE load_tab1;



ALTER TABLE test_tab1 MODIFY SUBPARTITION   P_20110209_OTHERS REBUILD UNUSABLE LOCAL INDEXES;

select table_name, global_stats, last_analyzed, num_rows
from user_tables
where table_name='TEST_TAB1';

select table_name, partition_name, global_stats, last_analyzed, num_rows
from user_tab_partitions
where table_name='TEST_TAB1'
order by partition_name;

select table_name, subpartition_name, global_stats, last_analyzed, num_rows
from user_tab_subpartitions
where table_name='TEST_TAB1'
and partition_name='P_20110209'
order by subpartition_name;

select a.column_name, a.num_distinct,
   display_raw(a.low_value,b.data_type) as low_val,
   display_raw(a.high_value,b.data_type) as high_val, b.data_type
from user_tab_col_statistics a, user_tab_cols b
where a.table_name='TEST_TAB1'
and   a.table_name=b.table_name
and   a.column_name=b.column_name
order by a.column_name;

select a.column_name, a.partition_name, a.num_distinct,
   display_raw(a.low_value,b.data_type) as low_val,
   display_raw(a.high_value,b.data_type) as high_val,
   b.data_type
from user_part_col_statistics a, user_tab_cols b
where a.table_name='TEST_TAB1'
and a.table_name=b.table_name
and a.column_name=b.column_name
and a.partition_name='P_20110209'
order by a.column_name, a.partition_name;

select a.column_name, a.subpartition_name, a.num_distinct,
   display_raw(a.low_value,b.data_type) as low_val,
   display_raw(a.high_value,b.data_type) as high_val,
   b.data_type
from dba_subpart_col_statistics a, dba_tab_cols b
where a.table_name='TEST_TAB1'
and a.table_name=b.table_name
and a.column_name=b.column_name
and a.subpartition_name LIKE 'P_20110209%'
order by a.column_name, a.subpartition_name;

spool off




APPROX_GLOBAL1.SQL
@defs
spool approx_global1.txt

TRUNCATE TABLE TEST_TAB1;

exec dbms_stats.delete_table_stats('TESTUSER', 'TEST_TAB1')

INSERT INTO    TEST_TAB1   VALUES   (20110201,   'MOSCOW',   1000, 'P');
INSERT INTO    TEST_TAB1   VALUES   (20110201,   'MOSCOW',   2000, 'U');
INSERT INTO    TEST_TAB1   VALUES   (20110201,   'MOSCOW',   4000, 'P');
INSERT INTO    TEST_TAB1   VALUES   (20110201,   'MOSCOW',   60000, 'P');
INSERT INTO    TEST_TAB1   VALUES   (20110201,   'MOSCOW',   100, 'P');
INSERT INTO    TEST_TAB1   VALUES   (20110201,   'MOSCOW',   10, 'P');
INSERT INTO    TEST_TAB1   VALUES   (20110201,   'MOSCOW',   300, 'P');
INSERT INTO    TEST_TAB1   VALUES   (20110201,   'LONDON',   5000, 'U');
INSERT INTO    TEST_TAB1   VALUES   (20110201,   'DALLAS',   6000, 'P');
INSERT INTO    TEST_TAB1   VALUES   (20110201,   'SYDNEY',   7000, 'P');
INSERT INTO    TEST_TAB1   VALUES   (20110201,   'MOSCOW',   8000, 'P');
COMMIT;

exec dbms_stats.gather_table_stats(ownname => 'TESTUSER', tabname => 'TEST_TAB1', granularity => 'APPROX_GLOBAL AND PARTITION');

exec dbms_stats.gather_table_stats(ownname => 'TESTUSER', tabname => 'TEST_TAB1', granularity => 'SUBPARTITION', partname => 'P_20110201_MOSCOW');

exec dbms_stats.gather_table_stats(ownname => 'TESTUSER', tabname => 'TEST_TAB1', granularity => 'SUBPARTITION', partname => 'P_20110201_LONDON');

exec dbms_stats.gather_table_stats(ownname => 'TESTUSER', tabname => 'TEST_TAB1', granularity => 'SUBPARTITION', partname => 'P_20110201_SYDNEY');

exec dbms_stats.gather_table_stats(ownname => 'TESTUSER', tabname => 'TEST_TAB1', granularity => 'SUBPARTITION', partname => 'P_20110201_OTHERS');

select table_name, global_stats, last_analyzed, num_rows
from user_tables
where table_name='TEST_TAB1';

select table_name, partition_name, global_stats, last_analyzed, num_rows
from user_tab_partitions
where table_name='TEST_TAB1'
order by partition_name;

select table_name, subpartition_name, global_stats, last_analyzed, num_rows
from user_tab_subpartitions
where table_name='TEST_TAB1'
and partition_name='P_20110201'
order by subpartition_name;

select a.column_name, a.num_distinct,
   display_raw(a.low_value,b.data_type) as low_val,
   display_raw(a.high_value,b.data_type) as high_val, b.data_type
from user_tab_col_statistics a, user_tab_cols b
where a.table_name='TEST_TAB1'
and   a.table_name=b.table_name
and   a.column_name=b.column_name
order by a.column_name;

select a.column_name, a.partition_name, a.num_distinct,
   display_raw(a.low_value,b.data_type) as low_val,
   display_raw(a.high_value,b.data_type) as high_val,
   b.data_type
from user_part_col_statistics a, user_tab_cols b
where a.table_name='TEST_TAB1'
and a.table_name=b.table_name
and a.column_name=b.column_name
and a.partition_name='P_20110201'
order by a.column_name, a.partition_name;

select a.column_name, a.subpartition_name, a.num_distinct,
   display_raw(a.low_value,b.data_type) as low_val,
   display_raw(a.high_value,b.data_type) as high_val,
   b.data_type
from dba_subpart_col_statistics a, dba_tab_cols b
where a.table_name='TEST_TAB1'
and a.table_name=b.table_name
and a.column_name=b.column_name
and a.subpartition_name LIKE 'P_20110201%'
order by a.column_name, a.subpartition_name;

INSERT INTO    TEST_TAB1   VALUES   (20110202,   'CHICAGO',   1000, 'P');
INSERT INTO    TEST_TAB1   VALUES   (20110202,   'CHICAGO',   2000, 'U');
INSERT INTO    TEST_TAB1   VALUES   (20110202,   'CHICAGO',   4000, 'P');
INSERT INTO    TEST_TAB1   VALUES   (20110202,   'CHICAGO',   60000, 'N');
INSERT INTO    TEST_TAB1   VALUES   (20110202,   'CHICAGO',   100, 'P');
INSERT INTO    TEST_TAB1   VALUES   (20110202,   'CHICAGO',   10, 'N');
INSERT INTO    TEST_TAB1   VALUES   (20110202,   'CHICAGO',   300, 'N');
INSERT INTO    TEST_TAB1   VALUES   (20110202,   'CHICAGO',   5000, 'U');
INSERT INTO    TEST_TAB1   VALUES   (20110202,   'CHICAGO',   6000, 'P');
INSERT INTO    TEST_TAB1   VALUES   (20110202,   'CHICAGO',   7000, 'P');
INSERT INTO    TEST_TAB1   VALUES   (20110202,   'CHICAGO',   8000, 'P');
COMMIT;

exec dbms_stats.gather_table_stats(ownname => 'TESTUSER', tabname => 'TEST_TAB1', granularity => 'SUBPARTITION', partname => 'P_20110202_OTHERS');

exec dbms_stats.gather_table_stats(ownname => 'TESTUSER', tabname => 'TEST_TAB1', granularity => 'SUBPARTITION', partname => 'P_20110202_MOSCOW');

exec dbms_stats.gather_table_stats(ownname => 'TESTUSER', tabname => 'TEST_TAB1', granularity => 'SUBPARTITION', partname => 'P_20110202_LONDON');

exec dbms_stats.gather_table_stats(ownname => 'TESTUSER', tabname => 'TEST_TAB1', granularity => 'SUBPARTITION', partname => 'P_20110202_SYDNEY');

select table_name, global_stats, last_analyzed, num_rows
from dba_tables
where table_name='TEST_TAB1'
and owner='TESTUSER'
order by 1, 2, 4 desc nulls last;

select table_name, global_stats, last_analyzed, num_rows
from user_tables
where table_name='TEST_TAB1';

select table_name, partition_name, global_stats, last_analyzed, num_rows
from user_tab_partitions
where table_name='TEST_TAB1'
order by partition_name;

select table_name, subpartition_name, global_stats, last_analyzed, num_rows
from user_tab_subpartitions
where table_name='TEST_TAB1'
and partition_name='P_20110202'
order by subpartition_name;

select a.column_name, a.num_distinct,
   display_raw(a.low_value,b.data_type) as low_val,
   display_raw(a.high_value,b.data_type) as high_val, b.data_type
from user_tab_col_statistics a, user_tab_cols b
where a.table_name='TEST_TAB1'
and   a.table_name=b.table_name
and   a.column_name=b.column_name
order by a.column_name;

select a.column_name, a.partition_name, a.num_distinct,
   display_raw(a.low_value,b.data_type) as low_val,
   display_raw(a.high_value,b.data_type) as high_val,
   b.data_type
from user_part_col_statistics a, user_tab_cols b
where a.table_name='TEST_TAB1'
and a.table_name=b.table_name
and a.column_name=b.column_name
and a.partition_name='P_20110202'
order by a.column_name, a.partition_name;

select a.column_name, a.subpartition_name, a.num_distinct,
   display_raw(a.low_value,b.data_type) as low_val,
   display_raw(a.high_value,b.data_type) as high_val,
   b.data_type
from dba_subpart_col_statistics a, dba_tab_cols b
where a.table_name='TEST_TAB1'
and a.table_name=b.table_name
and a.column_name=b.column_name
and a.subpartition_name LIKE 'P_20110202%'
order by a.column_name, a.subpartition_name;

exec dbms_stats.gather_table_stats(ownname => 'TESTUSER', tabname => 'TEST_TAB1', granularity => 'APPROX_GLOBAL AND PARTITION', partname => 'P_20110202');

select table_name, global_stats, last_analyzed, num_rows
from dba_tables
where table_name='TEST_TAB1'
and owner='TESTUSER'
order by 1, 2, 4 desc nulls last;

select table_name, global_stats, last_analyzed, num_rows
from user_tables
where table_name='TEST_TAB1';

select table_name, partition_name, global_stats, last_analyzed, num_rows
from user_tab_partitions
where table_name='TEST_TAB1'
order by partition_name;

select table_name, subpartition_name, global_stats, last_analyzed, num_rows
from user_tab_subpartitions
where table_name='TEST_TAB1'
and partition_name='P_20110202'
order by subpartition_name;

select a.column_name, a.num_distinct,
   display_raw(a.low_value,b.data_type) as low_val,
   display_raw(a.high_value,b.data_type) as high_val, b.data_type
from user_tab_col_statistics a, user_tab_cols b
where a.table_name='TEST_TAB1'
and   a.table_name=b.table_name
and   a.column_name=b.column_name
order by a.column_name;

select a.column_name, a.partition_name, a.num_distinct,
   display_raw(a.low_value,b.data_type) as low_val,
   display_raw(a.high_value,b.data_type) as high_val,
   b.data_type
from user_part_col_statistics a, user_tab_cols b
where a.table_name='TEST_TAB1'
and a.table_name=b.table_name
and a.column_name=b.column_name
and a.partition_name='P_20110202'
order by a.column_name, a.partition_name;

select a.column_name, a.subpartition_name, a.num_distinct,
   display_raw(a.low_value,b.data_type) as low_val,
   display_raw(a.high_value,b.data_type) as high_val,
   b.data_type
from dba_subpart_col_statistics a, dba_tab_cols b
where a.table_name='TEST_TAB1'
and a.table_name=b.table_name
and a.column_name=b.column_name
and a.subpartition_name LIKE 'P_20110202%'
order by a.column_name, a.subpartition_name;

INSERT INTO    TEST_TAB1   VALUES   (20110203,   'DALLAS',   1000, 'P');
INSERT INTO    TEST_TAB1   VALUES   (20110203,   'DALLAS',   2000, 'U');
INSERT INTO    TEST_TAB1   VALUES   (20110203,   'DALLAS',   4000, 'P');
INSERT INTO    TEST_TAB1   VALUES   (20110203,   'DALLAS',   60000, 'N');
INSERT INTO    TEST_TAB1   VALUES   (20110203,   'DALLAS',   100, 'P');
INSERT INTO    TEST_TAB1   VALUES   (20110203,   'DALLAS',   10, 'N');
INSERT INTO    TEST_TAB1   VALUES   (20110203,   'DALLAS',   300, 'N');
INSERT INTO    TEST_TAB1   VALUES   (20110203,   'DALLAS',   5000, 'U');
INSERT INTO    TEST_TAB1   VALUES   (20110203,   'DALLAS',   6000, 'P');
INSERT INTO    TEST_TAB1   VALUES   (20110203,   'DALLAS',   7000, 'P');
INSERT INTO    TEST_TAB1   VALUES   (20110203,   'DALLAS',   8000, 'P');
COMMIT;

exec dbms_stats.gather_table_stats(ownname => 'TESTUSER', tabname => 'TEST_TAB1', granularity => 'APPROX_GLOBAL AND PARTITION', partname => 'P_20110203');

select table_name, global_stats, last_analyzed, num_rows
from user_tables
where table_name='TEST_TAB1';

select table_name, partition_name, global_stats, last_analyzed, num_rows
from user_tab_partitions
where table_name='TEST_TAB1'
order by partition_name;

select table_name, subpartition_name, global_stats, last_analyzed, num_rows
from user_tab_subpartitions
where table_name='TEST_TAB1'
and partition_name='P_20110203'
order by subpartition_name;

select a.column_name, a.num_distinct,
   display_raw(a.low_value,b.data_type) as low_val,
   display_raw(a.high_value,b.data_type) as high_val, b.data_type
from user_tab_col_statistics a, user_tab_cols b
where a.table_name='TEST_TAB1'
and   a.table_name=b.table_name
and   a.column_name=b.column_name
order by a.column_name;

select a.column_name, a.partition_name, a.num_distinct,
   display_raw(a.low_value,b.data_type) as low_val,
   display_raw(a.high_value,b.data_type) as high_val,
   b.data_type
from user_part_col_statistics a, user_tab_cols b
where a.table_name='TEST_TAB1'
and a.table_name=b.table_name
and a.column_name=b.column_name
and a.partition_name='P_20110203'
order by a.column_name, a.partition_name;

select a.column_name, a.subpartition_name, a.num_distinct,
   display_raw(a.low_value,b.data_type) as low_val,
   display_raw(a.high_value,b.data_type) as high_val,
   b.data_type
from dba_subpart_col_statistics a, dba_tab_cols b
where a.table_name='TEST_TAB1'
and a.table_name=b.table_name
and a.column_name=b.column_name
and a.subpartition_name LIKE 'P_20110203%'
order by a.column_name, a.subpartition_name;

spool off




APPROX_GLOBAL2.SQL
@defs
spool approx_global2.txt

TRUNCATE TABLE TEST_TAB1;

exec dbms_stats.delete_table_stats('TESTUSER', 'TEST_TAB1')

INSERT INTO    TEST_TAB1   VALUES   (20110201,   'MOSCOW',   1000, 'P');
INSERT INTO    TEST_TAB1   VALUES   (20110201,   'MOSCOW',   2000, 'U');
INSERT INTO    TEST_TAB1   VALUES   (20110201,   'MOSCOW',   4000, 'P');
INSERT INTO    TEST_TAB1   VALUES   (20110201,   'MOSCOW',   60000, 'P');
INSERT INTO    TEST_TAB1   VALUES   (20110201,   'MOSCOW',   100, 'P');
INSERT INTO    TEST_TAB1   VALUES   (20110201,   'MOSCOW',   10, 'P');
INSERT INTO    TEST_TAB1   VALUES   (20110201,   'MOSCOW',   300, 'P');
INSERT INTO    TEST_TAB1   VALUES   (20110201,   'LONDON',   5000, 'U');
INSERT INTO    TEST_TAB1   VALUES   (20110201,   'DALLAS',   6000, 'P');
INSERT INTO    TEST_TAB1   VALUES   (20110201,   'SYDNEY',   7000, 'P');
INSERT INTO    TEST_TAB1   VALUES   (20110201,   'MOSCOW',   8000, 'P');
COMMIT;

exec dbms_stats.gather_table_stats(ownname => 'TESTUSER', tabname => 'TEST_TAB1');

select table_name, global_stats, last_analyzed, num_rows
from user_tables
where table_name='TEST_TAB1';

select table_name, partition_name, global_stats, last_analyzed, num_rows
from user_tab_partitions
where table_name='TEST_TAB1'
order by partition_name;

select table_name, subpartition_name, global_stats, last_analyzed, num_rows
from user_tab_subpartitions
where table_name='TEST_TAB1'
and partition_name='P_20110201'
order by subpartition_name;

select a.column_name, a.num_distinct,
   display_raw(a.low_value,b.data_type) as low_val,
   display_raw(a.high_value,b.data_type) as high_val, b.data_type
from user_tab_col_statistics a, user_tab_cols b
where a.table_name='TEST_TAB1'
and   a.table_name=b.table_name
and   a.column_name=b.column_name
order by a.column_name;

select a.column_name, a.partition_name, a.num_distinct,
   display_raw(a.low_value,b.data_type) as low_val,
   display_raw(a.high_value,b.data_type) as high_val,
   b.data_type
from user_part_col_statistics a, user_tab_cols b
where a.table_name='TEST_TAB1'
and a.table_name=b.table_name
and a.column_name=b.column_name
and a.partition_name='P_20110201'
order by a.column_name, a.partition_name;

select a.column_name, a.subpartition_name, a.num_distinct,
   display_raw(a.low_value,b.data_type) as low_val,
   display_raw(a.high_value,b.data_type) as high_val,
   b.data_type
from dba_subpart_col_statistics a, dba_tab_cols b
where a.table_name='TEST_TAB1'
and a.table_name=b.table_name
and a.column_name=b.column_name
and a.subpartition_name LIKE 'P_20110201%'
order by a.column_name, a.subpartition_name;



INSERT INTO    TEST_TAB1   VALUES   (20110202,   'CHICAGO',   1000, 'P');
INSERT INTO    TEST_TAB1   VALUES   (20110202,   'CHICAGO',   2000, 'U');
INSERT INTO    TEST_TAB1   VALUES   (20110202,   'CHICAGO',   4000, 'P');
INSERT INTO    TEST_TAB1   VALUES   (20110202,   'CHICAGO',   60000, 'N');
INSERT INTO    TEST_TAB1   VALUES   (20110202,   'CHICAGO',   100, 'P');
INSERT INTO    TEST_TAB1   VALUES   (20110202,   'CHICAGO',   10, 'N');
INSERT INTO    TEST_TAB1   VALUES   (20110202,   'CHICAGO',   300, 'N');
INSERT INTO    TEST_TAB1   VALUES   (20110202,   'CHICAGO',   5000, 'U');
INSERT INTO    TEST_TAB1   VALUES   (20110202,   'CHICAGO',   6000, 'P');
INSERT INTO    TEST_TAB1   VALUES   (20110202,   'CHICAGO',   7000, 'P');
INSERT INTO    TEST_TAB1   VALUES   (20110202,   'CHICAGO',   8000, 'P');
COMMIT;

exec dbms_stats.gather_table_stats(ownname => 'TESTUSER', tabname => 'TEST_TAB1', granularity => 'APPROX_GLOBAL AND PARTITION', partname => 'P_20110202');

select table_name, global_stats, last_analyzed, num_rows
from dba_tables
where table_name='TEST_TAB1'
and owner='TESTUSER'
order by 1, 2, 4 desc nulls last;

select table_name, global_stats, last_analyzed, num_rows
from user_tables
where table_name='TEST_TAB1';

select table_name, partition_name, global_stats, last_analyzed, num_rows
from user_tab_partitions
where table_name='TEST_TAB1'
order by partition_name;

select table_name, subpartition_name, global_stats, last_analyzed, num_rows
from user_tab_subpartitions
where table_name='TEST_TAB1'
and partition_name='P_20110202'
order by subpartition_name;

select a.column_name, a.num_distinct,
   display_raw(a.low_value,b.data_type) as low_val,
   display_raw(a.high_value,b.data_type) as high_val, b.data_type
from user_tab_col_statistics a, user_tab_cols b
where a.table_name='TEST_TAB1'
and   a.table_name=b.table_name
and   a.column_name=b.column_name
order by a.column_name;

select a.column_name, a.partition_name, a.num_distinct,
   display_raw(a.low_value,b.data_type) as low_val,
   display_raw(a.high_value,b.data_type) as high_val,
   b.data_type
from user_part_col_statistics a, user_tab_cols b
where a.table_name='TEST_TAB1'
and a.table_name=b.table_name
and a.column_name=b.column_name
and a.partition_name='P_20110202'
order by a.column_name, a.partition_name;

select a.column_name, a.subpartition_name, a.num_distinct,
   display_raw(a.low_value,b.data_type) as low_val,
   display_raw(a.high_value,b.data_type) as high_val,
   b.data_type
from dba_subpart_col_statistics a, dba_tab_cols b
where a.table_name='TEST_TAB1'
and a.table_name=b.table_name
and a.column_name=b.column_name
and a.subpartition_name LIKE 'P_20110202%'
order by a.column_name, a.subpartition_name;


INSERT INTO TEST_TAB1 VALUES (20110203, 'DALLAS', 1000, 'P');



INSERT INTO     TEST_TAB1   VALUES   (20110203,   'DALLAS',   2000, 'U');
INSERT INTO     TEST_TAB1   VALUES   (20110203,   'DALLAS',   4000, 'P');
INSERT INTO     TEST_TAB1   VALUES   (20110203,   'DALLAS',   60000, 'N');
INSERT INTO     TEST_TAB1   VALUES   (20110203,   'DALLAS',   100, 'P');
INSERT INTO     TEST_TAB1   VALUES   (20110203,   'DALLAS',   10, 'N');
INSERT INTO     TEST_TAB1   VALUES   (20110203,   'DALLAS',   300, 'N');
INSERT INTO     TEST_TAB1   VALUES   (20110203,   'DALLAS',   5000, 'U');
INSERT INTO     TEST_TAB1   VALUES   (20110203,   'DALLAS',   6000, 'P');
INSERT INTO     TEST_TAB1   VALUES   (20110203,   'DALLAS',   7000, 'P');
INSERT INTO     TEST_TAB1   VALUES   (20110203,   'DALLAS',   8000, 'P');
COMMIT;

exec dbms_stats.gather_table_stats(ownname => 'TESTUSER', tabname => 'TEST_TAB1', granularity => 'DEFAULT');

select table_name, global_stats, last_analyzed, num_rows
from user_tables
where table_name='TEST_TAB1';

select table_name, partition_name, global_stats, last_analyzed, num_rows
from user_tab_partitions
where table_name='TEST_TAB1'
order by partition_name;

select table_name, subpartition_name, global_stats, last_analyzed, num_rows
from user_tab_subpartitions
where table_name='TEST_TAB1'
and partition_name='P_20110203'
order by subpartition_name;

select a.column_name, a.num_distinct,
   display_raw(a.low_value,b.data_type) as low_val,
   display_raw(a.high_value,b.data_type) as high_val, b.data_type
from user_tab_col_statistics a, user_tab_cols b
where a.table_name='TEST_TAB1'
and   a.table_name=b.table_name
and   a.column_name=b.column_name
order by a.column_name;

select a.column_name, a.partition_name, a.num_distinct,
   display_raw(a.low_value,b.data_type) as low_val,
   display_raw(a.high_value,b.data_type) as high_val,
   b.data_type
from user_part_col_statistics a, user_tab_cols b
where a.table_name='TEST_TAB1'
and a.table_name=b.table_name
and a.column_name=b.column_name
and a.partition_name='P_20110203'
order by a.column_name, a.partition_name;

select a.column_name, a.subpartition_name, a.num_distinct,
   display_raw(a.low_value,b.data_type) as low_val,
   display_raw(a.high_value,b.data_type) as high_val,
   b.data_type
from dba_subpart_col_statistics a, dba_tab_cols b
where a.table_name='TEST_TAB1'
and a.table_name=b.table_name
and a.column_name=b.column_name
and a.subpartition_name LIKE 'P_20110203%'
order by a.column_name, a.subpartition_name;


INSERT   INTO   TEST_TAB1   VALUES   (20110204,   'CHICAGO',   1000, 'P');
INSERT   INTO   TEST_TAB1   VALUES   (20110204,   'CHICAGO',   2000, 'U');
INSERT   INTO   TEST_TAB1   VALUES   (20110204,   'CHICAGO',   4000, 'P');
INSERT   INTO   TEST_TAB1   VALUES   (20110204,   'CHICAGO',   400, 'N');
INSERT   INTO   TEST_TAB1   VALUES   (20110204,   'CHICAGO',   100, 'P');
INSERT   INTO   TEST_TAB1   VALUES   (20110204,   'CHICAGO',   10, 'Z');
INSERT   INTO   TEST_TAB1   VALUES   (20110204,   'CHICAGO',   300, 'Z');
INSERT   INTO   TEST_TAB1   VALUES   (20110204,   'CHICAGO',   5000, 'U');
INSERT   INTO   TEST_TAB1   VALUES   (20110204,   'CHICAGO',   6000, 'P');
INSERT   INTO   TEST_TAB1   VALUES   (20110204,   'CHICAGO',   1, 'P');



INSERT INTO TEST_TAB1 VALUES (20110204, 'CHICAGO', 80000, 'P');
COMMIT;

exec dbms_stats.gather_table_stats(ownname => 'TESTUSER', tabname => 'TEST_TAB1', granularity => 'APPROX_GLOBAL AND PARTITION', partname => 'P_20110204');

select table_name, global_stats, last_analyzed, num_rows
from user_tables
where table_name='TEST_TAB1';

select table_name, partition_name, global_stats, last_analyzed, num_rows
from user_tab_partitions
where table_name='TEST_TAB1'
order by partition_name;

select table_name, subpartition_name, global_stats, last_analyzed, num_rows
from user_tab_subpartitions
where table_name='TEST_TAB1'
and partition_name='P_20110204'
order by subpartition_name;

select a.column_name, a.num_distinct,
   display_raw(a.low_value,b.data_type) as low_val,
   display_raw(a.high_value,b.data_type) as high_val, b.data_type
from user_tab_col_statistics a, user_tab_cols b
where a.table_name='TEST_TAB1'
and   a.table_name=b.table_name
and   a.column_name=b.column_name
order by a.column_name;

select a.column_name, a.partition_name, a.num_distinct,
   display_raw(a.low_value,b.data_type) as low_val,
   display_raw(a.high_value,b.data_type) as high_val,
   b.data_type
from user_part_col_statistics a, user_tab_cols b
where a.table_name='TEST_TAB1'
and a.table_name=b.table_name
and a.column_name=b.column_name
and a.partition_name='P_20110204'
order by a.column_name, a.partition_name;

select a.column_name, a.subpartition_name, a.num_distinct,
   display_raw(a.low_value,b.data_type) as low_val,
   display_raw(a.high_value,b.data_type) as high_val,
   b.data_type
from dba_subpart_col_statistics a, dba_tab_cols b
where a.table_name='TEST_TAB1'
and a.table_name=b.table_name
and a.column_name=b.column_name
and a.subpartition_name LIKE 'P_20110204%'
order by a.column_name, a.subpartition_name;

SPOOL OFF




COPY_STATS.SQL
@defs
@default
spool copy_stats.txt

ALTER TABLE TEST_TAB1
ADD PARTITION P_20110211 VALUES LESS THAN (20110212);

TRUNCATE TABLE LOAD_TAB1;
exec dbms_stats.delete_table_stats('TESTUSER', 'LOAD_TAB1');

INSERT   INTO   LOAD_TAB1   VALUES   (20110211,   'MOSCOW',   1000, 'P');
INSERT   INTO   LOAD_TAB1   VALUES   (20110211,   'MOSCOW',   30000, 'P');
INSERT   INTO   LOAD_TAB1   VALUES   (20110211,   'MOSCOW',   2000, 'P');
INSERT   INTO   LOAD_TAB1   VALUES   (20110211,   'MOSCOW',   10000, 'N');
INSERT   INTO   LOAD_TAB1   VALUES   (20110211,   'MOSCOW',   2400, 'P');
INSERT   INTO   LOAD_TAB1   VALUES   (20110211,   'MOSCOW',   500, 'P');
INSERT   INTO   LOAD_TAB1   VALUES   (20110211,   'MOSCOW',   1200, 'P');
INSERT   INTO   LOAD_TAB1   VALUES   (20110211,   'MOSCOW',   400, 'P');
INSERT   INTO   LOAD_TAB1   VALUES   (20110211,   'MOSCOW',   600, 'P');
INSERT   INTO   LOAD_TAB1   VALUES   (20110211,   'MOSCOW',   700, 'Z');

COMMIT;

ALTER TABLE test_tab1 EXCHANGE SUBPARTITION P_20110211_MOSCOW WITH TABLE load_tab1;
ALTER TABLE test_tab1 MODIFY SUBPARTITION P_20110211_MOSCOW REBUILD UNUSABLE LOCAL INDEXES;

exec dbms_stats.copy_table_stats(ownname => 'TESTUSER', tabname => 'TEST_TAB1', srcpartname => 'P_20110201', dstpartname => 'P_20110211');

select table_name, global_stats, last_analyzed, num_rows
from user_tables
where table_name='TEST_TAB1';

select table_name, partition_name, global_stats, last_analyzed, num_rows
from user_tab_partitions
where table_name='TEST_TAB1'
order by partition_name;

select table_name, subpartition_name, global_stats, last_analyzed, num_rows
from user_tab_subpartitions
where table_name='TEST_TAB1'
and partition_name IN ('P_20110201', 'P_20110211')
order by subpartition_name;

select a.column_name, a.num_distinct,
   display_raw(a.low_value,b.data_type) as low_val,
   display_raw(a.high_value,b.data_type) as high_val
from user_tab_col_statistics a, user_tab_cols b
where a.table_name='TEST_TAB1'
and   a.table_name=b.table_name
and   a.column_name=b.column_name
order by a.column_name;

select a.column_name, a.partition_name, a.num_distinct,
   display_raw(a.low_value,b.data_type) as low_val,
   display_raw(a.high_value,b.data_type) as high_val
from user_part_col_statistics a, user_tab_cols b
where a.table_name='TEST_TAB1'
and a.table_name=b.table_name
and a.column_name=b.column_name
and a.partition_name IN ('P_20110201', 'P_20110211')
order by a.column_name, a.partition_name;

select a.column_name, a.subpartition_name, a.num_distinct,
   display_raw(a.low_value,b.data_type) as low_val,
   display_raw(a.high_value,b.data_type) as high_val
from dba_subpart_col_statistics a, dba_tab_cols b
where a.table_name='TEST_TAB1'
and a.table_name=b.table_name
and a.column_name=b.column_name
and (a.subpartition_name LIKE 'P_20110201%'
or a.subpartition_name LIKE 'P_20110211%')
order by a.column_name, a.subpartition_name;

ALTER TABLE TEST_TAB1
ADD PARTITION P_20110212 VALUES LESS THAN (20110213);

TRUNCATE TABLE LOAD_TAB1;
exec dbms_stats.delete_table_stats('TESTUSER', 'LOAD_TAB1');

INSERT   INTO   LOAD_TAB1   VALUES   (20110212,   'MOSCOW',   1000, 'P');
INSERT   INTO   LOAD_TAB1   VALUES   (20110212,   'MOSCOW',   30000, 'P');
INSERT   INTO   LOAD_TAB1   VALUES   (20110212,   'MOSCOW',   2000, 'P');
INSERT   INTO   LOAD_TAB1   VALUES   (20110212,   'MOSCOW',   10000, 'N');
INSERT   INTO   LOAD_TAB1   VALUES   (20110212,   'MOSCOW',   2400, 'P');
INSERT   INTO   LOAD_TAB1   VALUES   (20110212,   'MOSCOW',   500, 'P');
INSERT   INTO   LOAD_TAB1   VALUES   (20110212,   'MOSCOW',   1200, 'P');
INSERT   INTO   LOAD_TAB1   VALUES   (20110212,   'MOSCOW',   400, 'P');
INSERT   INTO   LOAD_TAB1   VALUES   (20110212,   'MOSCOW',   600, 'P');
INSERT   INTO   LOAD_TAB1   VALUES   (20110212,   'MOSCOW',   700, 'Z');

COMMIT;

ALTER TABLE test_tab1 EXCHANGE SUBPARTITION P_20110212_MOSCOW WITH TABLE load_tab1;
ALTER TABLE test_tab1 MODIFY SUBPARTITION P_20110212_MOSCOW REBUILD UNUSABLE LOCAL INDEXES;

exec dbms_stats.copy_table_stats(ownname => 'TESTUSER', tabname => 'TEST_TAB1', srcpartname => 'P_20110201_MOSCOW', dstpartname => 'P_20110212_MOSCOW');

exec dbms_stats.copy_table_stats(ownname => 'TESTUSER', tabname => 'TEST_TAB1', srcpartname => 'P_20110201_LONDON', dstpartname => 'P_20110212_LONDON');

exec dbms_stats.copy_table_stats(ownname => 'TESTUSER', tabname => 'TEST_TAB1', srcpartname => 'P_20110201_SYDNEY', dstpartname => 'P_20110212_SYDNEY');

exec dbms_stats.copy_table_stats(ownname => 'TESTUSER', tabname => 'TEST_TAB1', srcpartname => 'P_20110201_OTHERS', dstpartname => 'P_20110212_OTHERS');

connect TESTUSER/TESTUSER

select table_name, global_stats, last_analyzed, num_rows
from user_tables
where table_name='TEST_TAB1';

select table_name, partition_name, global_stats, last_analyzed, num_rows
from user_tab_partitions
where table_name='TEST_TAB1'
order by partition_name;

select table_name, subpartition_name, global_stats, last_analyzed, num_rows
from user_tab_subpartitions
where table_name='TEST_TAB1'
and partition_name IN ('P_20110201', 'P_20110212')
order by subpartition_name;

select a.column_name, a.num_distinct,
   display_raw(a.low_value,b.data_type) as low_val,
   display_raw(a.high_value,b.data_type) as high_val
from user_tab_col_statistics a, user_tab_cols b
where a.table_name='TEST_TAB1'
and   a.table_name=b.table_name
and   a.column_name=b.column_name
order by a.column_name;

select a.column_name, a.partition_name, a.num_distinct,
   display_raw(a.low_value,b.data_type) as low_val,
   display_raw(a.high_value,b.data_type) as high_val
from user_part_col_statistics a, user_tab_cols b
where a.table_name='TEST_TAB1'
and a.table_name=b.table_name
and a.column_name=b.column_name
and a.partition_name IN ('P_20110201', 'P_20110212')
order by a.column_name, a.partition_name;

select a.column_name, a.subpartition_name, a.num_distinct,
   display_raw(a.low_value,b.data_type) as low_val,
   display_raw(a.high_value,b.data_type) as high_val
from dba_subpart_col_statistics a, dba_tab_cols b
where a.table_name='TEST_TAB1'
and a.table_name=b.table_name
and a.column_name=b.column_name
and (a.subpartition_name LIKE 'P_20110201%'
or a.subpartition_name LIKE 'P_20110212%')
order by a.column_name, a.subpartition_name;

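-- Now copy the partition-level statistics from P_20110201 to P_20110212 and re-check the results.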
exec dbms_stats.copy_table_stats(ownname => 'TESTUSER', tabname => 'TEST_TAB1', srcpartname => 'P_20110201',
dstpartname => 'P_20110212');

select table_name, global_stats, last_analyzed, num_rows
from user_tables
where table_name='TEST_TAB1';

select table_name, partition_name, global_stats, last_analyzed, num_rows
from user_tab_partitions
where table_name='TEST_TAB1'
order by partition_name;

select table_name, subpartition_name, global_stats, last_analyzed, num_rows
from user_tab_subpartitions
where table_name='TEST_TAB1'
and partition_name IN ('P_20110201', 'P_20110212')
order by subpartition_name;

select a.column_name, a.num_distinct,
   display_raw(a.low_value,b.data_type) as low_val,
   display_raw(a.high_value,b.data_type) as high_val
from user_tab_col_statistics a, user_tab_cols b
where a.table_name='TEST_TAB1'
and   a.table_name=b.table_name
and   a.column_name=b.column_name
order by a.column_name;

select a.column_name, a.partition_name, a.num_distinct,
   display_raw(a.low_value,b.data_type) as low_val,
   display_raw(a.high_value,b.data_type) as high_val
from user_part_col_statistics a, user_tab_cols b
where a.table_name='TEST_TAB1'
and a.table_name=b.table_name
and a.column_name=b.column_name
and a.partition_name IN ('P_20110201', 'P_20110212')
order by a.column_name, a.partition_name;

select a.column_name, a.subpartition_name, a.num_distinct,
   display_raw(a.low_value,b.data_type) as low_val,
   display_raw(a.high_value,b.data_type) as high_val
from dba_subpart_col_statistics a, dba_tab_cols b
where a.table_name='TEST_TAB1'
and a.table_name=b.table_name
and a.column_name=b.column_name
and (a.subpartition_name LIKE 'P_20110201%'
or a.subpartition_name LIKE 'P_20110212%')
order by a.column_name, a.subpartition_name;

spool off

INCREMENTAL.SQL
@defs
spool incremental.txt

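-- Clear out TEST_TAB1 and its statistics, then set the table-level INCREMENTAL preference so
-- that subsequent gathers can maintain global statistics from partition synopses.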
TRUNCATE TABLE TEST_TAB1;

exec dbms_stats.delete_table_stats('TESTUSER', 'TEST_TAB1');
exec dbms_stats.set_table_prefs('TESTUSER','TEST_TAB1','INCREMENTAL','TRUE');

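-- Day 1: load rows for 20110201 across several source systems.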
INSERT INTO    TEST_TAB1   VALUES   (20110201,   'MOSCOW',   1000, 'P');
INSERT INTO    TEST_TAB1   VALUES   (20110201,   'MOSCOW',   2000, 'U');
INSERT INTO    TEST_TAB1   VALUES   (20110201,   'MOSCOW',   4000, 'P');
INSERT INTO    TEST_TAB1   VALUES   (20110201,   'MOSCOW',   60000, 'P');
INSERT INTO    TEST_TAB1   VALUES   (20110201,   'MOSCOW',   100, 'P');
INSERT INTO    TEST_TAB1   VALUES   (20110201,   'MOSCOW',   10, 'P');
INSERT INTO    TEST_TAB1   VALUES   (20110201,   'MOSCOW',   300, 'P');
INSERT INTO    TEST_TAB1   VALUES   (20110201,   'LONDON',   5000, 'U');
INSERT INTO    TEST_TAB1   VALUES   (20110201,   'DALLAS',   6000, 'P');
INSERT INTO    TEST_TAB1   VALUES   (20110201,   'SYDNEY',   7000, 'P');
INSERT INTO    TEST_TAB1   VALUES   (20110201,   'MOSCOW',   8000, 'P');
COMMIT;

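-- Initial gather: with INCREMENTAL=TRUE this should also create synopses for each (sub)partition.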
exec dbms_stats.gather_table_stats(ownname => 'TESTUSER', tabname => 'TEST_TAB1');

select table_name, global_stats, last_analyzed, num_rows
from user_tables
where table_name='TEST_TAB1';

select table_name, partition_name, global_stats, last_analyzed, num_rows
from user_tab_partitions
where table_name='TEST_TAB1'
order by partition_name;

select table_name, subpartition_name, global_stats, last_analyzed, num_rows
from user_tab_subpartitions
where table_name='TEST_TAB1'
and partition_name='P_20110201'
order by subpartition_name;

select a.column_name, a.num_distinct,
   display_raw(a.low_value,b.data_type) as low_val,
   display_raw(a.high_value,b.data_type) as high_val, b.data_type
from user_tab_col_statistics a, user_tab_cols b
where a.table_name='TEST_TAB1'
and   a.table_name=b.table_name
and   a.column_name=b.column_name
order by a.column_name;

select a.column_name, a.partition_name, a.num_distinct,
   display_raw(a.low_value,b.data_type) as low_val,
   display_raw(a.high_value,b.data_type) as high_val,
   b.data_type
from user_part_col_statistics a, user_tab_cols b
where a.table_name='TEST_TAB1'
and a.table_name=b.table_name
and a.column_name=b.column_name
and a.partition_name='P_20110201'
order by a.column_name, a.partition_name;

select a.column_name, a.subpartition_name, a.num_distinct,
   display_raw(a.low_value,b.data_type) as low_val,
   display_raw(a.high_value,b.data_type) as high_val,
   b.data_type
from dba_subpart_col_statistics a, dba_tab_cols b
where a.table_name='TEST_TAB1'
and a.table_name=b.table_name
and a.column_name=b.column_name
and a.subpartition_name LIKE 'P_20110201%'
order by a.column_name, a.subpartition_name;

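-- Day 2: load rows for 20110202 (CHICAGO only).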
INSERT INTO    TEST_TAB1   VALUES   (20110202,   'CHICAGO',   1000, 'P');
INSERT INTO    TEST_TAB1   VALUES   (20110202,   'CHICAGO',   2000, 'U');
INSERT INTO    TEST_TAB1   VALUES   (20110202,   'CHICAGO',   4000, 'P');
INSERT INTO    TEST_TAB1   VALUES   (20110202,   'CHICAGO',   60000, 'N');
INSERT INTO    TEST_TAB1   VALUES   (20110202,   'CHICAGO',   100, 'P');
INSERT INTO    TEST_TAB1   VALUES   (20110202,   'CHICAGO',   10, 'N');
INSERT INTO    TEST_TAB1   VALUES   (20110202,   'CHICAGO',   300, 'N');
INSERT INTO    TEST_TAB1   VALUES   (20110202,   'CHICAGO',   5000, 'U');
INSERT INTO    TEST_TAB1   VALUES   (20110202,   'CHICAGO',   6000, 'P');
INSERT INTO    TEST_TAB1   VALUES   (20110202,   'CHICAGO',   7000, 'P');
INSERT INTO    TEST_TAB1   VALUES   (20110202,   'CHICAGO',   8000, 'P');
COMMIT;

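-- Gather again: only the newly loaded partition should need to be re-scanned, with the global
-- statistics updated from the synopses rather than a full scan of the table.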
exec dbms_stats.gather_table_stats(ownname => 'TESTUSER', tabname => 'TEST_TAB1');

select table_name, global_stats, last_analyzed, num_rows
from dba_tables
where table_name='TEST_TAB1'
and owner='TESTUSER'
order by 1, 2, 4 desc nulls last;

select table_name, global_stats, last_analyzed, num_rows
from user_tables
where table_name='TEST_TAB1';

select table_name, partition_name, global_stats, last_analyzed, num_rows
from user_tab_partitions
where table_name='TEST_TAB1'
order by partition_name;

select table_name, subpartition_name, global_stats, last_analyzed, num_rows
from user_tab_subpartitions
where table_name='TEST_TAB1'
and partition_name IN ('P_20110201', 'P_20110202')
order by subpartition_name;

select a.column_name, a.num_distinct,
   display_raw(a.low_value,b.data_type) as low_val,
   display_raw(a.high_value,b.data_type) as high_val, b.data_type
from user_tab_col_statistics a, user_tab_cols b
where a.table_name='TEST_TAB1'
and   a.table_name=b.table_name
and   a.column_name=b.column_name
order by a.column_name;

select a.column_name, a.partition_name, a.num_distinct,
   display_raw(a.low_value,b.data_type) as low_val,
   display_raw(a.high_value,b.data_type) as high_val,
   b.data_type
from user_part_col_statistics a, user_tab_cols b
where a.table_name='TEST_TAB1'
and a.table_name=b.table_name
and a.column_name=b.column_name
and a.partition_name IN ('P_20110201', 'P_20110202')
order by a.column_name, a.partition_name;

select a.column_name, a.subpartition_name, a.num_distinct,
   display_raw(a.low_value,b.data_type) as low_val,
   display_raw(a.high_value,b.data_type) as high_val,
   b.data_type
from dba_subpart_col_statistics a, dba_tab_cols b
where a.table_name='TEST_TAB1'
and a.table_name=b.table_name
and a.column_name=b.column_name
and (a.subpartition_name LIKE 'P_20110201%'
or a.subpartition_name LIKE 'P_20110202%')
order by a.column_name, a.subpartition_name;

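-- Day 3: load rows for 20110203 (DALLAS only).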
INSERT INTO TEST_TAB1 VALUES (20110203, 'DALLAS', 1000, 'P');
INSERT INTO     TEST_TAB1   VALUES   (20110203,   'DALLAS',   2000, 'U');
INSERT INTO     TEST_TAB1   VALUES   (20110203,   'DALLAS',   4000, 'P');
INSERT INTO     TEST_TAB1   VALUES   (20110203,   'DALLAS',   60000, 'N');
INSERT INTO     TEST_TAB1   VALUES   (20110203,   'DALLAS',   100, 'P');
INSERT INTO     TEST_TAB1   VALUES   (20110203,   'DALLAS',   10, 'N');
INSERT INTO     TEST_TAB1   VALUES   (20110203,   'DALLAS',   300, 'N');
INSERT INTO     TEST_TAB1   VALUES   (20110203,   'DALLAS',   5000, 'U');
INSERT INTO     TEST_TAB1   VALUES   (20110203,   'DALLAS',   6000, 'P');
INSERT INTO     TEST_TAB1   VALUES   (20110203,   'DALLAS',   7000, 'P');
INSERT INTO     TEST_TAB1   VALUES   (20110203,   'DALLAS',   8000, 'P');
COMMIT;

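-- Gather again: as before, only the changed partition should be scanned.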
exec dbms_stats.gather_table_stats(ownname => 'TESTUSER', tabname => 'TEST_TAB1');

select table_name, global_stats, last_analyzed, num_rows
from user_tables
where table_name='TEST_TAB1';

select table_name, partition_name, global_stats, last_analyzed, num_rows
from user_tab_partitions
where table_name='TEST_TAB1'
order by partition_name;

select table_name, subpartition_name, global_stats, last_analyzed, num_rows
from user_tab_subpartitions
where table_name='TEST_TAB1'
and partition_name IN ('P_20110201', 'P_20110202', 'P_20110203')
order by subpartition_name;

select a.column_name, a.num_distinct,
   display_raw(a.low_value,b.data_type) as low_val,
   display_raw(a.high_value,b.data_type) as high_val, b.data_type
from user_tab_col_statistics a, user_tab_cols b
where a.table_name='TEST_TAB1'
and   a.table_name=b.table_name
and   a.column_name=b.column_name
order by a.column_name;

select a.column_name, a.partition_name, a.num_distinct,
   display_raw(a.low_value,b.data_type) as low_val,
   display_raw(a.high_value,b.data_type) as high_val,
   b.data_type
from user_part_col_statistics a, user_tab_cols b
where a.table_name='TEST_TAB1'
and a.table_name=b.table_name
and a.column_name=b.column_name
and a.partition_name IN ('P_20110201', 'P_20110202', 'P_20110203')
order by a.column_name, a.partition_name;

select a.column_name, a.subpartition_name, a.num_distinct,
   display_raw(a.low_value,b.data_type) as low_val,
   display_raw(a.high_value,b.data_type) as high_val,
   b.data_type
from dba_subpart_col_statistics a, dba_tab_cols b
where a.table_name='TEST_TAB1'
and a.table_name=b.table_name
and a.column_name=b.column_name
and (a.subpartition_name LIKE 'P_20110201%'
or a.subpartition_name LIKE 'P_20110202%'
or a.subpartition_name LIKE 'P_20110203%')
order by a.column_name, a.subpartition_name;

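-- Day 4: load rows for 20110204 (CHICAGO), including new low (1) and high (80000) values.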
INSERT   INTO   TEST_TAB1   VALUES   (20110204,   'CHICAGO',   1000, 'P');
INSERT   INTO   TEST_TAB1   VALUES   (20110204,   'CHICAGO',   2000, 'U');
INSERT   INTO   TEST_TAB1   VALUES   (20110204,   'CHICAGO',   4000, 'P');
INSERT   INTO   TEST_TAB1   VALUES   (20110204,   'CHICAGO',   400, 'N');
INSERT   INTO   TEST_TAB1   VALUES   (20110204,   'CHICAGO',   100, 'P');
INSERT   INTO   TEST_TAB1   VALUES   (20110204,   'CHICAGO',   10, 'Z');
INSERT   INTO   TEST_TAB1   VALUES   (20110204,   'CHICAGO',   300, 'Z');
INSERT   INTO   TEST_TAB1   VALUES   (20110204,   'CHICAGO',   5000, 'U');
INSERT   INTO   TEST_TAB1   VALUES   (20110204,   'CHICAGO',   6000, 'P');
INSERT INTO TEST_TAB1 VALUES (20110204, 'CHICAGO', 1, 'P');
INSERT INTO TEST_TAB1 VALUES (20110204, 'CHICAGO', 80000, 'P');
COMMIT;

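-- Trace the gather with event 10046 (level 12) to confirm which partitions are actually scanned.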
alter session set tracefile_identifier='INCREMENTAL';
alter session set events='10046 trace name context forever, level 12';

exec dbms_stats.gather_table_stats(ownname => 'TESTUSER', tabname => 'TEST_TAB1');

alter session set events='10046 trace name context off';

select table_name, global_stats, last_analyzed, num_rows
from user_tables
where table_name='TEST_TAB1';

select table_name, partition_name, global_stats, last_analyzed, num_rows
from user_tab_partitions
where table_name='TEST_TAB1'
order by partition_name;

select table_name, subpartition_name, global_stats, last_analyzed, num_rows
from user_tab_subpartitions
where table_name='TEST_TAB1'
and   partition_name IN ('P_20110201', 'P_20110202', 'P_20110203', 'P_20110204')
order by subpartition_name;

select a.column_name, a.num_distinct,
   display_raw(a.low_value,b.data_type) as low_val,
   display_raw(a.high_value,b.data_type) as high_val, b.data_type
from user_tab_col_statistics a, user_tab_cols b
where a.table_name='TEST_TAB1'
and   a.table_name=b.table_name
and   a.column_name=b.column_name
order by a.column_name;

select a.column_name, a.partition_name, a.num_distinct,
   display_raw(a.low_value,b.data_type) as low_val,
   display_raw(a.high_value,b.data_type) as high_val,
   b.data_type
from user_part_col_statistics a, user_tab_cols b
where a.table_name='TEST_TAB1'
and a.table_name=b.table_name
and a.column_name=b.column_name
and a.partition_name IN ('P_20110201', 'P_20110202', 'P_20110203', 'P_20110204')
order by a.column_name, a.partition_name;

select a.column_name, a.subpartition_name, a.num_distinct,
   display_raw(a.low_value,b.data_type) as low_val,
   display_raw(a.high_value,b.data_type) as high_val,
   b.data_type
from dba_subpart_col_statistics a, dba_tab_cols b
where a.table_name='TEST_TAB1'
and a.table_name=b.table_name
and a.column_name=b.column_name
and (a.subpartition_name LIKE 'P_20110201%'
or a.subpartition_name LIKE 'P_20110202%'
or a.subpartition_name LIKE 'P_20110203%'
or a.subpartition_name LIKE 'P_20110204%')
order by a.column_name, a.subpartition_name;

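-- Cross-check the actual data distribution against the statistics reported above.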
select   reporting_date, count(*) from test_tab1 group by reporting_date;
select   source_system, count(*) from test_tab1 group by source_system;
select   status, count(*) from test_tab1 group by status;
select   seq_id, count(*) from test_tab1 group by seq_id;

SPOOL OFF
