Statistics used by the optimizer when generating the execution plan are stored in the data dictionary. Statistics about physical storage characteristics and data distribution in the various schema objects are generated by using the DBMS_STATS package.
To get the right execution plans, statistics should be maintained not just when the database is created and after the initial load of data but on a regular basis based on the frequency in which the data volume changes. If no statistics are available or the statistics are stale, the execution plans generated by the optimizer may not be efficient enough. Under these conditions, the optimizer performs dynamic sampling depending on the setting of the OPTIMIZER_DYNAMIC_SAMPLE parameter. This sampling may cause slower parse times; so for best performance, the optimizer should have representative optimizer statistics.
Statistics at the table level include number of rows, number of blocks, and row length. Column statistics such as number of distinct values, number of nulls in columns, data distribution, or histograms. Index statistics include number of leaf blocks, levels, and clustering factors. System statistics include I/O performance and CPU performance. All of these statistics are important and change with the volume of data changes or if the structure of the objects changes. The statistics collected are then maintained in the data dictionary tables of the database.
Starting with Oracle database 10g, Oracle automatically starts gathering statistics once every day. Although Oracle recommends using the automated process, in certain cases, it may be required to use homegrown procedures to collect statistics, either because such a frequent collection interval may not be required or the automated collection process maybe scheduled with conflicts with other processes.
The following procedure could be used to verify if the automatic gathering of statistics has been enabled:
WHERE owner = 'SYS'
AND program_name LIKE 'GATHER%';
PROGRAM_ACTION NOA ENABL
--------------------------------------------- ---------- -----
dbms_stats.gather_database_stats_job_proc 0 TRUE
DBA_AUTOTASK_TASK displays all jobs that have been scheduled for automatic execution. As seen from the query output following, the GATHER_STATS_PROG is enabled for automatic statistics collection:
SQL> SELECT TASK_NAME, STATUS FROM DBA_AUTOTASK_TASK;
The GATHER_DATABASE_STATS_JOB_PROC prioritizes database objects that require statistics so that objects that most need updated statistics are processed first, before the maintenance window closes.
Statistics can also be gathered manually using the DBMS_STATS package. It is used to gather statistics on tables and indexes and individual columns and partitions of tables. However, it does not gather cluster statistics. This is accomplished by collecting statistics on the individual tables instead of the whole cluster. DBMS_STATS collects statistics at the following levels:
When gathering statistics on a table, DBMS_STATS gathers information about the data distribution of the columns within the table. The most basic information about the data distribution, such as the maximum and minimum values for the column, may not be sufficient for the optimizer’s needs if the data within the column is skewed. For skewed data distributions, histograms will have to be created as part of the column statistics to describe the data distribution of a given column. This is particularly helpful for data warehouse implementations where skewed data is normally present, and histograms help the optimizer to generate efficient execution plans. Histograms provide improved selectivity estimates in the presence of data skew, resulting in optimal execution plans with nonuniform data distributions.
Dynamic sampling is controlled by the parameter OPTIMIZER_DYNAMIC_SAMPLING, which can be set to different levels (0-10). These levels control two different things; when dynamic sampling kicks in and how large a sample size will be used to gather the statistics. The greater the sample size the bigger impact DS has on the compilation time of a query.
To show the value of the parameter, use the following SQL statement or command:
SELECT NAME, VALUE
where name = 'optimizer_dynamic_sampling';
show parameter optimizer_dynamic_sampling.
NAME TYPE VALUE
---------------------------- ----------- -------
optimizer_dynamic_sampling integer 2
To change the default value, use the following statement:
Alter session set optimizer_dynamic_sampling=4;
By setting OPTIMIZER_DYNAMIC_SAMPLING to level 4, the optimizer will use dynamic sampling to gather additional information about the complex predicate expression. The additional information provided by dynamic sampling allows the optimizer to generate a more accurate cardinality estimate and therefore a better performing execution plan.
© 2020 Digcode.com. All rights reserved.