Welcome, Guest

Oracle SQL Tuning: Optimizer Statistics

Options

Previous Topic

Next Topic

	codeling 1595 - 6639
	@2016-02-05 20:59:08

Statistics used by the optimizer when generating the execution plan are stored in the data dictionary. Statistics about physical storage characteristics and data distribution in the various schema objects are generated by using the DBMS_STATS package.

To get the right execution plans, statistics should be maintained not just when the database is created and after the initial load of data but on a regular basis based on the frequency in which the data volume changes. If no statistics are available or the statistics are stale, the execution plans generated by the optimizer may not be efficient enough. Under these conditions, the optimizer performs dynamic sampling depending on the setting of the OPTIMIZER_DYNAMIC_SAMPLE parameter. This sampling may cause slower parse times; so for best performance, the optimizer should have representative optimizer statistics.

Statistics at the table level include number of rows, number of blocks, and row length. Column statistics such as number of distinct values, number of nulls in columns, data distribution, or histograms. Index statistics include number of leaf blocks, levels, and clustering factors. System statistics include I/O performance and CPU performance. All of these statistics are important and change with the volume of data changes or if the structure of the objects changes. The statistics collected are then maintained in the data dictionary tables of the database.

Starting with Oracle database 10g, Oracle automatically starts gathering statistics once every day. Although Oracle recommends using the automated process, in certain cases, it may be required to use homegrown procedures to collect statistics, either because such a frequent collection interval may not be required or the automated collection process maybe scheduled with conflicts with other processes.

The following procedure could be used to verify if the automatic gathering of statistics has been enabled:

SELECT program_action,

number_of_arguments NOA,

enabled

FROM dba_scheduler_programs

WHERE owner = 'SYS'

AND program_name LIKE 'GATHER%';

PROGRAM_ACTION                                NOA        ENABL

--------------------------------------------- ---------- -----

dbms_stats.gather_database_stats_job_proc     0          TRUE

DBA_AUTOTASK_TASK displays all jobs that have been scheduled for automatic execution. As seen from the query output following, the GATHER_STATS_PROG is enabled for automatic statistics collection:

SQL> SELECT TASK_NAME, STATUS FROM DBA_AUTOTASK_TASK;

TASK_NAME                                      STATUS

---------------------------------------------- --------

AUTO_SQL_TUNING_PROG                           ENABLED

auto_space_advisor_prog                        ENABLED

gather_stats_prog                              ENABLED

The GATHER_DATABASE_STATS_JOB_PROC prioritizes database objects that require statistics so that objects that most need updated statistics are processed first, before the maintenance window closes.

Statistics can also be gathered manually using the DBMS_STATS package. It is used to gather statistics on tables and indexes and individual columns and partitions of tables. However, it does not gather cluster statistics. This is accomplished by collecting statistics on the individual tables instead of the whole cluster. DBMS_STATS collects statistics at the following levels:

GATHER_INDEX_STATS
GATHER_TABLE_STATS
GATHER_SCHEMA_STATS
GATHER_DICTIONARY_STATS
GATHER_DATABASE_STATS

When gathering statistics on a table, DBMS_STATS gathers information about the data distribution of the columns within the table. The most basic information about the data distribution, such as the maximum and minimum values for the column, may not be sufficient for the optimizer’s needs if the data within the column is skewed. For skewed data distributions, histograms will have to be created as part of the column statistics to describe the data distribution of a given column. This is particularly helpful for data warehouse implementations where skewed data is normally present, and histograms help the optimizer to generate efficient execution plans. Histograms provide improved selectivity estimates in the presence of data skew, resulting in optimal execution plans with nonuniform data distributions.

@2016-02-05 21:48:16

OPTIMIZER_DYNAMIC_SAMPLING

Dynamic sampling is controlled by the parameter OPTIMIZER_DYNAMIC_SAMPLING, which can be set to different levels (0-10). These levels control two different things; when dynamic sampling kicks in and how large a sample size will be used to gather the statistics. The greater the sample size the bigger impact DS has on the compilation time of a query.

Level	When Dynamic Sampling will be used	Sample size (blocks)
0	Switches off dynamic sampling	N/A
1	At least one non-partitioned table in the statement has no statistics	32
2 (default)	One or more tables in the statement have no statistics	64
3	Any statement that meets level 2 criteria and any statement that has one or more expressions used in the where clause predicates e.g. Where substr(CUSTLASTNAME,1,3) or Where a + b =5	64
4	Any statement that meets level 3 criteria and any statement that has complex predicates. An OR or AND operator between multiple predicates on the same table	64
5	Any statement that meets level 4 criteria	128
6	Any statement that meets level 4 criteria	256
7	Any statement that meets level 4 criteria	512
8	Any statement that meets level 4 criteria	1024
9	Any statement that meets level 4 criteria	4086
10	All statements	All Blocks

To show the value of the parameter, use the following SQL statement or command:

SELECT NAME, VALUE
FROM v$parameter
where name = 'optimizer_dynamic_sampling';

show parameter optimizer_dynamic_sampling.


NAME                         TYPE        VALUE 
---------------------------- ----------- -------
optimizer_dynamic_sampling   integer     2

To change the default value, use the following statement:

Alter session set optimizer_dynamic_sampling=4;

By setting OPTIMIZER_DYNAMIC_SAMPLING to level 4, the optimizer will use dynamic sampling to gather additional information about the complex predicate expression. The additional information provided by dynamic sampling allows the optimizer to generate a more accurate cardinality estimate and therefore a better performing execution plan.

Previous Topic

Next Topic

Comments

You must Sign In to comment on this topic.

Print this topic

Watch this topic

Normal

Threaded