How to understand Oracle statistics 07/13 Update SLTechnology News&Howtos

How to understand Oracle statistics

2025-07-13 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Database >

Shulou(Shulou.com)05/31 Report--

This article mainly introduces "how to understand Oracle statistics". In daily operation, I believe many people have doubts about how to understand Oracle statistics. The editor consulted all kinds of data and sorted out simple and easy-to-use methods of operation. I hope it will be helpful for you to answer the doubts of "how to understand Oracle statistics"! Next, please follow the editor to study!

Commonly used script DBMS_STATS.GATHER_TABLE_STATS parameters for statistics

Dbms_stats.gather_table_stats (

Owner VARCHAR2

Tablename VARCHAR2

Partname VARCHAR2

Estimate_percent NUMBER

Block_sample BOOLEAN

Method_opt VARCHAR2

Degree NUMBER

Granularity VARCHAR2

Cascade BOOLEAN

Stattab VARCHAR2

Statid VARCHAR2

Statown VARCHAR2

No_invalidate BOOLEAN

Force BOOLEAN

)

Parameter description

1.owner: the owner of the table to analyze

2.tablename: the table name of the table to be analyzed

3.partname: partition name

4.estimate_percent: the percentage of sampled rows, from 0.000001-100 zero to full analysis, no sampling. The constant DBMS_STATS.AUTO_SAMPLE_SIZE is the default value, and the best sampling rate is determined by Oracle.

5.block_sample: whether to use block sampling instead of row sampling.

6.method_opt: determines how histograms information is counted. The value of method_opt is as follows:

For all columns: count all histograms

For all indexed columns: count the histograms of all index columns

For all hidden coloumns: the histograms of the statistical hidden column

For columns SIZE | REPEAT | AUTO | the value range of histograms,N for the specified column in SKEWONLY statistics is 0-254.

7.degree: sets the parallelism of statistics collection. The default value is null.

8.cascade: collects statistics of the index. Default is false.

9.stattab: specifies the table in which statistics are stored.

10.statid: if statistics for multiple tables are stored in one stattab, statid is used as a partitioning condition.

11.statown: stores all the books of the statistics table.

If you do not specify the above three parameters, the statistics are updated to the data dictionary.

12.force: collect statistics even if the table is locked.

Non-partition table:

BEGIN

DBMS_STATS.GATHER_TABLE_STATS (ownname = > 'SCOTT'

Tabname = > 'DEPT'

Estimate_percent = > 30

Method_opt = > 'for all columns size repeat'

No_invalidate = > FALSE

Degree = > 8

Cascade = > TRUE)

END

Collect statistics on partition tables

BEGIN

DBMS_STATS.GATHER_TABLE_STATS (ownname = > 'ROBINSON'

Tabname = > 'playing test'

Estimate_percent = > 30

Method_opt = > 'for all columns size repeat'

No_invalidate = > FALSE

Degree = > 8

Granularity = > 'ALL'

Cascade = > TRUE)

END

Note: the statistics of partitions are merged into DBA_TABLES

Explanation of parameters in script

Estimate_percent

Indicates the sampling rate, which is too high and unnecessary to use the dbms_stats.auto_sample_size option to allow Oracle to automatically estimate the optimal percentage of a segment to sample.

If the table is very large, the high sampling rate will lead to a long run to collect statistics and increase the pressure on the system.

If the sampling rate is set too small, the statistical information can not fully reflect the distribution of the data in the table, so CBO is likely to choose the wrong execution plan.

Based on work experience:

If the sampling rate of the table is less than 1GB, you can set 50% Mutual 100%.

Table greater than 1GB less than 5GB can be set to 30%

Tables larger than 5GB should be partitioned, and the sampling rate can be set to 30%.

Method_opt consists of two parts.

Represents the collection method, and the parameters are divided into two parts

This part for all [indexed | hidden] columns "

Controls which columns will collect basic statistics (minimum and maximum values on the target column, number of different values on the column, number of null values, etc.). The system defaults to for all columns, which collects basic statistics for all columns on the table, including hidden columns. In addition, its other optional values are as follows:

FOR ALL INDEXED COLUMNS

Specifies that only fields that contain indexes can collect basic statistics for columns. This option value is generally not recommended because fields used in all sql statements in a database environment, such as fields after select, fields behind where, and fields in group by, do not just refer to fields that contain indexes.

FOR ALL HIDDEN COLUMNS

Specify all invisible fields in the table to collect the basic statistics of the columns, that is, do not collect the statistics of the actually visible columns on the table. This option value is also generally not recommended. This option is usually used only in this case. Add one or more invisible or virtual columns to a table where all column statistics are accurate. You only need to collect statistics for this or these invisible columns, instead of repeating statistics for other columns, then use the for all hidden columns option.

Part II "Size [size_clause]"

Controls how histograms are collected, with the following options available after size

AUTO Oracle decides which columns need to collect histograms based on column statistics (sys.col_usage$) and column data skew (uniform distribution).

Integer specifies the number of barrels to collect histograms, with a minimum of 1 and a maximum of 254 (for 11g and earlier versions, there is no limit after 12c). Note that if the number of barrels is 1, size 1 means that the histogram is not created, and if the column already has a histogram, the histogram of that column will be deleted.

REPEAT only recollects histograms on columns that already have histograms. Repeat ensures that the histogram is re-collected at the global level for columns that already have histograms. This option is generally not recommended because the number of barrels used in the new histogram will not exceed the number of barrels in the old histogram. Assuming that the number of buckets in the current histogram is 5, when using size repeat to re-collect the histogram, the number of buckets used in the new histogram will not exceed 5, which may not achieve good results.

SKEWONLY collects histograms only on columns where data is unevenly distributed. Let ORACLE decide for himself whether the column collects the histogram. As long as the column is skewed, ORACLE will collect the histogram. The OLTP system uses this very tricky. Basically all columns have to collect the histogram.

If for all columns size auto, the default parameter of method_opt, is not applicable to your data environment, you may encounter the following two situations:

1. Create a histogram on a column other than the one specified

two。 Create a histogram only on the specified column

When collecting statistics, a stable system is recommended to use method_opt= > 'for all columns size repeat',repeat to indicate that histograms have been collected before, and histograms are now collected when statistics are collected. If histograms have not been collected before, statistics are not collected now.

Sometimes when collecting statistics, using method_opt = > 'for all columns size auto', is likely to take care of the current sql, but it makes other sql tragic, because auto means that Oracle filters information based on predicates (where condition filtering mentioned earlier in the histogram) to automatically determine whether the column collects histograms. A stable system should not let Oracle automatically judge, automatic judgment is likely to have an accident, such as a certain column should not collect histograms, after setting auto, it collects histograms on its own, resulting in system instability.

Options

Controls how Oracle statistics are refreshed:

Gather: reanalyze the entire architecture

Gather empty: analyze only tables that do not have statistics yet

Gather stale: only reanalyze tables with more than 10% modifications (including inserts, updates, and deletions)

Gather auto: re-analyze objects that currently have no statistics, and objects whose statistics are out of date (dirty). Using gather auto is similar to using gather stale and gather empty in combination

Degree

Indicates the degree of parallelism when collecting statistics, which is set according to your system configuration and current system available resources. The general degree setting is 4Mui Mui 8. Generally speaking, a CPU can open 2 threads.

DEGREE equals show parameter cpu.

If you use parallel 8 to collect statistics, it is very likely to open 17 processes, one process as the main process to coordinate the other 16 parallel processes, 8 processes to read data and 8 processes to perform CPU operations for analysis.

Cascade

Indicates that the statistics of the index are collected as well as the statistics of the table. In fact, collecting statistics on the index is very tricky, because the statistics collected by the index are read in a single block.

No_invalidate

Indicates whether SQL cursors that reference related tables in the shared pool are invalid after collecting statistics. This must be set to FALSE, the default is TRUE, otherwise you may find that when you do SQL optimization, you may find that the statistics have been updated, but the implementation plan has not changed.

Granularity

Statistics collection, 'ALL'-collect all (sub-partition, partition and global) statistics

① ALL: collect granularity statistics such as Global, partition, subpartition, etc.

② AUTO: the granularity of statistical information collection is determined by Oracle according to the partition type.

③ PARTITION: only partition granularity statistics are collected.

④ SUBPARTITION: only collect subpartition granularity statistics

Partname

A partition name of the partition table

Determine whether the statistics of a table are out of date script

Exec dbms_stats.flush_database_monitoring_info

-- Refresh sys.col_usage$ and View: sys.DBA_TAB_MODIFICATIONS

Select owner, table_name name, object_type, stale_stats, last_analyzed

From dba_tab_statistics

Where table_name in (table_name)

And owner = 'OWNER_NAME'

And (stale_stats = 'YES' or last_analyzed is null)

Experiment 1 to see if the statistics are out of date

1. Create an experimental table

CREATE TABLE TEST AS SELECT * FROM DBA_OBJECTS

two。 Collect statistics

BEGIN

DBMS_STATS.GATHER_TABLE_STATS (ownname = > 'SCOTT'

Tabname = > 'TEST'

Estimate_percent = > 100

Method_opt = > 'for all columns size auto'

No_invalidate = > FALSE

Degree = > 1

Cascade = > TRUE)

END

3. Refresh

Exec dbms_stats.flush_database_monitoring_info

4. Check whether the statistical information of the test table is out of date. A blank line indicates that it has not expired.

Select owner, table_name name, object_type, stale_stats, last_analyzed from dba_tab_statistics where table_name in ('TEST') and owner =' SCOTT' and (stale_stats = 'YES' or last_analyzed is null)

-result blank line-

5. Delete 20% of the data and let the statistics expire

Select count (*) from test

Delete from test where rownum 'TEST'

Estimate_percent = > 30

Method_opt = > 'for all columns size auto'

No_invalidate = > FALSE

Degree = > 1

Cascade = > TRUE)

END

3 View statistics

Select a.column_name

B.num_rows

A.num_distinct Cardinality

Round (a.num_distinct / b.num_rows * 100,2) selectivity

A.histogram

A.num_buckets

From dba_tab_col_statistics a, dba_tables b

Where a.owner = b.owner

And a.table_name = b.table_name

And a.owner = 'SCOTT'

And a.table_name = 'TEST'

From the above results, we find that HISTOGRAM returns that none does not have histogram information because we do not have a select query

4 then we execute the select query

SELECT COUNT (*) FROM TEST WHERE OWNER='SCOTT'

Collect statistics again as in step 2 and check again to see if the histogram is collected as in step 3

If you find that there are where conditions, you can collect histograms

The method of experimental size repeat

1 create a new experimental table

Create table test as select * from dba_objects

two。 Collect statistics, the size repeat we use here only recollects histograms on columns that already have histograms. Repeat ensures that the histogram is re-collected at the global level for columns that already have histograms.

BEGIN

DBMS_STATS.GATHER_TABLE_STATS (ownname = > 'SCOTT'

Tabname = > 'TEST'

Estimate_percent = > 30

Method_opt = > 'for all columns size repeat'

No_invalidate = > FALSE

Degree = > 1

Cascade = > TRUE)

END

3 View statistics

Select a.column_name

B.num_rows

A.num_distinct Cardinality

Round (a.num_distinct / b.num_rows * 100,2) selectivity

A.histogram

A.num_buckets

From dba_tab_col_statistics a, dba_tables b

Where a.owner = b.owner

And a.table_name = b.table_name

And a.owner = 'SCOTT'

And a.table_name = 'TEST'

There is no histogram information.

The experiment collects histograms from a column (the owner column of the test table)

BEGIN

DBMS_STATS.GATHER_TABLE_STATS (ownname = > 'SCOTT'

Tabname = > 'TEST'

Estimate_percent = > 30

Method_opt = > 'for owner columns size skewonly'

No_invalidate = > FALSE

Degree = > 1

Cascade = > TRUE)

END

Here for all is changed to for owner.

At this point, the study on "how to understand Oracle statistics" is over. I hope to be able to solve your doubts. The collocation of theory and practice can better help you learn, go and try it! If you want to continue to learn more related knowledge, please continue to follow the website, the editor will continue to work hard to bring you more practical articles!

Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.

*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.