ORACLE collects statistics 07/02 Update SLTechnology News&Howtos

ORACLE collects statistics

2025-07-02 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Servers >

Shulou(Shulou.com)06/02 Report--

Today, I saw an article about collecting statistical information on the Internet, which is not bad, so I specially reprint it. 1. Understand what statistics are

Optimizer statistics are a collection that describes the database and database objects in more detail and is used to query the optimizer to select the best execution plan for each SQL statement. Optimizer statistics include: table statistics

O number of rows

O Block number o average line length

Statistics for the column

Number of different values in the o column

Number of null in column o Statistics of data distribution (histogram / histogram) index

O number of leaf blocks

O height of the index

O Statistical information of clustering factor (clustering factor) system

O CPU performance and optimizer statistics are stored in the following data dictionaries

DBA_TABLES DBA_OBJECT_TABLES DBA_TAB_STATISTICS DBA_TAB_COL_STATISTICS DBA_TAB_HISTOGRAMS DBA_INDEXES DBA_IND_STATISTICS DBA_CLUSTERS DBA_TAB_PARTITIONS DBA_TAB_SUBPARTITIONS DBA_IND_PARTITIONS DBA_IND_SUBPARTITIONS DBA_PART_COL_STATISTICS DBA_PART_HISTOGRAMS DBA_SUBPART_COL_STATISTICS DBA_SUBPART_HISTOGRAMS INDEX_STATS storage ANALYZE.. VALIDATE STRUCTURE statistics AUX_STATS$ storage CPU system The count information X$KCFIO stores Istroke O statistics because the objects in the database change frequently. So statistics must be updated regularly to describe these database objects more accurately. Statistics are maintained automatically by ORACLE by default, but we can also collect statistics manually using the DBMS_STATS package. The DBMS_STATS package also provides procedures to maintain statistics. For a more detailed description of the DBMS_STATS package, see the PL/SQL Packages and Types Reference section of the official documentation.

two。 Automatically collect statistics

In Oracle10g, when you install Oracle, a job named GATHER_STATS_JOB is created by default to automatically collect optimizer statistics. This job collects statistics for all objects in the database. By default, the job collects statistics from 10:00 to 6am from Monday to Friday and throughout the weekend. You can view the DBA_SCHEDULER_JOBS, DBA_SCHEDULER_PROGRAMS,DBA_SCHEDULER_WINDOWS,DBA_SCHEDULER_JOB_RUN_DETAILS and other views to see the JOB settings and operation information. Automatic collection of expired statistics depends on table monitoring features. Table monitoring is enabled by default in Oracle10g, and it also depends on the value of STATISTICS_LEVEL parameter. In 10g, it defaults to typical. Only by setting the STATISTICS_LEVEL parameter to ALL or TYPICAL can ORACLE identify expired statistics. 3. Turn off automatic collection of statistics

In some cases, we want to turn off automatic collection of statistics, so we can use the following methods: BEGIN DBMS_SCHEDULER.DISABLE ('GATHER_STATS_JOB'); END; / 4. When to collect statistics manually

Sometimes automatic collection of statistics is not appropriate because automatic collection of statistics runs at midnight, but because the object is modified during the day, the statistics become obsolete There are two kinds of such objects: tables that are often delete during the day, or tables that are rebuild after truncated (tables that change frequently) tables whose data is changed by 10% or more after batch operations (tables processed in batches). For frequently changed tables, you can set their statistics to null. When ORACLE encounters a table with no statistics, ORACLE will dynamically sample to collect the necessary statistics for the query optimizer. Dynamic sampling is controlled by the parameter optimizer_dynamic_sampling, and its default value is 2. At the same time, optimizer_mode can also control dynamic sampling, which can be set to all. Taking the DEPT table under the SCOTT user as an example, the method to set the statistics of a table to null is as follows: BEGIN DBMS_STATS.DELETE_TABLE_STATS ('SCOTT','DEPT'); DBMS_STATS.LOCK_TABLE_STATS (' SCOTT','DEPT'); END We can also collect statistics when the table is typical and representative, and lock its statistics, because the statistics automatically collected at night may not be suitable for the load during the day, and the typical statistics are representative, so adopting lock's typical statistics at this time can better enable CBO to choose a better implementation plan. As for which of the above two methods to use, this also needs to be analyzed according to the actual situation of the business.

For tables processed in bulk, statistics should be collected as soon as the batch processing is complete, and the script that collects statistics can be bound to the script for batch processing.

For external tables, statistics can only be collected through the gather_table_stats process, and external tables do not support sampling, so you need to set estimate_percent in gather_table_stats to null. Statistics for the system also need to be collected manually because they are not collected automatically.

For fixed objects, such as dynamic performance tables, you need to manually execute the gather_fixed_objects_stats process to collect. Fixed objects reflect the activity of the current database. When database activity is representative, such statistics should be collected. 5. Lock / unlock statistics LOCK_SCHEMA_STATS LOCK_TABLE_STATS UNLOCK_SCHEMA_STATS UNLOCK_TABLE_STATS 6. Collect statistics manually

If you choose to collect statistics manually, you need to manually collect statistics for all users, including system users. If the data in your database changes regularly, you can collect statistics regularly so that the statistics can accurately reflect the characteristics of the objects in the database.

You can use the DBMS_STATS package to collect statistics for tables, indexes, columns, and partitioned tables. DBMS_STATS cannot collect statistics for CLUSTER, but you can collect statistics for a single table instead of collecting statistics for the entire CLUSTER. When you collect statistics for tables, columns, and indexes, if ORACLE finds that the object has collected statistics in the data dictionary, then ORACLE will update the existing statistics, the old statistics will be saved, and you can restore the old statistics if you like. You can use DBMS_STATS.GATHER_DICTIONARY_STATS to collect statistics for system users, which collects statistics for all system users, including SYS and SYSTEM, as well as other users, such as CTXSYS,DRSYS. When the statistics of the database object are updated, ORACLE will invalidate the parsed SQL statement, when the SQL statement is run again, ORACLE will reparse the SQL, and the optimizer will automatically select a new execution plan based on the new statistics. For distributed databases, it will not be invalidated. The process of collecting statistics

O GATHER_INDEX_STATS-collect index statistics o GATHER_TABLE_STATS-collect table, column, index statistics o GATHER_SCHEMA_STATS-collect schema all object statistics o GATHER_DICTIONARY_STATS-collect all system user statistics o GATHER_DATABASE_STATS-collect database all object statistics We use the above process to collect statistics there are several parameters that need to be concerned

O sampling

O parallel

O Subarea

O column statistics and histogram / histogram

O expired statistics

O Custom statistics

We can use sampling to evaluate statistics during the operation of collecting statistics. Sampling is a very important technique for collecting statistical information. If you do not use sampling when collecting statistics, you need to do a full table scan of the table and sort the entire table. By sampling, you can reduce the resources required to collect the necessary statistics.

The parameter that controls sampling is ESTIMATE_PERCENT, and the sampling parameter can be set to any value (within the range, of course), but ORACLE recommends setting ESTIMATE_PERCENT to DBMS_STATS.AUTO_SAMPLE_SIZE. AUTO_SAMPLE_SILE can let ORACLE decide the best sample value for itself, because different types of statistics (table,index,column) have different requirements. Example of sampling: EXEC DBMS_STATS.GATHER_SCHEMA_STATS ('SCOTT',DBMS_STATS.AUTO_SAMPLE_SIZE); when the ESTIMATE_PERCENT parameter is specified manually, if the manually specified parameter is too small to collect enough information, then DBMS_STATS may automatically increase the value of ESTIMATE_PERCENT to ensure that sufficient statistics are collected. We can collect statistics either serially or in parallel. The parameter DEGREE controls whether the DBMS_STATS uses the parallel feature. ORACLE recommends setting the DEGREE parameter to DBMS_STATS.AUTO_DEGREE. After this setting, ORACLE can determine an appropriate degree of parallelism and collect statistics based on the SIZE of OBJECT and the init parameters related to parallelism. Note: cluster index,domain index,bitmap join index cannot use the parallel feature. For partitioned tables and partitioned indexes, DBMS_STATS can collect partition statistics individually or for the entire table / index. For combined partitions, DBMS_STATS can also collect statistics for subpartitions, partitions, and entire tables / indexes. The parameter GRANULARITY controls the collection of partition statistics. Because partition statistics, global statistics are very important for most systems, ORACLE recommends setting it to AUTO to collect partition and global statistics. When collecting statistics on a table, DBMS_STATS collects data distribution information for the columns. The most basic statistics of data distribution are the maximum and minimum values of this column. If this column is skewed, the optimizer cannot make an accurate execution plan based solely on the maximum and minimum values of the column. For skewed data distribution, we can collect histogram / histogram statistics for columns, which allows the optimizer to make a more accurate execution plan. The parameter METHOD_OPT controls the collection of histograms. ORACLE recommends setting METHOD_OPT to FOR ALL COLUMNS SIZE AUTO. After this setting, ORACLE will automatically determine which column needs to collect the bar chart, and automatically set the bucket of the bar chart. You can also manually set which column needs to collect the bar chart, as well as the bucket of the bar chart. To know whether the statistics are out of date, ORACLE provides table monitoring capabilities. Set the init parameter STATISTICS_LEVEL to ALL or TYPICAL (default) to enable table monitoring (10g no longer requires alter table monitor). The table monitoring function tracks the insert,update,delete,truncate, operations of the table, and records it in the DBA_TAB_MODIFICATIONS view. When we query the DBA_TAB_MODIFICATIONS view, we may not be able to query the results, or the results of the query are not accurate, so we need to use the DBMS_STATS.FLUSH_DATABASE_MONITORING_INFO process to refresh the information in memory to the view. Setting the OPTIONS parameter to GATHER STALE or GATHER AUTO causes DBMS_STATS to determine whether the statistics of the table are out of date (note that this parameter is not available in GATHER_TABLE_STATS, only in the GATHER_DATABASE_STATS,GATHER_SCHEMA_STATS procedure). Whether the statistical information of the table is out of date is based on whether more than 10% of the data has been modified. If so, ORACLE thinks that the previous statistical information is out of date, and ORACLE will collect the statistical information again. After we have created the functional index, we need to collect statistics for the column, at which point we need to set the parameter METHOD_OPT to FOR ALL HIDDEN COLUMNS. 7. Strategies for collecting statistics

Normally, we will turn off the automatic collection of statistics in ORACLE, and we will collect statistics from the database manually. The strategy for collecting statistics needs to be determined according to the system. Here are a few common situations: if the data of the tables in your system is increasing incrementally (regularly), that is, you hardly do any batch processing operations, such as batch deletions and bulk loads. It is very simple to collect statistics for such a table. You can observe the changes in the table by looking at the DBA_TAB_MODIFICATIONS view, see if the change in the amount of data in the table is more than 10%, and record the number of days. In this way, you can collect statistics from it at such intervals. You can use CRONTAB or JOB to call GATHER_SCHEMA_STATS or GATHER_TABLE_STATS procedures to collect statistics. For tables that often operate in bulk, the statistics of the table must be collected after the batch operation.

For partitioned tables, usually only one partition is modified, in which case only individual partition statistics can be collected, but it is necessary to collect statistics for the entire table.

Finally, I'll give you two scripts to determine whether the table needs to collect statistics.

8. Some examples of collecting statistics

Example 1 collects statistics BEGIN for tables

DBMS_STATS.GATHER_TABLE_STATS (ownname = > 'SCOTT'

Tabname = > 'DEPT'

Estimate_percent = > DBMS_STATS.AUTO_SAMPLE_SIZE

Method_opt = > 'for all columns size repeat'

Degree = > DBMS_STATS.AUTO_DEGREE

Cascade= > TRUE

);

END

/ the above example collects statistics for the SCOTT.DEPT table. One of the parameters worth watching is method_opt. This parameter controls whether the histogram information for the column is collected. Usually, histograms are not collected, and histograms are not clear in a few words. Its four options method_opt= > 'for all columns size skewonly' ORACLE collect histograms based on data distribution

Method_opt= > 'for all columns size repeat' collects histogram information only if it has been collected before, so we usually set method_opt to repeat

Method_opt= > 'for all columns size auto' ORACLE determines whether to collect histograms based on data distribution and column workload

Method_opt= > 'for all columns size interger' We specify a bucket value ourselves

Example 2 collects statistics for a schma

BEGIN

DBMS_STATS.GATHER_SCHEMA_STATS (ownname = > 'SCOTT'

Estimate_percent = > DBMS_STATS.AUTO_SAMPLE_SIZE

Ptions = > 'gather auto'

Degree = > DBMS_STATS.AUTO_DEGREE

Method_opt = > 'for all columns size repeat'

Cascade = > TRUE

);

END

The above example collects statistics for all objects in SCOTT mode. One of the parameters worth noting is options. As mentioned earlier, he has something to do with watch monitoring. It has four options.

Options = > 'gather' collects statistics for all objects

Options = > 'gather empty' only collects tables that have not been counted yet

Options = > 'gather stale' only collects tables with more than 10% modifications

Options = > 'gather auto' is equivalent to empty+stale, so we usually set it to AUTO.

Example 3 collects statistics on a partition table

BEGIN

DBMS_STATS.GATHER_TABLE_STATS (ownname = > 'ROBINSON'

Tabname = > 'playing test'

Estimate_percent = > DBMS_STATS.AUTO_SAMPLE_SIZE

Method_opt = > 'for all columns size repeat'

Degree = > DBMS_STATS.AUTO_DEGREE, granularity = > 'ALL'

Cascade= > TRUE

);

END

/ the above example collects statistics for the ROBINSON.P_ test table. One parameter worth noting is granularity, which has seven options.

Granularity = > 'ALL' collects statistics for partitions, subpartitions, and global

Granularity = > 'AUTO' this is the default setting. ORACLE will decide whether to use ALL,GLOBAL AND PARTITION or something else based on the partition type.

Granularity = > 'DEFAULT' this is out of date

Granularity = > 'GLOBAL' collects global statistics

Granularity = > 'GLOBAL AND PARTITION' collects global, partition statistics, but does not collect subpartition statistics

Granularity = > 'PARTITION' collects partition statistics

Granularity = > 'SUBPARTITION' collects subpartition statistics

Of course, we can specify partname to control which partition to collect statistics for.

9. List the scripts that need to collect statistics for the table

Ordinary watch

Set serveroutput on declare-select OVER THE Change RATE TABLES- cursor overchangerate is select a.table_owner, a.table_name, a.insertsOLAPSYS',. Deletes, b.num_rows from dba_tab_modifications a, dba_tables b where a.table_name = b.table_name and table_owner not in ('SYS',' SYSTEM', 'SYSMAN',' DMSYS', 'OLAPSYS',' XDB', 'EXFSYS' 'CTXSYS',' WMSYS', 'DBSNMP',' ORDSYS', 'OUTLN',' TSMSYS', 'MDSYS') and inserts > 0 and partitioned='NO' and a.inserts/decode (b. Numwoods or a.table_name) > = 0.1 or a.table_name = b.table_name and table_owner not in (' SYS', 'SYSTEM',' SYSMAN', 'DMSYS',' OLAPSYS', 'XDB',' EXFSYS', 'CTXSYS',' WMSYS' 'DBSNMP',' ORDSYS', 'OUTLN',' TSMSYS', 'MDSYS') and updates > 0 and partitioned='NO' and a.updates/decode (b.nummmrows0pr.) > = or a.table_name = b.table_name and table_owner not in (' SYS', 'SYSTEM',' SYSMAN', 'DMSYS',' OLAPSYS', 'XDB',' EXFSYS', 'CTXSYS',' WMSYS', 'DBSNMP',' ORDSYS' 'OUTLN',' TSMSYS', 'MDSYS') and deletes > 0 and partitioned='NO' and a.deletes/decode -select the unanalyzed table- cursor nullmonitor is select owner, table_name from dba_tables where owner not in ('SYS',' SYSTEM', 'SYSMAN',' DMSYS', 'OLAPSYS',' XDB', 'EXFSYS',' CTXSYS', 'WMSYS',' DBSNMP', 'ORDSYS',' OUTLN', 'TSMSYS',' MDSYS') and last_analyzed is null; begin dbms_output.enable (1000000) -flush the monitorring information into the dba_tab_modifications DBMS_STATS.FLUSH_DATABASE_MONITORING_INFO -display the unanalyzed table- dbms_output.put_line ('- -') Dbms_output.put_line ('Unalalyzed tables:'); for v_null in nullmonitor loop dbms_output.put_line (v_null.owner | |'. | | v_null.table_name | | 'has not been analyzed, consider gathering statistics'); end loop -display the information- dbms_output.put_line ('- -') Dbms_output.put_line ('Over the Change_Rate 10% v_topinsert.table_owner'); for v_topinsert in overchangerate loop dbms_output.put_line (v_topinsert.table_owner | |'. | | v_topinsert.table_name | | 'once has' | | v_topinsert.num_rows | | 'rows,' | | 'till now inserted' | | v_topinsert.inserts | | 'rows, updated' | | v_topinsert.updates | | 'rows, deleted' | | v_topinsert.deletes | | 'rows. Consider gathering statistics'); end loop Dbms_output.put_line ('-- -') End; / below is the partition table

Set serveroutput on declare-select OVER THE Change RATE TABLES- cursor overchangerate is select a.tableplayowner.tableplayname num_rows from dba_tab_modifications a.partitionplayname num_rows from dba_tab_modifications a (a.inserts) inserts,sum (a.updates) updates,sum (a.deletes) deletes,sum (b.num_rows) Dba_tab_partitions b where a.table_owner = b.table_owner and a.table_name=b.table_name and a.partition_name=b.partition_name and a.table_owner not in ('SYS',' SYSTEM',' SYSMAN', 'DMSYS',' OLAPSYS', 'XDB',' EXFSYS', 'CTXSYS',' WMSYS', 'DBSNMP',' ORDSYS','OUTLN', 'TSMSYS',' MDSYS') group by a.tableroomowner.tableroomname A.partition_name having (sum (a.inserts) / decode (sum (b.num_rows), 0sum (b.num_rows)) > = 0.1or (sum (a.updates) / decode (sum (b.num_rows), 0mai1m (b.num_rows)) > = 0.1or (sum (a.deletes) / decode (sum (b.num_rows), 0Jing 1m sum (b.num_rows) > = 0.1order by a.table_name Begin dbms_output.enable (1000000);-flush the monitorring information into the dba_tab_modifications DBMS_STATS.FLUSH_DATABASE_MONITORING_INFO -display the top_n_insert information- dbms_output.put_line ('-- -') Dbms_output.put_line ('Over the Change_Rate 10% v_topinsert.table_owner'); for v_topinsert in overchangerate loop dbms_output.put_line (v_topinsert.table_owner | |'. | | v_topinsert.table_name | | 'partition' | | v_topinsert.partition_name | | 'once has' | | v_topinsert.num_rows | | 'rows,' | | 'till now inserted' | | v_topinsert.inserts | | 'rows, updated' | | v_topinsert.updates | | 'rows, deleted' | | v_topinsert.deletes | | 'rows. Consider gathering statistics'); end loop Dbms_output.put_line ('-- -') End; / specifically states that there is a related BUG Bug 9272549-User statistics are ignored when dynamic sampling occurs 9272549.8 in the oracle11.2 version that BUG can cause serious performance problems. Oracle officially stated that this issue was resolved only in version 12.1, and the temporary solution was to turn off dynamic sampling manually. By the way, an introduction to dynamic sampling of 10 level is posted.

Level 0: Do not use dynamic sampling.

Level 1: Sample all tables that have not been analyzed if the following criteria are met: (1) there is at least 1 unanalyzed table in the query; (2) this unanalyzed table is joined to another table or appears in a subquery or non-mergeable view; (3) this unanalyzed table has no indexes; (4) this unanalyzed table has more blocks than the number of blocks that would be used for dynamic sampling of this table. The number of blocks sampled is the default number of dynamic sampling blocks (32)

Level 2: Apply dynamic sampling to all unanalyzed tables. The number of blocks sampled is two times the default number of dynamic sampling blocks.

Level 3: Apply dynamic sampling to all tables that meet Level 2 criteria, plus all tables for which standard selectivity estimation used a guess for some predicate that is a potential dynamic sampling predicate. The number of blocks sampled is the default number of dynamic sampling blocks. For unanalyzed tables, the number of blocks sampled is two times the default number of dynamic sampling blocks.

Level 4: Apply dynamic sampling to all tables that meet Level 3 criteria, plus all tables that have single-table predicates that reference 2 or more columns. The number of blocks sampled is the default number of dynamic sampling blocks. For unanalyzed tables, the number of blocks sampled is two times the default number of dynamic sampling blocks.

Levels 5, 6, 7, 8, and 9: Apply dynamic sampling to all tables that meet the previous level criteria using 2, 4, 8, 32, or 128 times the default number of dynamic sampling blocks respectively.

Level 10: Apply dynamic sampling to all tables that meet the Level 9 criteria using all blocks in the table.

Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.

*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.