Oracle Statistics (1) 04/17 Update SLTechnology News&Howtos

Oracle Statistics (1)

2025-04-17 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Database >

Shulou(Shulou.com)06/01 Report--

Statistical information

Overview of Statistical Information

Statistical information is mainly a kind of information that describes the size, scale and data distribution of tables and indexes in the database. For example, the number of rows in the table, the number of blocks, the average size of each row, the leaf blocks of the index, the number of rows in the index field, the size of different values, and so on, are all statistical information. Based on these statistical data, CBO calculates the cost of various plans under different access paths and different join methods, and finally selects the plan with the lowest cost.

Under the condition of CBO (cost-based optimizer mode), the execution plan of SQL statements is determined by statistical information. If there is no statistical information, the execution plan is determined by dynamic sampling! It can be said that statistics are related to whether the execution plan of sql is correct, which belongs to the guiding ideology of sql execution. The initialization parameter statistics_level of oracle controls the level at which statistics are collected. There are three parameter values:

BASIC:-collect basic statistics

TYPICAL:-- collects most of the statistics (default settings for the database)

ALL:-- collects all statistics

After Oracle 10g, Query Optimizer has adopted CBO as the default optimizer, and Oracle officially no longer supports RBO services. However, through the optimizer parameter optimizer_mode, we can control the Oracle optimizer to generate execution plans in different modes.

How to collect the contents of statistics:

1) Table statistics

Number of rows-number of Lin

Number of blocks-number of block

Average row length-the average length of a row.

2) Column statistics

Number of distinct values (NDV) in column-the value of distinct in the column

Number of nulls in column-the value of null in the column

Data distribution (histogram)-data distribution

3) Index statistics

Number of leaf blocks-number of blocks of child nod

Levels-number of child nodes

Clustering factor-cluster factor

4) System statistics

I performance and utilization-IO performance and utilization

CPU performance and utilization-performance and utilization of CPU

The way statistics are collected

The collection of Oracle Statistic can be collected using either the analyze command or the DBMS_STATS package. Oracle recommends using the DBMS_STATS package to collect statistics because the DBMS_STATS package is more extensive and more accurate. Analyze may be removed in future releases.

Several common procedures for DBMS_STATS are as follows:

Dbms_stats.gather_table_stats collects statistics for tables, columns, and indexes

Dbms_stats.gather_schema_stats collects statistics for all objects under SCHEMA

Dbms_stats.gather_index_stats collects statistics for the index

Dbms_stats.gather_system_stats collects system statistics

Dbms_stats.GATHER_DICTIONARY_STATS: statistics of all dictionary objects

DBMS_STATS.GATHER_DICTIONARY_STATS collects statistics for all system patterns

Dbms_stats.delete_table_stats delete table statistics

Statistics for dbms_stats.delete_index_stats delete index

Statistics for the dbms_stats.export_table_stats output table

Dbms_stats.create_state_table

Statistics of dbms_stats.set_table_stats settings table

Dbms_stats.auto_sample_size

The syntax of the analyze command is:

Analyze table tablename compute statistics

Analyze table tablename compute statistics for all indexes

Analyze table tablename delete statistics

2.3 automatically collect statistics

There are two kinds of collection of Statistic information in Oracle: automatic collection and manual collection.

The working principle of automatic collection

Oracle's Automatic Statistics Gathering is collected and maintained through Scheduler. The name of the Job is GATHER_STATS_JOB, and the Job collects two kinds of statistics for all objects in the database:

(1) Missing statistics (missing statistics)

(2) Stale statistics (obsolete statistics)

The Job is created automatically when the database is created and is managed by Scheduler. Scheduler runs gather job when maintenance windows open. By default, 11g automatically collects statistics for objects through a job (gather_stat_job) from Monday to Friday (22: 00 p.m.-2:00 the next day). This way of automatically collecting statistics is not to collect statistics for all objects, but to collect objects that do not have statistics and objects whose statistics are too old. Then prioritize and start the statistics.

The stop_on_window_close property of Scheduler Job controls whether GATHER_STATS_JOB continues. The default value of this property is True. If this value is set to False, GATHER_STATS_JOB will be interrupted, and objects that have not been collected will continue to be collected the next time they are started.

Gather_stats_job calls the dbms_stats.gather_database_stats_job_proc procedure to collect information about statistics. The conditions for this process to collect object statistics are as follows:

(1) the statistics of the object have not been collected before.

(2) when more than 10% of the rows of an object is modified, the statistical information of the object is also called stale statistics.

But for highly variable tables that are TRUNCATE/DROP and rebuilt during daytime activity or blocks loading objects that exceed 10% of their total size, we can set the statistics on these tables to NULL

You can view the job at the following SQL:

Select job_name, program_name, enabled, stop_on_window_close

From dba_scheduler_jobs

Where job_name = 'gather_stats_job'

Parameter that controls the automatic collection of statistics: STATISTICS_LEVEL

To decide whether to monitor the object, Oracle provides a parameter STATISTICS_LEVEL. Set the initialization parameter STATISTIC_LEVEL

Typical or all, you can automatically collect statistics (the default is TYPICAL, so you can immediately enable automatic statistics collection). The value of the STATISTIC_LEVEL parameter activates GATHER_STATS_JOB. Basic:STATISTICS_LEVEL is set to basic, which not only cannot monitor the table, but also disables the following new 10g features. Table monitoring is activated by default in 10g:

(1) ASH (Active Session History)

(2) ASSM (Automatic Shared Memory Management)

(3) AWR (Automatic Workload Repository)

(4) ADDM (Automatic Database Diagnostic Monitor)

Explanation of the process of automatic collection of statistical information

When the monitoring of the object is started, the information collected from the last statistics, such as inserts,updates,deletes, etc., will be recorded in the user_tab_modifications view.

When the object's data changes, after a few minutes of delay, the information is written to the user_tab_modifications view, and then the dbms_stats.flush_database_monitoring_info process discovers the information and stores it in memory.

When 10% of the monitored object is modified, the gather_database_stats or gather_schema_stats process will collect the stale statistics.

View automatic collection statistics

View the task and status of automatically collecting statistics:

Select client_name,status from dba_autotask_client

Tasks that disable automatic collection of statistics

Exec DBMS_AUTO_TASK_ADMIN.DISABLE (client_name = > 'auto optimizer stats collection',operation = > NULL,window_name = > NULL)

Check again

Enable the task of automatically collecting statistics

Exec DBMS_AUTO_TASK_ADMIN.ENABLE (client_name = > 'auto optimizer stats collection',operation = > NULL,window_name = > NULL)

Get the execution time of the current automatic collection of statistics

Col WINDOW_NAME format a20

Col REPEAT_INTERVAL format a70

Col DURATION format a20

Set line 180

SELECT w.window_name, w.repeat_interval,w.duration, w.enabled

FROM dba_autotask_window_clients c, dba_scheduler_windows w

WHERE c.window_name = w.window_name AND c.optimizer_stats = 'ENABLED'

The parameters of the above figure are explained below.

The FREQ keyword is used to specify the time period of the interval. Optional parameters are: YEARLY (year), MONTHLY (month), WEEKLY (week), DAILY (day), HOURLY (hour), MINUTELY (minute), SECONDLY (second) and other units.

The INTERVAL keyword is used to specify the frequency of intervals, with values ranging from 1 to 999.

BYHOUR specifies the hour of the day. The range of values that can be specified is 1-24. 16, 17, 18 means 4, 5, 6 o'clock in the afternoon every day.

The BYDAY keyword is used to specify which day of the week to run.

The BYMONTHDAY keyword is used to specify which day of the month. -1 means the last day of each month.

The BYMONTH keyword is used to specify the month of each year.

BYDATE specifies the date. 0310 means March 10th.

Https://blog.csdn.net/zq9017197/article/details/6985109

-- WINDOW_NAME: task name

-- REPEAT_INTERVAL: the interval between task repetition

-- DURATION: duration

-- enabled for true indicates that these jobs are currently active

View the execution of automatic collection statistics history

SELECT * FROM dba_autotask_client_history

WHERE client_name LIKE'% stats%'

-maintenance window group

Select * from dba_scheduler_window_groups

-- maintenance window group corresponding window

Select * from dba_scheduler_wingroup_members

-- maintain window history information

Select* from dba_scheduler_windows

-- query the job that the automatic collection task is executing

Select * from DBA_AUTOTASK_CLIENT_JOB

-- query the execution status of automatic collection task history

Select * from DBA_AUTOTASK_JOB_HISTORY

Select * from DBA_AUTOTASK_CLIENT_HISTORY

Modify the time when the statistics are executed

1. Close a single scheduling time window, where we close MONDAY_WINDOW

BEGIN

DBMS_AUTO_TASK_ADMIN.disable (

Client_name = > 'auto optimizer stats collection'

Operation = > NULL

Window_name = > 'MONDAY_WINDOW')

END

-- verify the shutdown, as follows. Optimizer_stats is listed as DISABLED.

SELECT window_name,window_next_time,window_active,optimizer_stats

FROM dba_autotask_window_clients

WHERE window_name = 'MONDAY_WINDOW'

ORDER BY window_next_time

two。 To open a single scheduling time window and all scheduling windows, you only need to use the enable process

-- Note: window name should be specified individually, such as window_name = > 'MONDAY_WINDOW'

BEGIN

DBMS_AUTO_TASK_ADMIN.enable (

Client_name = > 'auto optimizer stats collection'

Operation = > NULL

Window_name = > 'MONDAY_WINDOW')

END

-1. Stop the task

BEGIN

DBMS_SCHEDULER.DISABLE (

Name= >'"SYS". "MONDAY_WINDOW"'

Force= > TRUE)

END

Use the following script to verify that it is stopped

SELECT window_name,window_next_time,window_active,optimizer_stats

FROM dba_autotask_window_clients

WHERE window_name = 'FRIDAY_WINDOW'

ORDER BY window_next_time

-- 2. Modify the duration of the task in minutes

BEGIN

DBMS_SCHEDULER.SET_ATTRIBUTE (

Name= >'"SYS". "FRIDAY_WINDOW"'

Attribute= > 'DURATION'

Value= > numtodsinterval (180, 'minute'))

END

-3. Start execution time, BYHOUR=2, which means execution starts at 2: 00.

BEGIN

DBMS_SCHEDULER.SET_ATTRIBUTE (

Name= >'"SYS". "FRIDAY_WINDOW"'

Attribute= > 'REPEAT_INTERVAL'

Value= > 'FREQ=WEEKLY;BYDAY=MON;BYHOUR=2;BYMINUTE=0;BYSECOND=0')

END

-- 4. Start the mission.

BEGIN

DBMS_SCHEDULER.ENABLE (

Name= >'"SYS". "MONDAY_WINDOW"')

END

Modify the time window to a specific time

-- the following example changes the Friday time window time to 23:30 in the evening

BEGIN

DBMS_SCHEDULER.DISABLE (name = >'"SYS". "FRIDAY_WINDOW"', force = > TRUE)

END

BEGIN

DBMS_SCHEDULER.SET_ATTRIBUTE (

Name = >'"SYS". "FRIDAY_WINDOW"'

Attribute = > 'REPEAT_INTERVAL'

VALUE = > 'FREQ=WEEKLY;BYDAY=FRI;BYHOUR=23;BYMINUTE=30;BYSECOND=0')

END

BEGIN

DBMS_SCHEDULER.ENABLE (name = >'"SYS". "FRIDAY_WINDOW"')

END

-- validate changes

SELECT w.window_name

W.repeat_interval

W.duration

W.enabled

FROM dba_autotask_window_clients c, dba_scheduler_windows w

WHERE c.window_name = w.window_name

AND c.optimizer_stats = 'ENABLED'

AND c.window_name = 'FRIDAY_WINDOW'

WINDOW_NAME REPEAT_INTERVAL DURATION ENABL

FRIDAY_WINDOW FREQ=WEEKLY;BYDAY=FRI;BYHOUR=23;BYMINUTE=30;BYSECOND=0 + 000 04:00:0 TRUE

7. Add maintenance time window

Suppose we are dealing with modifying Monday's time window

First of all, close the time window on Monday

BEGIN

DBMS_AUTO_TASK_ADMIN.disable (

Client_name = > 'auto optimizer stats collection'

Operation = > NULL

Window_name = > 'MONDAY_WINDOW')

END

Next, create a window and set the time scheduling interval

As follows, it will be executed at 5 o'clock every Monday for a duration of 1 hour.

BEGIN

DBMS_SCHEDULER.create_window (

Window_name = > 'STATS_WINDOW'

Resource_plan = > 'DEFAULT_MAINTENANCE_PLAN'

Repeat_interval = > 'freq=daily;byday=MON;byhour=5;byminute=0; bysecond=0'

Duration = > INTERVAL'1' HOUR

Comments = > 'Test window for stats task')

EXCEPTION

WHEN OTHERS

THEN

IF SQLCODE =-27477

THEN

NULL

ELSE

RAISE

END IF

END

BEGIN

DBMS_SCHEDULER.set_attribute ('STATS_WINDOW',' SYSTEM', TRUE)

DBMS_SCHEDULER.set_attribute ('STATS_WINDOW'

'FOLLOW_DEFAULT_TIMEZONE'

TRUE)

EXCEPTION

WHEN OTHERS

THEN

IF SQLCODE =-27477

THEN

NULL

ELSE

RAISE

END IF

END

SQL > SELECT window_name, repeat_interval, enabled

2 FROM dba_scheduler_windows

3 WHERE window_name = 'STATS_WINDOW'

WINDOW_NAME REPEAT_INTERVAL ENABL

STATS_WINDOW freq=daily;byday=MON;byhour=5;byminute=0; bysecond=0 TRUE

Next add the window STATS_WINDOW to the maintenance window group

BEGIN

DBMS_SCHEDULER.add_window_group_member ('MAINTENANCE_WINDOW_GROUP'

'STATS_WINDOW')

END

SQL > SELECT WINDOW_NAME

2 WINDOW_NEXT_TIME

3 WINDOW_ACTIVE

4 OPTIMIZER_STATS

5 FROM DBA_AUTOTASK_WINDOW_CLIENTS

6 WHERE WINDOW_NAME in ('STATS_WINDOW','MONDAY_WINDOW')

7 ORDER BY WINDOW_NEXT_TIME

WINDOW_NAME WINDOW_NEXT_TIME WINDO OPTIMIZE

STATS_WINDOW 27-MAR-17 05.00.00.000000 AM PRC FALSE ENABLED-allow

MONDAY_WINDOW 27-MAR-17 10.00.00.000000 PM PRC FALSE DISABLED-currently disabled

Turn off the automatic statistics command, exec DBMS_SCHEDULER.DISABLE ('GATHER_STATS_JOB')

Enable automatic statistics command, exec DBMS_SCHEDULER.ENABLE ('GATHER_STATS_JOB')

Check to see if automatic statistics are on

Oracle 10g: SELECT OWNER,JOB_NAME,ENABLED FROM DBA_SCHEDULER_JOBS WHERE JOB_NAME = 'GATHER_STATS_JOB'

Oracle 11g: select t1.owner, t1.job_name, t1.enabled from dba_scheduler_jobs T1 where t1.job_name = 'BSLN_MAINTAIN_STATS_JOB'

Collect statistics manually

Statistics for the table

Contains the number of rows in the table, number of blocks used, number of empty blocks, block utilization, number of row migrations and links, pctfree,pctused data, average row size:

Select num_rows, blocks, empty_blocks, avg_space, chain_cnt, avg_row_len from user_tables

NUM_ROWS-- the number of records in the table

BLOCKS,-- the number of blocks occupied by data in the table

EMPTY_BLOCKS-- the number of empty blocks in the table

AVG_SPACE,-- average usage space in the data block

CHAIN_CNT,-the number of row joins and row migrations in the table

AVG_ROW_LEN-the average length of each record

Statistics for index columns

Contains the depth of the index (B-Tree level), the number of blocks at the index leaf level, the cluster factor (clustering_factor), and the number of unique values.

Select blevel, leaf_blocks, distinct_keys, avg_leaf_blocks_per_key

Avg_data_blocks_per_key, clustering_factor from user_indexes

BLEVEL,-- the number of layers of the index

LEAF_BLOCKS-- the number of leaf nodes

DISTINCT_KEYS,-the number of unique values

AVG_LEAF_BLOCKS_PER_KEY,-- average number of leaf blocks per KEY

AVG_DATA_BLOCKS_PER_KEY,-- average number of blocks per KEY

CLUSTERING_FACTOR-clustering factor

Statistics for the column

Contains unique number of values, column maximum size, density (selection rate), data distribution (histogram information), number of NUL values

Select num_distinct,low_value, high_value

Density, num_nulls, num_buckets, histogram from user_tab_columns

NUM_DISTINCT,-the number of unique values

LOW_VALUE,-- the minimum value on the column

HIGH_VALUE,-- the maximum value on the column

DENSITY,-- selection factor (density)

NUM_NULLS,-the number of null values

NUM_BUCKETS,-- the number of BUCKET in the histogram

HISTOGRAM-the type of histogram

For the collection of statistical information, I would like to talk about some personal understanding:

1. Statistics are stored in the data dictionary table by default, and only the statistics in the data dictionary affect CBO.

The CREATE_STAT_TABLE process provided by 2.DBMS_STATS is only to generate a user-defined table in a specific format to store statistics, and the statistics in this table will not affect the statistics.

In the process of 3.GATHER series, if the stattab,statid,statown parameter is specified (or may not be specified), the collected statistics are not only updated to the data dictionary, but also stored in the stattab table under the statown user, marked as statid.

In the process of 4.EXPORT and IMPORT series, the stattab,statid,statown parameter cannot be empty, which means exporting the current statistics in the data dictionary to the user-defined table and importing the statistics in the user table into the data dictionary. It is obvious that the import operation here and the above GATHER operation will change the statistics and may cause changes in the execution plan, so be careful.

5. It is necessary to back up the old statistics before each collection of statistics, especially to keep one or more copies of the statistics of the system during the stable period.

6. How often to collect statistical information is important for how to back up and retain statistical information, how to select appropriate sampling, parallelism, histogram setting and so on. It is necessary to design a better statistical information collection strategy.

3. Collection of statistical information and common data dictionaries

3.1 Statistical Information Common data Dictionary

Statistics collect the following data:

(1) the analysis of the table itself: including the number of rows in the table, the number of data blocks, the president and other information.

(2) column analysis: including the repeated number of column values, the null values on the column, and the distribution of data on the column.

(3) Analysis of the index: including the number of leaf blocks of the index, the depth of the index, the aggregation factor of the index and so on.

These statistics are stored in the following data dictionary:

DBA_TABLES

DBA_OBJECT_TABLES

DBA_TAB_STATISTICS

DBA_TAB_COL_STATISTICS

DBA_TAB_HISTOGRAMS

DBA_INDEXES

DBA_IND_STATISTICS

DBA_CLUSTERS

DBA_TAB_PARTITIONS

DBA_TAB_SUBPARTITIONS

DBA_IND_PARTITIONS

DBA_IND_SUBPARTITIONS

DBA_PART_COL_STATISTICS

DBA_PART_HISTOGRAMS

DBA_SUBPART_COL_STATISTICS

DBA_SUBPART_HISTOGRAMS

3.2 Statistics for OLAP and OLTP

In the OLTP system, statistics must be collected and the JOB that comes with the database must be closed and the statistics collection policy must be customized. This is because the data in the OLTP system changes frequently and the business is very complex. Let me give you an example: the original estimate returns 1W rows, which causes the CBO estimate to return 1 row.

Many OLAP systems do not collect, directly use HINT fixed, because the OLAP system I get O pressure, the table is also very large, the collection of statistical information is very resource-consuming and the data of OLAP system tables are changing at any time (because real-time storage), so it is necessary to collect statistics frequently, so it is not realistic for OLAP system to collect statistics.

4. Dynamic sampling of statistical information

After Oracle 10g, if a table is not analyzed, the database will automatically do dynamic sampling analysis on it, so here we use hint to set the level of dynamic sampling to 0, that is, do not use dynamic sampling.

According to different optimizer_dynamic_sampling levels, he has a total of 10 levels, which are as follows:

Level 0: Do not use dynamic sampling.

Level 1: The number of blocks sampled is the default number of dynamic sampling blocks (32).

Level 2: The number of blocks sampled is 2 times the default number of dynamic sampling blocks.

Level 3: The number of blocks sampled is 4 times the default number of dynamic sampling blocks.

Level 4: The number of blocks sampled is 8 times the default number of dynamic sampling blocks.

Level 5: The number of blocks sampled is 16 times the default number of dynamic sampling blocks.

Level 6: The number of blocks sampled is 32 times the default number of dynamic sampling blocks.

Level 7: The number of blocks sampled is 64 times the default number of dynamic sampling blocks.

Level 8: The number of blocks sampled is 128 times the default number of dynamic sampling blocks.

Level 9: The number of blocks sampled is 256 times the default number of dynamic sampling blocks.

Level 10: Read all blocks in the table

Use dynamic sampling method to collect statistics for temporary tables in a system

Select / * + dynamic_sampling (t 0) * / * from t where object_id > 30

Http://www.cnblogs.com/kerrycode/p/3854971.html

Experiment

For the newly created table, oracle dynamically collects information about the table when it is accessed, and then collects it into the data dictionary at 10:00 in the evening.

[html] view plain copy

SQL > set autotrace offSQL > set linesize 1000SQL > drop table t_sample purge;drop table t_sample purge*ERROR at line 1:ORA-00942: table or view does not existSQL > create table t_sample as select * from dba_objects;Table created.SQL > create index idx_t_sample_objid on t_sample (object_id); Index created.

No statistics were found for the newly created table

[html] view plain copy

SQL > select num_rows, blocks, last_analyzed from user_tables where table_name = 'Tunable SAMPLESTION NUMPLESTROWS BLOCKS LAST_ANAL

[html] view plain copy

SQL > set autotrace traceonlySQL > set linesize 1000SQL > select * from t_sample where object_id=20 Execution Plan---Plan hash value: 1453182238Murray- -| Id | Operation | Name | Rows | Bytes | Cost (% CPU) | Time |- -| 0 | SELECT STATEMENT | | 1 | 20 | 2 (0) | 00:00:01 | | 1 | TABLE ACCESS BY INDEX ROWID | T_SAMPLE | 1 | 207 | 2 (0) | 00:00:01 | | * 2 | INDEX RANGE SCAN | IDX_T_SAMPLE_OBJID | 1 | 1 (0) | 00:00:01 |-- -Predicate Information (identified by operation id):- -2-access ("OBJECT_ID" = 20) Note- dynamic sampling used for this statement (level=2) Statistics -24 recursive calls0 db block gets93 consistent gets1 physical reads0 redo size1608 bytes sent via SQL*Net to client523 bytes received via SQL*Net from client2 SQL*Net roundtrips to/from client0 sorts (memory) 0 sorts (disk) 1 rows processed

-dynamic sampling used for this statement (level=2) represents dynamic sampling, but does not record data dictionaries unless you manually collect statistics for the table.

[html] view plain copy

SQL > select num_rows, blocks, last_analyzed from user_tables where table_name = 'Tunable SAMPLESTION NUMPLESTROWS BLOCKS LAST_ANAL- SQL >

Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.

*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.