In addition to Weibo, there is also WeChat
Please pay attention
WeChat public account
Shulou
2025-03-28 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Database >
Share
Shulou(Shulou.com)06/01 Report--
Statistical information
Overview of Statistical Information
Statistical information is mainly a kind of information that describes the size, scale and data distribution of tables and indexes in the database. For example, the number of rows in the table, the number of blocks, the average size of each row, the leaf blocks of the index, the number of rows in the index field, the size of different values, and so on, are all statistical information. Based on these statistical data, CBO calculates the cost of various plans under different access paths and different join methods, and finally selects the plan with the lowest cost.
Under the condition of CBO (cost-based optimizer mode), the execution plan of SQL statements is determined by statistical information. If there is no statistical information, the execution plan is determined by dynamic sampling! It can be said that statistics are related to whether the execution plan of sql is correct, which belongs to the guiding ideology of sql execution. The initialization parameter statistics_level of oracle controls the level at which statistics are collected. There are three parameter values:
BASIC:-collect basic statistics
TYPICAL:-- collects most of the statistics (default settings for the database)
ALL:-- collects all statistics
After Oracle 10g, Query Optimizer has adopted CBO as the default optimizer, and Oracle officially no longer supports RBO services. However, through the optimizer parameter optimizer_mode, we can control the Oracle optimizer to generate execution plans in different modes.
How to collect the contents of statistics:
1) Table statistics
Number of rows-number of Lin
Number of blocks-number of block
Average row length-the average length of a row.
2) Column statistics
Number of distinct values (NDV) in column-the value of distinct in the column
Number of nulls in column-the value of null in the column
Data distribution (histogram)-data distribution
3) Index statistics
Number of leaf blocks-number of blocks of child nod
Levels-number of child nodes
Clustering factor-cluster factor
4) System statistics
I performance and utilization-IO performance and utilization
CPU performance and utilization-performance and utilization of CPU
The way statistics are collected
The collection of Oracle Statistic can be collected using either the analyze command or the DBMS_STATS package. Oracle recommends using the DBMS_STATS package to collect statistics because the DBMS_STATS package is more extensive and more accurate. Analyze may be removed in future releases.
Several common procedures for DBMS_STATS are as follows:
Dbms_stats.gather_table_stats collects statistics for tables, columns, and indexes
Dbms_stats.gather_schema_stats collects statistics for all objects under SCHEMA
Dbms_stats.gather_index_stats collects statistics for the index
Dbms_stats.gather_system_stats collects system statistics
Dbms_stats.GATHER_DICTIONARY_STATS: statistics of all dictionary objects
DBMS_STATS.GATHER_DICTIONARY_STATS collects statistics for all system patterns
Dbms_stats.delete_table_stats delete table statistics
Statistics for dbms_stats.delete_index_stats delete index
Statistics for the dbms_stats.export_table_stats output table
Dbms_stats.create_state_table
Statistics of dbms_stats.set_table_stats settings table
Dbms_stats.auto_sample_size
The syntax of the analyze command is:
Analyze table tablename compute statistics
Analyze table tablename compute statistics for all indexes
Analyze table tablename delete statistics
2.3 automatically collect statistics
There are two kinds of collection of Statistic information in Oracle: automatic collection and manual collection.
The working principle of automatic collection
Oracle's Automatic Statistics Gathering is collected and maintained through Scheduler. The name of the Job is GATHER_STATS_JOB, and the Job collects two kinds of statistics for all objects in the database:
(1) Missing statistics (missing statistics)
(2) Stale statistics (obsolete statistics)
The Job is created automatically when the database is created and is managed by Scheduler. Scheduler runs gather job when maintenance windows open. By default, 11g automatically collects statistics for objects through a job (gather_stat_job) from Monday to Friday (22: 00 p.m.-2:00 the next day). This way of automatically collecting statistics is not to collect statistics for all objects, but to collect objects that do not have statistics and objects whose statistics are too old. Then prioritize and start the statistics.
The stop_on_window_close property of Scheduler Job controls whether GATHER_STATS_JOB continues. The default value of this property is True. If this value is set to False, GATHER_STATS_JOB will be interrupted, and objects that have not been collected will continue to be collected the next time they are started.
Gather_stats_job calls the dbms_stats.gather_database_stats_job_proc procedure to collect information about statistics. The conditions for this process to collect object statistics are as follows:
(1) the statistics of the object have not been collected before.
(2) when more than 10% of the rows of an object is modified, the statistical information of the object is also called stale statistics.
But for highly variable tables that are TRUNCATE/DROP and rebuilt during daytime activity or blocks loading objects that exceed 10% of their total size, we can set the statistics on these tables to NULL
You can view the job at the following SQL:
Select job_name, program_name, enabled, stop_on_window_close
From dba_scheduler_jobs
Where job_name = 'gather_stats_job'
Parameter that controls the automatic collection of statistics: STATISTICS_LEVEL
To decide whether to monitor the object, Oracle provides a parameter STATISTICS_LEVEL. Set the initialization parameter STATISTIC_LEVEL
Typical or all, you can automatically collect statistics (the default is TYPICAL, so you can immediately enable automatic statistics collection). The value of the STATISTIC_LEVEL parameter activates GATHER_STATS_JOB. Basic:STATISTICS_LEVEL is set to basic, which not only cannot monitor the table, but also disables the following new 10g features. Table monitoring is activated by default in 10g:
(1) ASH (Active Session History)
(2) ASSM (Automatic Shared Memory Management)
(3) AWR (Automatic Workload Repository)
(4) ADDM (Automatic Database Diagnostic Monitor)
Explanation of the process of automatic collection of statistical information
When the monitoring of the object is started, the information collected from the last statistics, such as inserts,updates,deletes, etc., will be recorded in the user_tab_modifications view.
When the object's data changes, after a few minutes of delay, the information is written to the user_tab_modifications view, and then the dbms_stats.flush_database_monitoring_info process discovers the information and stores it in memory.
When 10% of the monitored object is modified, the gather_database_stats or gather_schema_stats process will collect the stale statistics.
View automatic collection statistics
View the task and status of automatically collecting statistics:
Select client_name,status from dba_autotask_client
Tasks that disable automatic collection of statistics
Exec DBMS_AUTO_TASK_ADMIN.DISABLE (client_name = > 'auto optimizer stats collection',operation = > NULL,window_name = > NULL)
Check again
Enable the task of automatically collecting statistics
Exec DBMS_AUTO_TASK_ADMIN.ENABLE (client_name = > 'auto optimizer stats collection',operation = > NULL,window_name = > NULL)
Get the execution time of the current automatic collection of statistics
Col WINDOW_NAME format a20
Col REPEAT_INTERVAL format a70
Col DURATION format a20
Set line 180
SELECT w.window_name, w.repeat_interval,w.duration, w.enabled
FROM dba_autotask_window_clients c, dba_scheduler_windows w
WHERE c.window_name = w.window_name AND c.optimizer_stats = 'ENABLED'
The parameters of the above figure are explained below.
The FREQ keyword is used to specify the time period of the interval. Optional parameters are: YEARLY (year), MONTHLY (month), WEEKLY (week), DAILY (day), HOURLY (hour), MINUTELY (minute), SECONDLY (second) and other units.
The INTERVAL keyword is used to specify the frequency of intervals, with values ranging from 1 to 999.
BYHOUR specifies the hour of the day. The range of values that can be specified is 1-24. 16, 17, 18 means 4, 5, 6 o'clock in the afternoon every day.
The BYDAY keyword is used to specify which day of the week to run.
The BYMONTHDAY keyword is used to specify which day of the month. -1 means the last day of each month.
The BYMONTH keyword is used to specify the month of each year.
BYDATE specifies the date. 0310 means March 10th.
Https://blog.csdn.net/zq9017197/article/details/6985109
-- WINDOW_NAME: task name
-- REPEAT_INTERVAL: the interval between task repetition
-- DURATION: duration
-- enabled for true indicates that these jobs are currently active
View the execution of automatic collection statistics history
SELECT * FROM dba_autotask_client_history
WHERE client_name LIKE'% stats%'
-maintenance window group
Select * from dba_scheduler_window_groups
-- maintenance window group corresponding window
Select * from dba_scheduler_wingroup_members
-- maintain window history information
Select* from dba_scheduler_windows
-- query the job that the automatic collection task is executing
Select * from DBA_AUTOTASK_CLIENT_JOB
-- query the execution status of automatic collection task history
Select * from DBA_AUTOTASK_JOB_HISTORY
Select * from DBA_AUTOTASK_CLIENT_HISTORY
Modify the time when the statistics are executed
1. Close a single scheduling time window, where we close MONDAY_WINDOW
BEGIN
DBMS_AUTO_TASK_ADMIN.disable (
Client_name = > 'auto optimizer stats collection'
Operation = > NULL
Window_name = > 'MONDAY_WINDOW')
END
/
-- verify the shutdown, as follows. Optimizer_stats is listed as DISABLED.
SELECT window_name,window_next_time,window_active,optimizer_stats
FROM dba_autotask_window_clients
WHERE window_name = 'MONDAY_WINDOW'
ORDER BY window_next_time
two。 To open a single scheduling time window and all scheduling windows, you only need to use the enable process
-- Note: window name should be specified individually, such as window_name = > 'MONDAY_WINDOW'
BEGIN
DBMS_AUTO_TASK_ADMIN.enable (
Client_name = > 'auto optimizer stats collection'
Operation = > NULL
Window_name = > 'MONDAY_WINDOW')
END
/
-1. Stop the task
BEGIN
DBMS_SCHEDULER.DISABLE (
Name= >'"SYS". "MONDAY_WINDOW"'
Force= > TRUE)
END
/
Use the following script to verify that it is stopped
SELECT window_name,window_next_time,window_active,optimizer_stats
FROM dba_autotask_window_clients
WHERE window_name = 'FRIDAY_WINDOW'
ORDER BY window_next_time
-- 2. Modify the duration of the task in minutes
BEGIN
DBMS_SCHEDULER.SET_ATTRIBUTE (
Name= >'"SYS". "FRIDAY_WINDOW"'
Attribute= > 'DURATION'
Value= > numtodsinterval (180, 'minute'))
END
/
-3. Start execution time, BYHOUR=2, which means execution starts at 2: 00.
BEGIN
DBMS_SCHEDULER.SET_ATTRIBUTE (
Name= >'"SYS". "FRIDAY_WINDOW"'
Attribute= > 'REPEAT_INTERVAL'
Value= > 'FREQ=WEEKLY;BYDAY=MON;BYHOUR=2;BYMINUTE=0;BYSECOND=0')
END
/
-- 4. Start the mission.
BEGIN
DBMS_SCHEDULER.ENABLE (
Name= >'"SYS". "MONDAY_WINDOW"')
END
/
Modify the time window to a specific time
Modify the time window to a specific time
-- the following example changes the Friday time window time to 23:30 in the evening
BEGIN
DBMS_SCHEDULER.DISABLE (name = >'"SYS". "FRIDAY_WINDOW"', force = > TRUE)
END
BEGIN
DBMS_SCHEDULER.SET_ATTRIBUTE (
Name = >'"SYS". "FRIDAY_WINDOW"'
Attribute = > 'REPEAT_INTERVAL'
VALUE = > 'FREQ=WEEKLY;BYDAY=FRI;BYHOUR=23;BYMINUTE=30;BYSECOND=0')
END
BEGIN
DBMS_SCHEDULER.ENABLE (name = >'"SYS". "FRIDAY_WINDOW"')
END
-- validate changes
SELECT w.window_name
W.repeat_interval
W.duration
W.enabled
FROM dba_autotask_window_clients c, dba_scheduler_windows w
WHERE c.window_name = w.window_name
AND c.optimizer_stats = 'ENABLED'
AND c.window_name = 'FRIDAY_WINDOW'
WINDOW_NAME REPEAT_INTERVAL DURATION ENABL
-
FRIDAY_WINDOW FREQ=WEEKLY;BYDAY=FRI;BYHOUR=23;BYMINUTE=30;BYSECOND=0 + 000 04:00:0 TRUE
7. Add maintenance time window
Suppose we are dealing with modifying Monday's time window
First of all, close the time window on Monday
BEGIN
DBMS_AUTO_TASK_ADMIN.disable (
Client_name = > 'auto optimizer stats collection'
Operation = > NULL
Window_name = > 'MONDAY_WINDOW')
END
/
Next, create a window and set the time scheduling interval
As follows, it will be executed at 5 o'clock every Monday for a duration of 1 hour.
BEGIN
DBMS_SCHEDULER.create_window (
Window_name = > 'STATS_WINDOW'
Resource_plan = > 'DEFAULT_MAINTENANCE_PLAN'
Repeat_interval = > 'freq=daily;byday=MON;byhour=5;byminute=0; bysecond=0'
Duration = > INTERVAL'1' HOUR
Comments = > 'Test window for stats task')
EXCEPTION
WHEN OTHERS
THEN
IF SQLCODE =-27477
THEN
NULL
ELSE
RAISE
END IF
END
BEGIN
DBMS_SCHEDULER.set_attribute ('STATS_WINDOW',' SYSTEM', TRUE)
DBMS_SCHEDULER.set_attribute ('STATS_WINDOW'
'FOLLOW_DEFAULT_TIMEZONE'
TRUE)
EXCEPTION
WHEN OTHERS
THEN
IF SQLCODE =-27477
THEN
NULL
ELSE
RAISE
END IF
END
/
SQL > SELECT window_name, repeat_interval, enabled
2 FROM dba_scheduler_windows
3 WHERE window_name = 'STATS_WINDOW'
WINDOW_NAME REPEAT_INTERVAL ENABL
-
STATS_WINDOW freq=daily;byday=MON;byhour=5;byminute=0; bysecond=0 TRUE
Next add the window STATS_WINDOW to the maintenance window group
BEGIN
DBMS_SCHEDULER.add_window_group_member ('MAINTENANCE_WINDOW_GROUP'
'STATS_WINDOW')
END
/
SQL > SELECT WINDOW_NAME
2 WINDOW_NEXT_TIME
3 WINDOW_ACTIVE
4 OPTIMIZER_STATS
5 FROM DBA_AUTOTASK_WINDOW_CLIENTS
6 WHERE WINDOW_NAME in ('STATS_WINDOW','MONDAY_WINDOW')
7 ORDER BY WINDOW_NEXT_TIME
WINDOW_NAME WINDOW_NEXT_TIME WINDO OPTIMIZE
-
STATS_WINDOW 27-MAR-17 05.00.00.000000 AM PRC FALSE ENABLED-allow
MONDAY_WINDOW 27-MAR-17 10.00.00.000000 PM PRC FALSE DISABLED-currently disabled
Turn off the automatic statistics command, exec DBMS_SCHEDULER.DISABLE ('GATHER_STATS_JOB')
Enable automatic statistics command, exec DBMS_SCHEDULER.ENABLE ('GATHER_STATS_JOB')
Check to see if automatic statistics are on
Oracle 10g: SELECT OWNER,JOB_NAME,ENABLED FROM DBA_SCHEDULER_JOBS WHERE JOB_NAME = 'GATHER_STATS_JOB'
Oracle 11g: select t1.owner, t1.job_name, t1.enabled from dba_scheduler_jobs T1 where t1.job_name = 'BSLN_MAINTAIN_STATS_JOB'
Collect statistics manually
Statistics for the table
Contains the number of rows in the table, number of blocks used, number of empty blocks, block utilization, number of row migrations and links, pctfree,pctused data, average row size:
Select num_rows, blocks, empty_blocks, avg_space, chain_cnt, avg_row_len from user_tables
NUM_ROWS-- the number of records in the table
BLOCKS,-- the number of blocks occupied by data in the table
EMPTY_BLOCKS-- the number of empty blocks in the table
AVG_SPACE,-- average usage space in the data block
CHAIN_CNT,-the number of row joins and row migrations in the table
AVG_ROW_LEN-the average length of each record
Statistics for index columns
Contains the depth of the index (B-Tree level), the number of blocks at the index leaf level, the cluster factor (clustering_factor), and the number of unique values.
Select blevel, leaf_blocks, distinct_keys, avg_leaf_blocks_per_key
Avg_data_blocks_per_key, clustering_factor from user_indexes
BLEVEL,-- the number of layers of the index
LEAF_BLOCKS-- the number of leaf nodes
DISTINCT_KEYS,-the number of unique values
AVG_LEAF_BLOCKS_PER_KEY,-- average number of leaf blocks per KEY
AVG_DATA_BLOCKS_PER_KEY,-- average number of blocks per KEY
CLUSTERING_FACTOR-clustering factor
Statistics for the column
Contains unique number of values, column maximum size, density (selection rate), data distribution (histogram information), number of NUL values
Select num_distinct,low_value, high_value
Density, num_nulls, num_buckets, histogram from user_tab_columns
NUM_DISTINCT,-the number of unique values
LOW_VALUE,-- the minimum value on the column
HIGH_VALUE,-- the maximum value on the column
DENSITY,-- selection factor (density)
NUM_NULLS,-the number of null values
NUM_BUCKETS,-- the number of BUCKET in the histogram
HISTOGRAM-the type of histogram
For the collection of statistical information, I would like to talk about some personal understanding:
1. Statistics are stored in the data dictionary table by default, and only the statistics in the data dictionary affect CBO.
The CREATE_STAT_TABLE process provided by 2.DBMS_STATS is only to generate a user-defined table in a specific format to store statistics, and the statistics in this table will not affect the statistics.
In the process of 3.GATHER series, if the stattab,statid,statown parameter is specified (or may not be specified), the collected statistics are not only updated to the data dictionary, but also stored in the stattab table under the statown user, marked as statid.
In the process of 4.EXPORT and IMPORT series, the stattab,statid,statown parameter cannot be empty, which means exporting the current statistics in the data dictionary to the user-defined table and importing the statistics in the user table into the data dictionary. It is obvious that the import operation here and the above GATHER operation will change the statistics and may cause changes in the execution plan, so be careful.
5. It is necessary to back up the old statistics before each collection of statistics, especially to keep one or more copies of the statistics of the system during the stable period.
6. How often to collect statistical information is important for how to back up and retain statistical information, how to select appropriate sampling, parallelism, histogram setting and so on. It is necessary to design a better statistical information collection strategy.
3. Collection of statistical information and common data dictionaries
3.1 Statistical Information Common data Dictionary
Statistics collect the following data:
(1) the analysis of the table itself: including the number of rows in the table, the number of data blocks, the president and other information.
(2) column analysis: including the repeated number of column values, the null values on the column, and the distribution of data on the column.
(3) Analysis of the index: including the number of leaf blocks of the index, the depth of the index, the aggregation factor of the index and so on.
These statistics are stored in the following data dictionary:
DBA_TABLES
DBA_OBJECT_TABLES
DBA_TAB_STATISTICS
DBA_TAB_COL_STATISTICS
DBA_TAB_HISTOGRAMS
DBA_INDEXES
DBA_IND_STATISTICS
DBA_CLUSTERS
DBA_TAB_PARTITIONS
DBA_TAB_SUBPARTITIONS
DBA_IND_PARTITIONS
DBA_IND_SUBPARTITIONS
DBA_PART_COL_STATISTICS
DBA_PART_HISTOGRAMS
DBA_SUBPART_COL_STATISTICS
DBA_SUBPART_HISTOGRAMS
3.2 Statistics for OLAP and OLTP
In the OLTP system, statistics must be collected and the JOB that comes with the database must be closed and the statistics collection policy must be customized. This is because the data in the OLTP system changes frequently and the business is very complex. Let me give you an example: the original estimate returns 1W rows, which causes the CBO estimate to return 1 row.
Many OLAP systems do not collect, directly use HINT fixed, because the OLAP system I get O pressure, the table is also very large, the collection of statistical information is very resource-consuming and the data of OLAP system tables are changing at any time (because real-time storage), so it is necessary to collect statistics frequently, so it is not realistic for OLAP system to collect statistics.
4. Dynamic sampling of statistical information
After Oracle 10g, if a table is not analyzed, the database will automatically do dynamic sampling analysis on it, so here we use hint to set the level of dynamic sampling to 0, that is, do not use dynamic sampling.
According to different optimizer_dynamic_sampling levels, he has a total of 10 levels, which are as follows:
Level 0: Do not use dynamic sampling.
Level 1: The number of blocks sampled is the default number of dynamic sampling blocks (32).
Level 2: The number of blocks sampled is 2 times the default number of dynamic sampling blocks.
Level 3: The number of blocks sampled is 4 times the default number of dynamic sampling blocks.
Level 4: The number of blocks sampled is 8 times the default number of dynamic sampling blocks.
Level 5: The number of blocks sampled is 16 times the default number of dynamic sampling blocks.
Level 6: The number of blocks sampled is 32 times the default number of dynamic sampling blocks.
Level 7: The number of blocks sampled is 64 times the default number of dynamic sampling blocks.
Level 8: The number of blocks sampled is 128 times the default number of dynamic sampling blocks.
Level 9: The number of blocks sampled is 256 times the default number of dynamic sampling blocks.
Level 10: Read all blocks in the table
Use dynamic sampling method to collect statistics for temporary tables in a system
Select / * + dynamic_sampling (t 0) * / * from t where object_id > 30
Http://www.cnblogs.com/kerrycode/p/3854971.html
Experiment
For the newly created table, oracle dynamically collects information about the table when it is accessed, and then collects it into the data dictionary at 10:00 in the evening.
[html] view plain copy
SQL > set autotrace offSQL > set linesize 1000SQL > drop table t_sample purge;drop table t_sample purge*ERROR at line 1:ORA-00942: table or view does not existSQL > create table t_sample as select * from dba_objects;Table created.SQL > create index idx_t_sample_objid on t_sample (object_id); Index created.
No statistics were found for the newly created table
[html] view plain copy
SQL > select num_rows, blocks, last_analyzed from user_tables where table_name = 'Tunable SAMPLESTION NUMPLESTROWS BLOCKS LAST_ANAL
[html] view plain copy
SQL > set autotrace traceonlySQL > set linesize 1000SQL > select * from t_sample where object_id=20 Execution Plan---Plan hash value: 1453182238Murray- -| Id | Operation | Name | Rows | Bytes | Cost (% CPU) | Time |- -| 0 | SELECT STATEMENT | | 1 | 20 | 2 (0) | 00:00:01 | | 1 | TABLE ACCESS BY INDEX ROWID | T_SAMPLE | 1 | 207 | 2 (0) | 00:00:01 | | * 2 | INDEX RANGE SCAN | IDX_T_SAMPLE_OBJID | 1 | 1 (0) | 00:00:01 |-- -Predicate Information (identified by operation id):- -2-access ("OBJECT_ID" = 20) Note- dynamic sampling used for this statement (level=2) Statistics -24 recursive calls0 db block gets93 consistent gets1 physical reads0 redo size1608 bytes sent via SQL*Net to client523 bytes received via SQL*Net from client2 SQL*Net roundtrips to/from client0 sorts (memory) 0 sorts (disk) 1 rows processed
-dynamic sampling used for this statement (level=2) represents dynamic sampling, but does not record data dictionaries unless you manually collect statistics for the table.
[html] view plain copy
SQL > select num_rows, blocks, last_analyzed from user_tables where table_name = 'Tunable SAMPLESTION NUMPLESTROWS BLOCKS LAST_ANAL- SQL >
Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.
Views: 0
*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.
Continue with the installation of the previous hadoop.First, install zookooper1. Decompress zookoope
"Every 5-10 years, there's a rare product, a really special, very unusual product that's the most un
Beginfor i in 1.. 10000loopinsert into tcustmer values (I, 'xixixi','','aa'); end loop;commi
© 2024 shulou.com SLNews company. All rights reserved.