In addition to Weibo, there is also WeChat
Please pay attention
WeChat public account
Shulou
2025-01-16 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Database >
Share
Shulou(Shulou.com)06/01 Report--
The statistics in the Oracle database are a set of data that are stored in the data dictionary and describe the details of the objects in the Oracle database from multiple dimensions. CBO uses these statistics to calculate the cost of various possible and different execution paths of the target SQL, and selects the execution path with the lowest cost as the execution plan of the target SQL.
The statistics in the Oracle database can be divided into the following six types:
Statistics for the table
Statistics for the index
Statistics for the column
System statistics
Data dictionary statistics
Internal object statistics
The statistics of the table are used to describe the details of the table in the Oracle database. It contains some typical dimensions, such as the number of records, the number of table blocks (data blocks in the table), the average row length, and so on.
The statistics of the index describe the details of the index in the Oracle database, which contains some typical dimensions, such as the level of the index, the number of leaf blocks, clustering factors and so on.
Column statistics describe the details of the columns in the Oracle database, including some typical dimensions, such as the number of distinct values of the column, the number of null values of the column, the minimum value of the column, the maximum value of the column, and the histogram.
The system statistics describe the processing power of the database server where the Oracle database is located, which includes two dimensions: CPU and Ipico. With the help of system statistics, Oracle can know more clearly the actual processing capacity of the target database server.
Data dictionary statistics are used for data dictionary base tables (such as TAB$, IND$, etc.), indexes on data dictionary base tables, and column details of these data dictionaries in hot core Oracle databases. There is no essential difference between the statistics describing the above data dictionary base tables and those describing ordinary tables, indexes and columns.
Internal object statistics are used to describe the details of some internal tables in the Oracle database (such as X$ series tables). Its dimensions are similar to those of regular tables, except that the number of table blocks is 0, because X$ system tables are actually just Oracle custom memory structures and do not take up actual physical storage space.
1. Collect statistical information
In Oracle databases, there are usually two ways to collect statistics: one is to use the ANALYZE command, and the other is to use the DBMS_STATS package. Statistics for tables, indexes, columns, and data dictionaries can be collected using ANALYZE commands or DBMS_STATS packages, but system statistics and internal object statistics can only be collected using DBMS_STATS packages.
If you use the ANALYZE command to collect statistics on the internal table of the system, an error ORA-02030 will be reported.
1.1 collect statistics with the ANALYZE command
Starting with Oracle7, the ANALYZE command can be used to collect statistics for tables, indexes, columns, and system statistics.
Typical uses are as follows:
Zx@ORCL > create table T2 as select * from dba_objects;Table created.zx@ORCL > create index idx_t2 on T2 (object_id); Index created.zx@ORCL > analyze index idx_t2 delete statistics;Index analyzed.
Starting from Oracle 10g, how will Oracle collect the statistics of the target index after the index is created? for the purpose of demonstration, delete the statistics of index IDX_T2:
Execute the sosi script, and from the output, you can see that table T2, the columns of table T2 and the index IDX_T2 have no relevant statistical information.
Zx@ORCL > select count (*) from T2; COUNT (*)-86852
Only statistics are collected for table T2, and in estimation mode, the sampling ratio is 15%:
Zx@ORCL > analyze table T2 estimate statistics sample 15 percent for table;Table analyzed.
When you execute the sosi script again, you can see that only table T2 has statistics, and there are no relevant statistics for the columns and index IDX_T2 of table T2. And because the estimation model is used, the estimated results do not necessarily match the actual results, for example, the actual number of Table T2 is not consistent with the estimated quantity.
Collect statistics on table T2 only, and in calculation mode:
Zx@ORCL > analyze table T2 compute statistics for table;Table analyzed.
When you execute the sosi script again, you can see that only table T2 has statistics, and there are no relevant statistics for the columns and index IDX_T2 of table T2. And because the computing mode is used, the computational mode scans all the data of the target object, so the statistical results match the actual results.
After collecting statistics on table T2, now collect statistics in computational mode on the columns OBJECT_NAME and OBJECT_ID of table T2:
Zx@ORCL > analyze table T2 compute statistics for columns object_name,object_id;Table analyzed.
When you execute the sosi script again, you can see that the columns OBJECT_NAME and OBJECT_ID do have statistics.
Note: in Cui Hua's book "SQL Optimization based on Oracle", it is mentioned that the original statistics of T2 have been erased, that is to say, for the same object, the newly executed ANALYZE command will erase the previous ANALYZE results. But the result of my actual implementation is that the original statistics of Table T2 have not been erased. The environments I use are 10.2.0.4 and 11.2.0.4, and there is no 11.2.0.1 environment for the time being.
You can use the following command to collect statistics on table T2 and columns OBJECT_NAME and OBJECT_ID at the same time in calculation mode:
Zx@ORCL > analyze table T2 compute statistics for table for columns object_name,object_id;Table analyzed.
Execute the sosi script again, and you can see that there are statistics on table T2 and columns OBJECT_NAME and OBJECT_ID.
Use the following command to collect statistics for index IDX_T2 in calculation mode
Zx@ORCL > analyze index idx_t2 compute statistics;Index analyzed.
Execute the sosi script again, and you can see from the output that the index IDX_T2 now has statistics, and the previously collected statistics on table T2 and columns OBJECT_NAME and OBJECT_ID have not been erased, because the ANALYZE command we just executed and the ANALYZE command we executed earlier are not targeted at the same object.
Use the following command to delete statistics for table T2, all columns of table T2, and all indexes of table T2:
Zx@ORCL > analyze table T2 delete statistics;Table analyzed.
Execute the sosi script again, and you can see from the output that the statistics just collected for table T2, the columns OBJECT_NAME, OBJECT_ID, and index IDX_T2 of table T2 have been deleted.
If you want to collect statistics for table T2, all columns of table T2, and all indexes on table T2 at once in computational mode, execute the following statement:
Zx@ORCL > analyze table T2 compute statistics;Table analyzed.
Execute the sosi script again, and you can see from the output that you now have statistics for table T2, all the columns of table T2, and index IDX_T2.
1.2 collect statistics with DBMS_STATS packages
Since Oracle 8.1.5, DBMS_STATS packages have been widely used to collect statistics, and collecting statistics with DMBS_STATS packages is also officially recommended by Oracle. In terms of collecting the statistics required by CBO, the DBMS_STATS package can be simply understood as an additional version of the ANALYZE command.
The following four stored procedures are most commonly used in the DBMS_STATS package:
GATHER_TABLE_STATS: used to collect statistics on the target table, columns of the target table, and indexes on the target table.
GATHER_INDEX_STATS: used to collect statistics for the specified index.
GATHER_SCHEMA_STATS: used to collect statistics for all objects under the specified schema.
GATHER_DATABASE_STATS: used to collect statistics for all objects in the library.
Now let's introduce the common use of the DBMS_STATS package when collecting statistics, or for the above test table T2, where the DBMS_STATS package is used to achieve exactly the same effect as the ANALYZE command.
Delete all statistics on table T2 first
Analyze table t2 delete statistics
Statistics are collected only for Table T2, and in estimation mode, the same proportion is 15%:
Zx@ORCL > exec dbms_stats.gather_table_stats (ownname= > 'ZX',tabname= >' T _ 2), PL/SQL procedure successfully completed.
Execute the sosi script, and you can see from the output that only table T2 has statistics, and there are no statistics for the columns and index IDX_T2 of table T2. And because of the estimation model used, the estimated results do not necessarily match the actual results.
It is important to note that the version of the Oracle database here is 11.2.0.4, and when we call DMBS_STATS.GATHER_TABLE_STATS, we specify the value of the parameter METHOD_OPT as' FOR TABLE', which means that only the statistics of table T2 are collected. This method of collecting table statistics does not apply to all versions of the Oracle database. For example, this method does not apply to Oracle10.2.0.4 and Oracle10.2.0.5, where even if you specify 'FOR TABLE',Oracle, it collects statistics for all columns in addition to collecting table statistics.
If statistics are collected in table T2 and in computational mode, the method implemented with the DBMS_STATS package is to set the sampling ratio of the estimated pattern (that is, the parameter ESTIMATE_PERCENT) to 100% or NULL
Exec dbms_stats.gather_table_stats (ownname= > 'ZX',tabname= >' T _ 2 'FOR TABLE',cascade= > false)
Exec dbms_stats.gather_table_stats (ownname= > 'ZX',tabname= >' T _ 2 'FOR TABLE',cascade= > false)
Zx@ORCL > exec dbms_stats.gather_table_stats (ownname= > 'ZX',tabname= >' T _ 2), PL/SQL procedure successfully completed.
Execute the sosi script, and you can see from the output that there are only statistics for table T2, and there are no relevant statistics for the columns and index IDX_T2 of table T2. And because the computing mode is used, the computational mode scans all the data of the target object, so the statistical results match the actual results.
After collecting statistics on table T2, let's collect statistics in computational mode on the columns OBJECT_NAME and OBJECT_ID of table T2 (no histograms are collected):
Zx@ORCL > exec dbms_stats.gather_table_stats (ownname= > 'ZX',tabname= >' T _ 2 object_name,object_id',cascade= object_name,object_id',cascade= > false); PL/SQL procedure successfully completed.
Execute the sosi script, and you can see from the output that statistics are now available on the columns OBJECT_NAME and OBJECT_ID of table T2, and Oracle also collects statistics on table T2 (note that this is different from the ANALYZE command).
Use the following command to collect statistics for index IDX_T2 in calculation mode
Zx@ORCL > exec dbms_stats.gather_index_stats (ownname= > 'ZX',indname= >' IDX_T2',estimate_percent= > 100); PL/SQL procedure successfully completed.
Execute the sosi script, and you can see from the output that the index IDX_T2 now has statistics.
Use the following command to delete statistics for table T2, all columns of table T2, and all indexes of table T2:
Zx@ORCL > exec dbms_stats.delete_table_stats (ownname= > 'ZX',tabname= >' T2'); PL/SQL procedure successfully completed.
Execute the sosi script, and you can see from the output that the statistics for table T2, all columns of table T2, and all indexes of table T2 have been deleted.
If you want to collect statistics for table T2, all columns of table T2 and all indexes of table T2 at once in computational mode, execute the following statement
Zx@ORCL > exec dbms_stats.gather_table_stats (ownname= > 'ZX',tabname= >' T2 parallel cascade100 > true); PL/SQL procedure successfully completed.
1.3 the difference between ANALYZE and DBMS_STATS
As you can see from the above demonstration, both the ANALYZE command and the DBMS_STATS package can be used to collect statistics for tables, indexes, and columns. It seems that they have exactly the same effect in collecting statistics. Why would Oracle recommend using the DBMS_STATS package to collect statistics?
Because the ANALYZE command has the following defects compared with the DMBS_STATS package:
The ANALYZE command does not correctly collect statistics for partitioned tables, while the DBMS_STATS package does. The ANALYZE command only collects statistics for the lowest-level objects, and then deduces and aggregates higher-level statistics. For example, for a partition table with sub-partitions, it only collects sub-partition statistics first, and then summarizes them to derive partition or table-level statistics. Some statistics can be summarized from the next level of the current object, such as the total number of rows of the table, which can be obtained by adding the number of rows of each partition. However, some statistics are not available from lower-level objects, such as the number of distinct values on the column, NUM_DISTINCT, DESNSITY, and so on.
The ANALYZE command cannot collect statistics in parallel, while the DBMS_STATS package can. Collecting statistics in parallel is a very useful feature for tables with large amounts of data. For tables with a large amount of data, if statistics cannot be collected in parallel, it means that if you want to accurately collect statistics for the target object, it may take a very long time, which may be unacceptable. In the Oracle database, DBMS_STATS package collection statistics can be performed in parallel, which to some extent alleviates a series of problems caused by the long collection of statistics on large tables.
The parallel collection of DBMS_STATS packages is achieved by manually specifying the input parameter DEGREE, such as collecting statistics on table T1 and specifying a parallelism of 4:
Exec dbms_stats.gahter_table_stats (ownname= > 'SCOTT',tabname= >' T1 magic cascade = > true,estimate_percent= > 100fui de degree = > 4)
Of course, the DBMS_STATS package is not perfect, compared with the ANALYZE command, its disadvantage is that the DBMS_STATS package can only collect statistics related to CBO, and some additional information that has nothing to do with CBO, such as the number of row migrations / row links (CHAIN_CNT), the structure of the validation table and index, and so on, the DBMS_STATS package is powerless. The ANALYZE command can be used to analyze and collect the above additional information, such as analyze table xxx list chained rows intoyyy can be used to analyze and collect the number of row migrations / row links, and analyzeindex xxx validate structure can be used to analyze the structure of the index.
2. View statistics
I described how to collect statistics, so how do I view them? The statistics of the Oracle database will be stored in the data dictionary, we only need to query the relevant data dictionary. If we have plenty of time, it's okay to write SQL to query the statistics in the data dictionary, but when we really encounter SQL with performance problems, we usually hope to collect all kinds of statistical information related to the target SQL in the first time, so as to locate the problem in the first time, it is too late to write SQL to query the data dictionary. So we need to prepare a general script for querying statistics in advance, and when something goes wrong, we only need to run the script to get all the statistics of the target object in the first place.
Sosi script (Show OptimizerStatistics Information) is such a script, domestic Oracle database experts have been using this script, it comes from the article on MOS: SCRIPT-Select to show OptimizerStatistics for CBO (document ID 31412.1), the use is very simple, just run the sosi script and specify the name of the table to view statistics. It supports partitioned tables and the display is divided into three parts, namely, table-level statistics, partition-level statistics, and sub-partition-level statistics. The same script was used in the previous experiment.
Attached is the sosi script that can be downloaded and used.
Refer to "SQL Optimization based on Oracle"
Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.
Views: 0
*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.
Continue with the installation of the previous hadoop.First, install zookooper1. Decompress zookoope
"Every 5-10 years, there's a rare product, a really special, very unusual product that's the most un
© 2024 shulou.com SLNews company. All rights reserved.