In addition to Weibo, there is also WeChat
Please pay attention
WeChat public account
Shulou
2025-03-28 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Database >
Share
Shulou(Shulou.com)05/31 Report--
This article mainly explains "what are the hidden dangers of collecting statistical database information". The content of the explanation in the article is simple and clear, and it is easy to learn and understand. Please follow the editor's train of thought. Let's study and learn what are the hidden dangers of collecting statistical database information.
Collecting statistics causes SQL to produce hard parsing
In most cases, the incorrect statistical information of the table leads to the incorrect calculation of the execution plan by the optimizer, so the statistical information of the table needs to be corrected.
So that the optimizer can reselect the exact execution plan.
When performing SQL optimization, we analyze whether the execution plan is caused by incorrect statistical information by looking at the execution plan, the statistics of the table and the specific conditions of the table.
If there is a problem, when it is determined that it is a problem of statistical information, we should not collect statistical information blindly, otherwise it will bring hidden dangers to the database.
Collect statistical information and bring hidden dangers to the database:
1. For tables that re-collect statistics, some SQL may need to reparse to generate execution plans.
2. For some SQL of tables that re-collect statistics, there may be situations where the statistics are collected, but the execution plan is worse.
3. Collecting statistical information will require additional resource overhead, which will affect the performance of the database during peak business periods.
Verify with a test
Collecting statistics causes SQL to produce hard parsing.
1. Create a test table
SQL > drop table demo purge;Table dropped.SQL > create table demo as select * from dba_objects;Table created.
2. Create an index on the owner column
SQL > create index idx_owner_demo on demo (owner); Index created.
3. Collect the statistics of the table and collect the histogram information of the owner column:
Begin dbms_stats.gather_table_stats (ownname = > 'DEMO', tabname = >' DEMO', estimate_percent = > 100, method_opt = > 'for columns owner size skewonly', no_invalidate = > false Degree = > 1, cascade = > true) End;/
4. View the execution plan of SQL
Check the data where owner is demo and sys (mainly to view the execution plan under different where conditions):
SELECT (SELECT COUNT (*) FROM DEMO) CNT, OWNER, COUNT (*) FROM DEMOWHERE OWNER IN ('DEMO',' SYS') GROUP BY OWNER CNT OWNER COUNT (*)-87069 DEMO 44 87069 SYS 37815
There are 87096 rows of records in table demo, of which 44 rows are recorded with owner as demo and 37815 rows with owner as sys.
5. In order to test the effect, refresh shared pool (do not use it except for testing)
SQL > alter system flush shared_pool;System altered.
6. View the execution plan of the following SQL:
SQL > set autot traceSQL > select / * demo * / * from demo where owner = 'DEMO' 44 rows selected.Execution Plan---Plan hash value: 3014608035 Murray- -| Id | Operation | Name | Rows | Bytes | Cost (% CPU) | Time |-- -| 0 | SELECT STATEMENT | | 44 | 4312 | 3 (0) | 00:00:01 | | 1 | TABLE ACCESS BY INDEX ROWID | DEMO | 44 | 4312 | 3 (0) | 00 | : 00:01 | | * 2 | INDEX RANGE SCAN | IDX_OWNER_DEMO | 44 | | 1 (0) | 00:00:01 |- -Predicate Information (identified by operation id):-2-access ("OWNER" = 'DEMO')
As you can see from the execution plan, the index range scan is used and the cost is 3.
The statistics are correct and the owner column collects histogram information, so the optimizer generates the correct execution plan based on the statistics
Since owner='DEMO' has only 44 rows of records, index range scanning has the lowest cost when returning these 44 records.
SQL > select / * sys * / * from demo where owner = 'SYS' 37815 rows selected.Execution Plan---Plan hash value: 4000794843 Murray- -| Id | Operation | Name | Rows | Bytes | Cost (% CPU) | Time |- -| 0 | SELECT STATEMENT | | 37815 | 3619K | 347 (1) | 00:00:05 | | * 1 | TABLE ACCESS FULL | DEMO | 37815 | 3619K | 347 (1) | 00:00:05 |- -Predicate Information (identified by operation id):-1-filter ("OWNER" = 'SYS')
As you can see from the execution plan, a full table scan is used with a cost of 347.
Because owner='SYS' has 37815 rows of records, the cost of using a full table scan is the lowest when returning these 37815 records.
7. View the information of SQL:
SQL > select sql_id,sql_text,child_number,plan_hash_value,parse_calls,loads from v$sql where sql_text like'% / * demo * /%' SQL_ID SQL_TEXT CHILD_NUMBER PLAN_HASH_VALUE PARSE_CALLS LOADS -45skkr08bw1m8 select / * demo * / * from demo where owner = 'DEMO' 0 3014608035 1 1
At this point, the plan_hash_value of the SQL's current execution plan is 3014608035, which is hard parsed once (loads represents the number of hard parses).
8. Update the data in the table, but do not collect statistics:
SQL > update demo set owner = 'DEMO' where object_id
< 60000;59659 rows updatedSQL>Commit;Commit complete
Once again, make a query:
SQL > select / * demo * / * from demo where owner = 'DEMO' 59703 rows selected.Execution Plan---Plan hash value: 3014608035 Murray- -| Id | Operation | Name | Rows | Bytes | Cost (% CPU) | Time |-- -| 0 | SELECT STATEMENT | | 44 | 4312 | 3 (0) | 00:00:01 | | 1 | TABLE ACCESS BY INDEX ROWID | DEMO | 44 | 4312 | 3 (0) | 00 | : 00:01 | | * 2 | INDEX RANGE SCAN | IDX_OWNER_DEMO | 44 | | 1 (0) | 00:00:01 |- -Predicate Information (identified by operation id):-2-access ("OWNER" = 'DEMO')
As can be seen from the execution plan, the index range scan is used and the cost is 3.
The execution plan at this time is wrong, and the number of rows returned is 59659, which is not suitable for scanning in the index range, but should use a full table scan.
However, because the statistics are not updated, the optimizer still thinks that the data in the table is from the previous statistics, so the previous execution plan is extended.
9. Check the statistics
SELECT OWNER, TABLE_NAME, OBJECT_TYPE, STALE_STATS, TO_CHAR (LAST_ANALYZED, 'yyyy-mm-dd hh34:mi:ss') LAST_ANALYZEDFROM DBA_TAB_STATISTICSWHERE OWNER =' DEMO' AND TABLE_NAME = 'DEMO' OWNER TABLE_NAME OBJECT_TYPE STA LAST_ANALYZED -DEMO DEMO TABLE NO 2020-05-12 10:57:46
At this time, the data change of the table has exceeded 10% of the data volume of the table, and the table demo should be recorded in DBA_TAB_STATISTICS.
And change the value of the STALE_STATS column to yes.
(the value of the STALE_ stats column represents the statistics, yes indicates that the statistics are out of date, and no indicates that the statistics are not expired)
Because the changes in the data of the table are not refreshed in time (the default is once every 15 minutes), the information in the DBA_TAB_STATISTICS view is not updated either.
Using manual refresh of database monitoring
SQL > exec dbms_stats.flush_database_monitoring_info;PL/SQL procedure successfully completed.
Then look at the statistics of the table again:
OWNER TABLE_NAME OBJECT_TYPE STA LAST_ANALYZED -DEMO DEMO TABLE YES 2020-05-12 10:57:46
The value of column STALE_STATS has changed to yes, indicating that the statistics for table demo are out of date and need to be re-collected.
10. Re-collect statistics:
Begin dbms_stats.gather_table_stats (ownname = > 'DEMO', tabname = >' DEMO', estimate_percent = > 100, method_opt = > 'for columns owner size skewonly', no_invalidate = > false Degree = > 1, cascade = > true) End;/
View the execution plan of SQL
Take a look at the data with owner of demo and sys:
SELECT (SELECT COUNT (*) FROM DEMO) CNT, OWNER, COUNT (*) FROM DEMOWHERE OWNER IN ('DEMO',' SYS') GROUP BY OWNER CNT OWNER COUNT (*)-87069 DEMO 59703 87069 SYS 5486
There are 87096 rows of records in table demo, of which 59703 rows are recorded with owner as demo and 5486 rows with owner as sys.
After collecting the statistics, review the execution plan again:
SQL > select / * demo * / * from demo where owner = 'DEMO' 59703 rows selected.Execution Plan---Plan hash value: 4000794843 Murray- -| Id | Operation | Name | Rows | Bytes | Cost (% CPU) | Time |- -| 0 | SELECT STATEMENT | | 59703 | 5713K | 347 (1) | 00:00:05 | | * 1 | TABLE ACCESS FULL | DEMO | 59703 | 5713K | 347 (1) | 00:00:05 |- -Predicate Information (identified by operation id):-1-filter ("OWNER" = 'DEMO')
As you can see from the execution plan, a full table scan is used with a cost of 347.
The execution plan at this time is correct, with 59703 rows returned, and the index should not be used at this time, but a full table scan should be used.
12. View the information of SQL:
SQL > select sql_id,sql_text,child_number,plan_hash_value,parse_calls,loads from v$sql where sql_id = '45skkr08bw1m8' SQL_ID SQL_TEXT CHILD_NUMBER PLAN_HASH_VALUE PARSE_CALLS LOADS -45skkr08bw1m8 select / * demo * / * from demo where owner = 'DEMO' 0 4000794843 1 2
Looking at the information of SQL, it is found that loads has become 2, indicating that a hard parsing has been added, that is, under normal circumstances, if the statistical information of the table is collected, then for some SQL, hard parsing will occur. For the production database, if statistical information is collected blindly, a large number of hard parsing will be generated, putting pressure on the database.
Thank you for your reading. the above is the content of "what are the hidden dangers of collecting statistical database information". After the study of this article, I believe you have a deeper understanding of the hidden dangers of collecting statistical database information. the specific use of the situation also needs to be verified in practice. Here is, the editor will push for you more related knowledge points of the article, welcome to follow!
Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.
Views: 0
*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.
Continue with the installation of the previous hadoop.First, install zookooper1. Decompress zookoope
"Every 5-10 years, there's a rare product, a really special, very unusual product that's the most un
© 2024 shulou.com SLNews company. All rights reserved.