Network Security Internet Technology Development Database Servers Mobile Phone Android Software Apple Software Computer Software News IT Information

In addition to Weibo, there is also WeChat

Please pay attention

WeChat public account

Shulou

An example of solving the problem of fast and sometimes slow when the application executes periodically

2025-01-27 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Database >

Share

Shulou(Shulou.com)06/01 Report--

When doing Oracle DBA, we often encounter some performance problems. Some performance problems are slow from the beginning, some performance problems are gradually slow, some performance problems are suddenly slow, and some performance problems are sometimes fast and sometimes slow. I don't know which kind of performance problems other colleagues think is easier to deal with. Today I share a performance problem periodic problem sometimes fast and sometimes slow, for summary and reflection. Please do not hesitate to correct any mistakes.

Recently, an application operation and maintenance colleague sent an email asking for assistance, reflecting that there is an application system that is periodically fast and sometimes slow during batch operation, and is very regular. Batch calculations from Monday to Friday nights take a long time, 6-8 hours, while Saturday and Sunday nights take about 15 minutes. Seeing the phenomenon described in the email, experienced drivers may suddenly think of where the problem might be. But for lack of experience, it is difficult for me to solve this problem.

So I pulled a WeChat group and asked about the specific time of running fast and slow in the group. One more thing to mention here, before I took over this case, a colleague also looked into this problem. He did some analysis based on awr, from dbtime to redo log generation, and then to transactions. He also made a line chart and posted a bunch in the group, but he just didn't do any detailed inquiry and analysis. In the end, there was no conclusion.

The truth is that after getting some detailed information from developers and application operators, start with awr. Take the awr report of the fast time and the slow time interval respectively. With my ability, I don't see what's wrong with the overall database load from the indicators. But in the part of top SQL, it is found that the TOP SQL of the two time periods is indeed different. In the slow period, there is a update statement 73320 times, which takes 9869.7s, but in the fast period, the update statement is not shown in the TOP SQL. Is that the problem?

Then ask the developer colleague whether the update statement is a statement executed during the batch. After getting an affirmative answer, I began to wonder if the logic of running batches on weekdays and weekends was different, but the information in awr negated my guess. It turned out that this update sentence also ran on weekends, but not very fast, running 75331 times, taking 24.07s. From the current clues, there is no doubt that the execution efficiency of update statements is the reason why programs are sometimes fast and sometimes slow. But why this phenomenon occurs is still unknown.

Problem sql positioning, the rest of the problem is simple, optimize the sql, each batch is completed in 24s, then the problem will be solved naturally. Now start looking for reasons why sql execution is fast and sometimes slow. Here we introduce a powerful tool, awrsqrpt, which can get the historical execution plan of sql statements recorded in awr. Use awrsqrpt to get the execution plan of sql for two time periods, as shown in the following two figures:

As can be seen from the execution plan, all go to the primary key of the table and choose different methods, in which the average single execution time of INDEX RANGE SCAN is 384.3ms, while the average single execution time of INDEX SKIP SCAN is 0.3ms, which is more than 3000 times different. no wonder the difference in batch execution time is so great. Seeing this, the first solution reflected in my mind is to execute the plan regularly and force INDEX SKIP SCAN to go. But this method treats the symptoms rather than the root of the problem. It can only be used in cases where the cause of the problem cannot be found, or in case of emergency.

Continue to communicate with the developer and learn that the tables in the update statement will be dropped by truncate after batch every day, which is a very important information. Is it because of this truncate operation that the execution efficiency of sql statements varies so much? Let's move on.

As we all know, the execution plan of Oracle is based on the statistics on the table through CBO, and estimates the execution plan that Oracle considers to be the best. That is, both INDEX RANGE SCAN and SKIP SCAN,Oracle think it is the best. Is there something wrong with the Oracle optimizer? Probably not. If the optimizer were so prone to problems, Oracle wouldn't be dominant in commercial databases for so long. Then think of the statistics on the table, and get the historical statistics on the table by querying the view sys.wri$_optstat_tab_history as shown below:

As you can see from the figure above, the statistics on the table are sometimes 0 and sometimes very large. It seems that the statistics cause CBO not to choose the optimal execution plan it should choose when choosing the execution plan.

From the above analysis, the root cause of the problem should be found. After each batch completion, the truncate operation will be done. By default, the database will automatically collect table statistics every day, starting at 22:00 from Monday to Friday and 6:00 on weekends. As a result, the statistics collected by the database are different at different points in time, which in turn causes the optimizer to choose a slow execution plan based on the statistics.

When the reason is found, we begin to discuss the solution, and here are several ways to solve the problem:

1. After importing the data in batches, collect the statistical information of the batch scale.

2. Statistical information of lock batch scale when data are available.

3. Change the truncate operation to pre-batch execution (proposed by the developer)

4. Fixed problem sql implementation plan (may not solve the problem)

Reflection:

1. When we solve the problem, we should not only analyze it from the overall level of the database. Sometimes it may be a performance problem caused by a sql whose performance is not very poor.

2. more communication, detailed communication, and mastering as many information points as possible will help to solve the problem.

3. The cause of the problem can also be guessed from the slow-performing sql. The cost evaluated by Oracle is 0.

The above is a solution process of the whole case, welcome to correct.

Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.

Views: 0

*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.

Share To

Database

Wechat

© 2024 shulou.com SLNews company. All rights reserved.

12
Report