2025-04-05 Update From: SLTechnology News & Howtos > Development
Shulou (Shulou.com) 06/02 Report --
This article analyzes a real case in which a table variable caused SQL execution to slow down dramatically. The approach described is simple, fast, and practical, so if you are interested, read on.
Scenario
In recent work, a synchronization JOB frequently threw SQL execution timeouts. The logs showed that the execution time of each SQL statement grew linearly with every loop iteration; after about 50 iterations, a single execution took more than 5 minutes.
Analyzing the JOB execution process
First, analyze the JOB's workflow to see whether the problem lies in the JOB design itself.
The analysis shows that only about 10,000 rows need to be synchronized on each run, so the timeouts are not caused by writing a large volume of data.
Next, the step that fetches detail data is examined. The largest of the joined tables has grown to hundreds of millions of rows, which could explain slow SQL execution overall. This is suspicious point one.
Moving on to the next step: when deduplicating against Table B, Table B accumulates more data with every loop iteration. Could this be the main reason the execution time grows linearly as the loop runs? This is suspicious point two.
Ruling out the suspects one by one
Having identified two suspicious points from the JOB's execution flow, we now analyze the SQL itself in detail.
The JOB's core SQL (with the article's placeholder table and column names) stages the data in a table variable and then writes it out:

```sql
DECLARE @TableTemp TABLE (FieldA int NULL, FieldB int NULL, FieldC int NULL)

INSERT INTO @TableTemp (FieldA, FieldB)
SELECT a.FieldA, a.FieldB
FROM ServerA.dbo.TableB a WITH (NOLOCK)
LEFT JOIN dbo.TableA b WITH (NOLOCK) ON a.Id = b.Id

UPDATE a
SET a.FieldC = b.FieldD
FROM @TableTemp a
LEFT JOIN dbo.TableC b WITH (NOLOCK) ON a.FieldA = b.Id

INSERT INTO dbo.TargetTableA (FieldA, FieldB)
SELECT FieldA, FieldB
FROM @TableTemp

INSERT INTO dbo.TargetTableB (FieldA, FieldB, FieldC)
SELECT DISTINCT a.FieldA, a.FieldB, a.FieldC
FROM @TableTemp a
LEFT JOIN dbo.TargetTableB b ON a.FieldA = b.FieldA AND a.FieldB = b.FieldB
WHERE b.FieldA IS NULL
```
First, check suspicious point one. Although TableC already holds hundreds of millions of rows, executing that statement on its own turns out not to be particularly slow, because the join column is indexed. This point can therefore be ruled out.
Now look at suspicious point two.
```sql
INSERT INTO dbo.TargetTableB (FieldA, FieldB, FieldC)
SELECT DISTINCT a.FieldA, a.FieldB, a.FieldC
FROM @TableTemp a
LEFT JOIN dbo.TargetTableB b ON a.FieldA = b.FieldA AND a.FieldB = b.FieldB
WHERE b.FieldA IS NULL
```
This statement checks whether an identical row already exists in TargetTableB at the same time as it inserts. Inspecting TargetTableB reveals that the table has no primary key and no index. The execution-plan analysis provided by the DBA shows that the statement performs a full table scan on dbo.TargetTableB, and since roughly 10,000 rows are inserted per batch, TargetTableB is effectively scanned in full 10,000 times per run. As the loop executes and the table keeps growing, every scan takes longer, so this is the main reason the execution time increases linearly.
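One way to confirm such a scan yourself is to enable I/O statistics before running the suspect statement. This is a generic SQL Server diagnostic technique, not taken from the original article, and it reuses the article's placeholder names:

```sql
SET STATISTICS IO ON;

-- Run the suspect insert; the Messages output then reports, per table,
-- the scan count and logical reads. Logical reads for TargetTableB that
-- grow in proportion to the table's total size on every execution
-- indicate a full table scan rather than an index seek.
INSERT INTO dbo.TargetTableB (FieldA, FieldB, FieldC)
SELECT DISTINCT a.FieldA, a.FieldB, a.FieldC
FROM @TableTemp a
LEFT JOIN dbo.TargetTableB b ON a.FieldA = b.FieldA AND a.FieldB = b.FieldB
WHERE b.FieldA IS NULL;

SET STATISTICS IO OFF;
```

Viewing the actual execution plan (for example via "Include Actual Execution Plan" in SSMS) gives the same information graphically.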
Solving the problem
From the elimination above, we know the crux of the problem: the repeated full table scans make the SQL execution time far too long. The key is therefore to avoid those scans, and the most direct way to do that is to create an index.
1. Replace the table variable with a temporary table
First, consider the difference between table variables and temporary tables: indexes cannot be created on a table variable after it is declared. So if we want an index to avoid the full table scan, we must replace the table variable with a temporary table, and then create an index on field A of that temporary table.
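A minimal sketch of that replacement, using the article's placeholder names (the exact index columns are an assumption based on the join condition):

```sql
-- Before: a table variable, on which CREATE INDEX is not allowed.
-- DECLARE @TableTemp TABLE (FieldA int NULL, FieldB int NULL, FieldC int NULL)

-- After: a temporary table, which supports explicit indexes.
CREATE TABLE #TableTemp
(
    FieldA int NULL,
    FieldB int NULL,
    FieldC int NULL
);

-- Index the columns used in the anti-join against TargetTableB.
CREATE NONCLUSTERED INDEX IX_TableTemp_FieldA
    ON #TableTemp (FieldA, FieldB);
```

The rest of the JOB's SQL then reads and writes #TableTemp instead of @TableTemp.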
2. Modify the write logic for TargetTableB
The existing logic checks, row by row, whether each record already exists in TargetTableB, and writes it only if it does not. While preserving the business behavior, we modify the logic slightly: first exclude from the staging table everything that already exists in TargetTableB, then write the remaining rows in one batch. This avoids querying TargetTableB 10,000 times per loop.
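A minimal sketch of that rewrite, again with the article's placeholder names: one set-based anti-join removes the already-present rows, so the large table is probed once per batch instead of once per row.

```sql
-- Step 1: remove rows from the staging table that already exist in the
-- target, using a single set-based join instead of per-row existence checks.
DELETE a
FROM #TableTemp a
INNER JOIN dbo.TargetTableB b
    ON a.FieldA = b.FieldA AND a.FieldB = b.FieldB;

-- Step 2: every remaining row is new; insert the batch directly.
INSERT INTO dbo.TargetTableB (FieldA, FieldB, FieldC)
SELECT DISTINCT FieldA, FieldB, FieldC
FROM #TableTemp;
```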
After these two modifications, the JOB ran without timeouts: the problem was completely solved.
At this point, you should have a deeper understanding of how table variables can slow down SQL execution. Try it out in practice yourself.