
How to analyze the mechanisms and common problems of execution plan caching in SQL Server


How should we analyze the mechanisms and common problems of execution plan caching in SQL Server? Many inexperienced readers are at a loss about this, so this article summarizes the causes of the problems and their solutions. I hope it helps you resolve them.

Brief introduction

The SQL statements we write are essentially the logic for getting data, not the physical path for getting it. When a SQL statement is passed to SQL Server, the query analyzer parses it (Parse), binds it (Bind), optimizes it (Optimize), and executes it (Execute). The execution plan is generated after the first three steps; it is the plan SQL Server follows to fetch the physical data, and the final step executes the query according to that plan to produce the results. The query optimizer itself is not the focus of this article. The following mainly describes the mechanism by which SQL Server caches execution plans once they have been generated, and the common problems that come with it.
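
As a quick aside (not from the original article), you can see the result of the first three steps without the execution step by asking SQL Server for the estimated plan. A minimal sketch, using sys.objects only as a stand-in query:

-- Request the estimated execution plan as XML instead of running the query,
-- i.e. stop after parsing, binding, and optimization.
SET SHOWPLAN_XML ON;
GO
SELECT * FROM sys.objects;  -- any query works here; sys.objects is just a placeholder
GO
SET SHOWPLAN_XML OFF;
GO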

Why an execution plan cache is needed

We know from the introduction that generating the execution plan accounts for a large proportion of the overall work and consumes considerable CPU and memory. In fact, the query optimizer has to do quite a lot to produce an execution plan, which can be roughly divided into three parts:

First, the incoming query text is parsed to resolve table names, stored procedure names, view names, and so on. A tree representing the logical data operations of the query is then generated from the text.

The second step is optimization and simplification, such as converting subqueries into equivalent joins, applying filter conditions as early as possible, and removing unnecessary joins (for example, when an index makes it unnecessary to reference the base table).

The third step is a cost-based evaluation using the statistics in the database.

Only after these three steps are complete are multiple candidate execution plans generated. Although logically we have only one SQL statement, there can be several physical orders of data access that satisfy this logic. For example, to go from Beijing to Shanghai you can take a high-speed train or a plane; "from Beijing to Shanghai" is a logical description, and there are many ways to realize it. Let's look at another example in SQL Server, the query in Listing 1.

SELECT * FROM A INNER JOIN B ON a.a = b.b INNER JOIN C ON c.c = a.a

Listing 1.

For this query, whether A is first joined to B or B is first joined to C, the result is the same, so multiple execution plans can be generated. A basic principle, however, is that SQL Server does not necessarily pick the best possible plan, but rather a plan that is good enough, because the cost of evaluating every candidate plan must not itself become too large. In the end, SQL Server estimates the cost of each plan from the cardinality of the data and the CPU and IO consumed at each step, so the choice of execution plan depends heavily on statistics, which I won't elaborate on here.
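
As a side note, one way to inspect the statistics the optimizer relies on is DBCC SHOW_STATISTICS. Below is a minimal sketch in which dbo.A and the index IX_A_a are hypothetical names used only for illustration:

-- dbo.A and IX_A_a are assumed names; substitute a real table and index.
-- Shows the header, density vector, and histogram used for cardinality estimation.
DBCC SHOW_STATISTICS ('dbo.A', IX_A_a);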

Looking at the process by which the query analyzer generates an execution plan, it is not hard to see that the resource cost of this step is considerable. Therefore, when the same query is executed more than once, caching its plan avoids repeated compilation and thus improves efficiency. That is the original purpose of the execution plan cache.

Objects stored in the execution plan cache

The objects stored in the execution plan cache fall into four categories:

Compiled plans: the relationship between a compiled plan and the execution plan is like the relationship between MSIL and C#.

Execution contexts: when a compiled plan is executed there is an associated context. A compiled plan can be shared by multiple users, but each query needs to keep its SET options and the values of its local variables, so a context has to be associated with the execution plan. The execution context is also known as the Executable Plan.

Cursors: the stored cursor state relates to the compiled plan much like an execution context does. A cursor itself can only be used by one connection, but the execution plan associated with the cursor can be shared by multiple users.

Algebraic trees: algebraic trees (also known as parse trees) represent the query text. As we said earlier, the query optimizer does not work on the query text directly but on the algebraic tree. You may wonder why algebraic trees are cached here at all when they are only used to generate execution plans. The reason is that views, defaults, and constraints may be reused by different queries, so caching the algebraic trees of these objects saves re-parsing them.

For example, we can see the cached execution plans through the sys.dm_exec_cached_plans DMV, as shown in figure 1.

Figure 1. Cached execution plan
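
Since the figure is not reproduced here, a minimal sketch of such a query might look like the following (the choice of columns is illustrative):

-- cacheobjtype distinguishes compiled plans, parse trees, and so on;
-- usecounts shows how many times an entry has been reused.
SELECT  cp.cacheobjtype,
        cp.objtype,
        cp.usecounts,
        cp.size_in_bytes,
        st.text
FROM    sys.dm_exec_cached_plans AS cp
CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) AS st
ORDER BY cp.usecounts DESC;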

How much memory do these kinds of cached objects occupy? We can see this through the sys.dm_os_memory_cache_counters DMV, as shown in figure 2.

Figure 2. The amount of memory occupied by each type of cached object
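
The figure is likewise not reproduced, but a query along the following lines should show the same information; note that the exact column names vary by SQL Server version (older versions split pages_kb into single_pages_kb and multi_pages_kb):

-- Approximate memory used by the main plan-cache stores.
SELECT  name,
        type,
        pages_kb,
        entries_count
FROM    sys.dm_os_memory_cache_counters
WHERE   type IN ('CACHESTORE_OBJCP',   -- compiled plans for procedures and triggers
                 'CACHESTORE_SQLCP',   -- compiled plans for ad hoc and prepared SQL
                 'CACHESTORE_PHDR',    -- bound/algebraic trees for views, defaults, constraints
                 'CACHESTORE_XPROC');  -- extended stored procedures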

In addition, the execution plan cache is still a cache, and the objects in it are evicted according to an algorithm. For the plan cache, eviction is driven mainly by memory pressure, which comes in two forms: internal and external. External pressure arises when the free space in the Buffer Pool falls to a critical value (the critical value varies with the amount of physical memory, or with the configured maximum memory if one is set). Internal pressure arises when the number of objects in the plan cache exceeds a threshold, for example 40,000 on a 32-bit SQL Server, rising to 160,000 on 64-bit.

A key point here is that the identifier of a cache entry is the query text itself. So although select * from SchemaName.TableName and Select * from TableName have the same effect, two execution plans end up being cached, which is why one Best Practice is to qualify table names and other object names with the schema name.
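
To illustrate the point, here is a hypothetical example in which dbo.Orders is an assumed table name:

-- The two statements below return the same rows, but because the cache is
-- keyed on the exact query text, each one gets its own cached plan.
SELECT * FROM Orders;
SELECT * FROM dbo.Orders;

-- Both entries then show up separately in the plan cache.
SELECT  cp.usecounts, cp.objtype, st.text
FROM    sys.dm_exec_cached_plans AS cp
CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) AS st
WHERE   cp.objtype = 'Adhoc'
  AND   st.text LIKE '%FROM%Orders%';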

Tuning statements based on the cached execution plan

The cached execution plans carry a wealth of information: not only the plan and the statement, but also statistics such as CPU usage and wait time. It is worth noting, though, that these statistics cover only execution time, not compilation time. For example, we can use the code in Listing 2 to find the 20 longest-running query statements in the database based on the cached plans.

SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED

SELECT TOP 20
    CAST(qs.total_elapsed_time / 1000000.0 AS DECIMAL(28, 2)) AS [Total Duration (s)],
    CAST(qs.total_worker_time * 100.0 / qs.total_elapsed_time AS DECIMAL(28, 2)) AS [% CPU],
    CAST((qs.total_elapsed_time - qs.total_worker_time) * 100.0 / qs.total_elapsed_time AS DECIMAL(28, 2)) AS [% Waiting],
    qs.execution_count,
    CAST(qs.total_elapsed_time / 1000000.0 / qs.execution_count AS DECIMAL(28, 2)) AS [Average Duration (s)],
    SUBSTRING(qt.text, (qs.statement_start_offset / 2) + 1,
        ((CASE WHEN qs.statement_end_offset = -1
               THEN LEN(CONVERT(NVARCHAR(MAX), qt.text)) * 2
               ELSE qs.statement_end_offset
          END - qs.statement_start_offset) / 2) + 1) AS [Individual Query],
    qt.text AS [Parent Query],
    DB_NAME(qt.dbid) AS DatabaseName,
    qp.query_plan
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS qt
CROSS APPLY sys.dm_exec_query_plan(qs.plan_handle) AS qp
WHERE qs.total_elapsed_time > 0
ORDER BY qs.total_elapsed_time DESC

Listing 2. Finding the 20 longest-running query statements in the database through the plan cache

You can modify the ORDER BY in the statement above to find statements according to different criteria, so I won't go into detail here.
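
As one hypothetical variation, the sketch below keeps only a few columns and ranks by total CPU time instead of total elapsed time:

-- Rank cached statements by total CPU time rather than total duration.
SELECT TOP 20
    qs.execution_count,
    qs.total_worker_time  / 1000000.0 AS total_cpu_s,
    qs.total_elapsed_time / 1000000.0 AS total_duration_s,
    qt.text AS parent_query
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS qt
ORDER BY qs.total_worker_time DESC;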

Compared with both a server-side Trace and a client-side Profiler, this method has certain advantages. Capturing a trace and then analyzing it not only takes time and effort but also adds overhead to the server, whereas finding time-consuming queries this way is much easier. Note, however, that these statistics only cover the period since the instance was last restarted or since DBCC FREEPROCCACHE was last run. The method also has some disadvantages, such as:

Statements such as index rebuilds and statistics updates are not cached, and these statements can be very expensive.

Cache entries can be evicted at any time, so this method cannot see statements that are no longer in the cache.

These statistics show only the execution cost, not the compilation cost.

Without parameterization, the same statement may get different execution plans and therefore different cache entries; in that case the statistics cannot be accumulated across entries, so they may not be very accurate.

The conflict between the execution plan cache and the query optimizer

Remember what we said earlier: compiling and choosing an execution plan takes three steps, of which the first two are based only on the query text and the metadata of objects such as tables, while the plan selection stage relies heavily on statistics. As a result, the query optimizer may produce different execution plans simply because the parameters differ. Let's look at a simple example, as shown in figure 3.

Figure 3. Just because the parameters differ, the query optimizer chooses a different execution plan

You may think this is fine: different parameters produce different execution plans. Now consider another problem. If you put the above query into a stored procedure, the parameters cannot be sniffed directly; once the plan from the first execution is cached, the second execution reuses it! Although the compilation time is saved, the cost of a poor execution plan can be far higher. Let's look at this example, as shown in figure 4.

Figure 4. Different parameters, but exactly the same execution plan!

Let's look at the same example with the order of execution reversed, as shown in figure 5.

Figure 5. The execution plan has completely changed.

We can see that the statement executed second completely reuses the plan from the first execution, so one of the queries is always sacrificed. For example, when the parameter is 4, more than 5,000 rows match and an index scan would be the most efficient choice, but in figure 4 the previous plan is reused and more than 5,000 lookups are performed instead, which is clearly inefficient. Moreover, this situation can be very confusing for the DBA, because the plans in the cache are outside the DBA's control, cached objects may be evicted at any time, and performance problems that depend on which query happens to run first are often a headache.
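
Since the figures are not reproduced here, the following sketch outlines the scenario; the table dbo.SalesOrder, its index, and the data distribution are all assumptions made for illustration:

-- Assumed setup: dbo.SalesOrder has a nonclustered index on CustomerId,
-- CustomerId = 1 matches only a few rows, and CustomerId = 4 matches thousands.
CREATE PROCEDURE dbo.GetOrdersByCustomer
    @CustomerId INT
AS
BEGIN
    SELECT OrderId, OrderDate, Amount
    FROM   dbo.SalesOrder
    WHERE  CustomerId = @CustomerId;
END;
GO

-- First call: a plan is compiled for this first parameter value and cached
-- (typically a seek plus lookups for a highly selective value).
EXEC dbo.GetOrdersByCustomer @CustomerId = 1;

-- Second call: the cached plan is reused even though @CustomerId = 4 returns
-- far more rows, for which a scan would likely be cheaper.
EXEC dbo.GetOrdersByCustomer @CustomerId = 4;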

From this example we can see that the query optimizer wants to choose the most efficient execution plan it can, while the execution plan cache wants to reuse cached plans as much as possible. In some cases these two mechanisms conflict.

After reading the above, have you grasped how to analyze the mechanisms and common problems of execution plan caching in SQL Server? Thank you for reading!
