In addition to Weibo, there is also WeChat
Please pay attention
WeChat public account
Shulou
2025-01-18 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Database >
Share
Shulou(Shulou.com)05/31 Report--
Today, I will talk to you about how to analyze the optimization and efficiency of SQL sentences in SQL Server. Many people may not know much about it. In order to make you understand better, the editor has summarized the following content for you. I hope you can get something according to this article.
Many people don't know how SQL statements are executed in SQL SERVER, and they worry that the SQL statements they write will be misunderstood by SQL SERVER. For example:
Select * from table1 where name='zhangsan' and tID > 10000
And execution:
Select * from table1 where tID > 10000 and name='zhangsan'
Some people do not know whether the execution efficiency of the above two statements is the same, because if you simply look at the sequence of the statements, the two statements are indeed different, if tID is an aggregate index, then the latter sentence will only look up the records after 10000 items of the table; while the former sentence should first look for several name='zhangsan' in the whole table, and then propose the query results according to the constraint condition tID > 10000.
In fact, such worries are unnecessary. There is a "query analysis optimizer" in SQL SERVER that calculates the search conditions in the where clause and determines which index reduces the search space for table scans, that is, it optimizes automatically.
Although the query optimizer can automatically optimize queries based on the where clause, it is still necessary to understand how the query optimizer works. Otherwise, sometimes the query optimizer will not query quickly as you intended.
During the query analysis phase, the query optimizer looks at each stage of the query and decides whether it is useful to limit the amount of data that needs to be scanned. If a phase can be used as a scan parameter (SARG), it is called optimizable, and the required data can be quickly obtained using the index.
Definition of SARG: used to restrict an operation of a search, because it usually refers to a specific match, a match in a worthy range, or an AND connection with more than two conditions.
The form is as follows:
Column name operator
Or
Operator column names can appear on one side of the operator, while constants or variables appear on the other side of the operator. Such as:
Name=' Zhang San'
Price > 5000
50005000
If an expression does not satisfy the form of SARG, it cannot limit the scope of the search, that is, SQL SERVER must determine for each line whether it meets all the conditions in the WHERE clause. So an index is useless for expressions that do not satisfy the SARG form.
After introducing SARG, let's summarize the experience of using SARG and the experiences encountered in practice that are different from the conclusions on some materials:
1. Whether the Like statement belongs to SARG depends on the type of wildcard used
For example, name like 'Zhang%', this belongs to SARG
And: name like'% Zhang'--does not belong to SARG.
The reason is that the opening of the wildcard% in the string makes the index unusable.
2. Or will cause full table scan
Name=' Zhang San 'and price > 5000 symbol SARG, while: Name=' Zhang San' or price > 5000 does not conform to SARG. Using or causes a full table scan.
3. Statements that do not satisfy the SARG form caused by non-operators and functions
The most typical case of statements that do not satisfy the SARG form is to include statements that are not operators, such as NOT,! =,!, NOT EXISTS, NOT IN, NOT LIKE, etc., as well as functions.
Here are a few examples that do not satisfy the SARG form:
ABS (Price) 5000
-- SQL SERVER will also think that SARG,SQL SERVER will translate this expression into: WHERE price > 2500 Universe 2
However, this is not recommended because sometimes SQL SERVER cannot guarantee that this transformation is completely equivalent to the original expression.
4. The function of IN is equivalent to that of OR.
Statement:
Select * from table1 where tid in (2pm 3)
-- and
Select * from table1 where tid=2 or tid=3
Is the same, will cause a full table scan, if there is an index on the tid, its index will also be invalid.
5. Use NOT as little as possible
6. The execution efficiency of exists and in is the same.
Many materials show that the execution efficiency of exists is higher than that of in, and not exists should be used instead of not in as much as possible. But in fact, I experimented and found that the execution efficiency between the two is the same regardless of whether they are preceded by not or not. Because subqueries are involved, we try to use the pubs database that comes with SQL SERVER this time. Before running, we can turn on the status of statistics I cando O of SQL SERVER:
(1) select title,price from titles where title_id in (select title_id from sales where qty > 30)
The result of the execution of this sentence is:
Table 'sales'. Scan count 18, logical read 56 times, physical read 0 times, pre-read 0 times. Table 'titles'. Scan count 1, logic read 2 times, physical read 0 times, pre-read 0 times.
(2) select title,price from titles where exists (select * from sales where sales.title_id=titles.title_id and qty > 30)
The execution result of the second sentence is as follows:
Table 'sales'. Scan count 18, logical read 56 times, physical read 0 times, pre-read 0 times. Table 'titles'. Scan count 1, logic read 2 times, physical read 0 times, pre-read 0 times.
From then on, we can see that the execution efficiency with exists is the same as that with in.
7. Using the function charindex () is as efficient as the LIKE with the wildcard% before it.
As we mentioned earlier, if you precede the LIKE with the wildcard%, it will cause a full table scan, so its execution is inefficient. However, according to some data, using the function charindex () instead of LIKE will greatly improve the speed. Through my experiments, I found that this explanation is also wrong:
Select gid,title,fariqi,reader from tgongwen where charindex ('Criminal investigation Detachment', reader) > 0 and fariqi > '2004-5-5'
Time: 7 seconds, in addition: scan count 4, logical read 7155 times, physical read 0 times, pre-read 0 times.
Select gid,title,fariqi,reader from tgongwen where reader like'%'+ 'Criminal investigation Detachment' +% 'and fariqi >' 2004-5-5'
Time: 7 seconds, in addition: scan count 4, logical read 7155 times, physical read 0 times, pre-read 0 times.
8. Union is not always more efficient than or.
We have mentioned earlier that using or in the where clause will cause full table scans. Generally speaking, all the materials I have seen are recommended to use union instead of or. It turns out that this statement is applicable to most of them.
Select gid,fariqi,neibuyonghu,reader,title from Tgongwen where fariqi='2004-9-16'or gid > 9990000
Time: 68 seconds. Scan count 1, logical read 404008 times, physical read 283 times, pre-read 392163 times.
Select gid,fariqi,neibuyonghu,reader,title from Tgongwen where fariqi='2004-9-16'unionselect gid,fariqi,neibuyonghu,reader,title from Tgongwen where gid > 9990000
Time: 9 seconds. Scan count 8, logical read 67489 times, physical read 216times, pre-read 7499 times.
It seems that using union is generally much more efficient than using or.
But after experiments, the author found that if the query columns on both sides of or are the same, then the execution speed of using union is much lower than that of using or, although here union scans the index and or scans the whole table.
Select gid,fariqi,neibuyonghu,reader,title from Tgongwen where fariqi='2004-9-16' or fariqi='2004-2-5'
Time: 6423 milliseconds. Scan count 2, logical read 14726 times, physical read 1 time, pre-read 7176 times.
Select gid,fariqi,neibuyonghu,reader,title from Tgongwen where fariqi='2004-9-16'unionselect gid,fariqi,neibuyonghu,reader,title from Tgongwen where fariqi='2004-2-5'
Time: 11640 milliseconds. Scan count 8, logical read 14806 times, physical read 108 times, pre-read 1144 times.
9. Field extraction should follow the principle of "how much you need and how much you need" to avoid "select *".
Let's do an experiment:
Select top 10000 gid,fariqi,reader,title from tgongwen order by gid desc time: 4673 Ms select top 10000 gid,fariqi,title from tgongwen order by gid desc time: 1376 Ms select top 10000 gid,fariqi from tgongwen order by gid desc time: 80 Ms
From this point of view, every time we extract one less field, the speed of data extraction will be improved accordingly. The speed of improvement depends on the size of the field you discard.
10. Count (*) is no slower than count (field)
According to some data, using * will count all columns, which is obviously less efficient than listing in a world. In fact, this statement is unfounded. Let's see:
Select count (*) from Tgongwen time: 1500 milliseconds
Select count (gid) from Tgongwen time: 1483 milliseconds
Select count (fariqi) from Tgongwen time: 3140 milliseconds
Select count (title) from Tgongwen time: 52050 milliseconds
As you can see from the above, the speed of using count (*) is the same as using count (primary key), while count (*) is faster than any other field except the primary key, and the longer the field, the slower the summary. I think that if you use count (*), SQL SERVER may automatically find the smallest field to summarize. Of course, it will be more direct if you write count (primary key) directly.
11. Order by is the most efficient to sort by clustered index column
Let's see: (gid is the primary key and fariqi is the aggregate index column):
Select top 10000 gid,fariqi,reader,title from tgongwen
Time: 196 milliseconds. Scan count 1, logic read 289 times, physical read 1 time, pre-read 1527 times.
Select top 10000 gid,fariqi,reader,title from tgongwen order by gid asc
Time: 4720 milliseconds. Scan count 1, logical read 41956 times, physical read 0 times, pre-read 1287 times.
Select top 10000 gid,fariqi,reader,title from tgongwen order by gid desc time: 4736 milliseconds. Scan count 1, logical read 55350 times, physical read 10 times, pre-read 775 times.
Select top 10000 gid,fariqi,reader,title from tgongwen order by fariqi asc time: 173ms. Scan count 1, logic read 290 times, physical read 0 times, pre-read 0 times.
Select top 10000 gid,fariqi,reader,title from tgongwen order by fariqi desc time: 156ms. Scan count 1, logic read 289 times, physical read 0 times, pre-read 0 times.
From the above, we can see that the speed of non-sorting and the number of logical reads are the same as that of "order by clustered index columns", but these are much faster than "order by nonclustered index columns".
At the same time, when sorting by a field, whether it is positive or reverse, the speed is basically the same.
12. Efficient TOP
In fact, when querying and extracting super-large datasets, the biggest factor that affects the response time of the database is not the data lookup, but the physical Imax 0 operation. Such as:
Select top 10 * from (select top 10000 gid,fariqi,title from tgongwenwhere neibuyonghu=' Office 'order by gid desc) as aorder by gid asc
In theory, the execution time of the whole sentence should be longer than that of the clause, but the opposite is true. Because 10000 records are returned after the clause is executed, while only 10 statements are returned by the whole statement, the biggest factor that affects the database response time is the physical Icano operation. One of the most effective ways to restrict physical IDUBO operations here is to use the TOP keyword. The TOP keyword is a systematically optimized word in SQL SERVER that is used to extract the first few items or percentages of data. Through the author's application in practice, it is found that TOP is really easy to use and efficient. But this word does not exist in another large database, ORACLE, which is not a pity, although it can be solved in other ways in ORACLE (such as rownumber).
After reading the above, do you have any further understanding of how to analyze the optimization and efficiency of SQL statements in SQL Server? If you want to know more knowledge or related content, please follow the industry information channel, thank you for your support.
Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.
Views: 0
*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.
Continue with the installation of the previous hadoop.First, install zookooper1. Decompress zookoope
"Every 5-10 years, there's a rare product, a really special, very unusual product that's the most un
© 2024 shulou.com SLNews company. All rights reserved.