How to optimize SQL

2025-01-28 Update. From: SLTechnology News&Howtos, Shulou (Shulou.com), 05/31

This article focuses on how to optimize SQL. The methods introduced here are simple, fast and practical, so if you are interested, read on.

Three reasons for slow execution of SQL statements

No index, or an index that cannot be used

This one is easy to understand. If the table is large, say more than 10 million rows, and there is no index, finding the rows you want means scanning all 10 million rows by brute force.

There is a lot to say about index failure. Common cases: a LIKE pattern that starts with a wildcard; a query on a composite index that breaks the "leftmost matching principle" we often talk about; or an OR condition where the column on one side of the OR has no index. In all of these cases the index cannot be used.
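The three cases above can be sketched with EXPLAIN. The table demo.person, its columns and the index idx_name_age are hypothetical names introduced only for this illustration, and the comments describe what the optimizer typically does rather than guaranteed output.

```sql
-- Hypothetical table, used only to illustrate index failure.
CREATE DATABASE IF NOT EXISTS demo;
CREATE TABLE demo.person (
  id   INT NOT NULL PRIMARY KEY,
  name VARCHAR(50),
  age  INT,
  city VARCHAR(50),
  KEY idx_name_age (name, age)   -- composite index: name first, then age
) ENGINE = InnoDB;

-- Prefix match: the index can be used for a range scan on name.
EXPLAIN SELECT * FROM demo.person WHERE name LIKE 'zhang%';

-- Leading wildcard: the index cannot be used to seek, so expect a full scan.
EXPLAIN SELECT * FROM demo.person WHERE name LIKE '%zhang';

-- Leftmost matching principle: age alone skips the first index column (name),
-- so idx_name_age typically cannot be used to seek.
EXPLAIN SELECT * FROM demo.person WHERE age = 30;

-- OR with an unindexed column (city): the whole condition falls back to a scan.
EXPLAIN SELECT * FROM demo.person WHERE name = 'zhang' OR city = 'Beijing';
```

On an empty or very small table the optimizer may pick different plans, so load some rows before comparing the type and key columns.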

Lock waiting

The commonly used storage engines are InnoDB and MyISAM. The former supports row locks and table locks, while the latter only supports table locks.

If all database operations rely on table locks, then any update locks the whole table and queries have to wait, so there is no hope of improving the concurrency of the system.

You might ask: since MyISAM only supports table locks, why not just use InnoDB? But do you think an InnoDB row lock can never behave like a table lock? Not quite.

InnoDB does not literally escalate row locks into a table lock. However, if an update touches a large share of a table, MySQL may decide that a full scan is cheaper than using the index. Every row it scans is then locked, and the effect on other transactions is much the same as a table lock.

In addition, a row lock is an index-based lock: InnoDB places it on index records. If the condition of an update cannot use an index, the statement scans the whole table and, under the default REPEATABLE READ isolation level, locks every row it scans, so the row lock again behaves like a table lock.
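The index-based locking described above can be observed with two connections. This is a minimal sketch assuming MySQL 8.0 with InnoDB at the default REPEATABLE READ isolation level; the table demo.lock_demo and the index idx_a are hypothetical names.

```sql
-- Hypothetical table: column a has no index at first.
CREATE DATABASE IF NOT EXISTS demo;
CREATE TABLE demo.lock_demo (
  id INT NOT NULL PRIMARY KEY,
  a  INT,
  b  INT
) ENGINE = InnoDB;
INSERT INTO demo.lock_demo VALUES (1, 1, 1), (2, 2, 2), (3, 3, 3);

-- Session 1: the WHERE condition cannot use an index, so every scanned row is locked.
START TRANSACTION;
UPDATE demo.lock_demo SET b = b + 1 WHERE a = 1;

-- Session 2 (a second connection): touches a different row, but still has to wait
-- until session 1 commits or innodb_lock_wait_timeout expires.
UPDATE demo.lock_demo SET b = 0 WHERE id = 3;

-- Either session (MySQL 8.0): inspect which index records are locked.
SELECT object_name, index_name, lock_type, lock_mode, lock_status, lock_data
FROM performance_schema.data_locks;

-- Session 1: release the locks.
COMMIT;

-- With an index on a, the same update locks only the matching index records.
ALTER TABLE demo.lock_demo ADD INDEX idx_a (a);
```

After idx_a is added, repeating the two updates should let session 2 proceed without waiting, because the rows it touches are no longer covered by session 1's locks.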

Inappropriate SQL statement

This is also quite common. What is an inappropriate SQL statement? For example, you only need name and age, but to save trouble you write select *; or you use order by on a column that is not indexed. Both are inappropriate SQL statements.

Optimize SQL statement

Now that we know why SQL statements run slowly, the next step is to prescribe the right remedy for each case.

For a missing or unusable index, the most effective tool is the EXPLAIN syntax. Did you know that Show Profile works too?

For lock waiting there is no quick fix. You can only pay more attention to it when writing statements.

For inappropriate SQL statements, this article introduces several commonly used optimizations, such as how to make a paging query faster. Didn't we just say that select * is not the right way to write a query? So what is the right way to write select? Don't worry, A Fan will cover it below.

Enough talk. Here we go.

Let's start with a table.

To make sure your optimization results match what I describe (at least 90% of the way), let's use the same database.

First, create a database named demo.

Next, create a very simple table.

CREATE TABLE demo.`table` (
  id int(11) NOT NULL,
  a int(11) DEFAULT NULL,
  b int(11) DEFAULT NULL,
  PRIMARY KEY (id)
) ENGINE = INNODB;

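Then insert 100000 rows:

DROP PROCEDURE IF EXISTS demo_insert;
DELIMITER ;;
CREATE PROCEDURE demo_insert()
BEGIN
  DECLARE i INT;
  SET i = 1;
  -- The loop bound follows the 100000 rows stated above.
  -- The inserted values are an assumption; the original loop body was lost.
  WHILE i <= 100000 DO
    INSERT INTO demo.`table` VALUES (i, i, i);
    SET i = i + 1;
  END WHILE;
END;;
DELIMITER ;
CALL demo_insert();

type: the access type of the query. From best to worst: system > const > eq_ref > ref > range > index > ALL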

system / const: only one row in the table matches, so the data can be found with a single index lookup.

eq_ref: unique index scan. It usually appears in multi-table joins where a primary key or unique index is the join condition.

ref: non-unique index scan. It also appears when only the leftmost prefix of a unique index is matched.

range: index range scan, for example when the query condition uses <, >, between and similar operators.

index: full index scan. The entire index tree is traversed.

ALL: full table scan. The entire table must be traversed to find the matching rows.

possible_keys: the indexes that might be used

key: the index actually used

key_len: the length of the index used

ref: the columns or constants that are compared with the index named in key

rows: the estimated number of rows that must be scanned to find the matching rows. Although the table holds 100000 rows, only 99 rows are scanned here because the index is used.

Extra: additional information. The following values are common.

Using where: the WHERE condition is applied as a filter to the rows read, so not every row read is returned.

Using temporary: MySQL needs a temporary table to hold intermediate results, which is common with group by / order by.

Using filesort: the query contains an order by on a column that is not indexed, so the index cannot be used for sorting and MySQL falls back to a "file sort". In the example, the index is on id but the query sorts by a, so the index cannot be used.

Using join buffer: a join buffer is used.

Using index: a covering index is used. The required data comes from the index alone, without reading the table rows. This happens when all the requested columns of the table are part of the same index.

Once you understand these fields, EXPLAIN will no longer trouble you.
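A query matching the description above (a range on the primary key id, sorted by the unindexed column a) could look like the following sketch. The exact statements are an assumption, not the original example, and they assume the demo table holds ids 1 to 100000.

```sql
-- Assumed example: range scan on the primary key, sort on a column without an index.
-- Expect type = range, key = PRIMARY, rows around 99, Extra containing "Using filesort".
EXPLAIN SELECT * FROM demo.`table` WHERE id < 100 ORDER BY a;

-- Sorting by the indexed column instead lets MySQL read rows in index order,
-- so "Using filesort" should disappear.
EXPLAIN SELECT * FROM demo.`table` WHERE id < 100 ORDER BY id;

-- Requesting only the indexed column allows a covering index: Extra should contain "Using index".
EXPLAIN SELECT id FROM demo.`table` WHERE id < 100;
```

Comparing the three plans side by side is a quick way to get used to reading the type, key, rows and Extra columns.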

Analyzing SQL execution performance with Show Profile

EXPLAIN only shows the execution plan, that is, how the SQL looks from the outside. If you want to know where MySQL actually spends its time while running a query, analyze it with Show Profile.

The SHOW PROFILES; statement lists the statements recently sent to the server (profiling must first be enabled for the session with SET profiling = 1). By default it keeps the 15 most recently executed statements.

To look at one specific statement, note its Query_ID in that list and run, for example, SHOW PROFILE FOR QUERY 82;

In this result, Sending data takes the longest. In this state the MySQL thread reads the data and returns it to the client, which involves a large number of disk I/O operations.

This kind of analysis tells us whether the time is spent in the query itself or in waiting for a System lock. Once we know where the problem lies, prescribing the right remedy is much easier.
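The steps above can be put together as a short session. This is a sketch: the profiled query is an arbitrary example against the demo table, and the Query_ID to pass to SHOW PROFILE depends on what SHOW PROFILES prints in your session. Note that SHOW PROFILE is deprecated in recent MySQL versions in favor of the Performance Schema, although it still works.

```sql
-- Profiling is off by default; enable it for the current session.
SET profiling = 1;

-- Run the statement you want to analyze (arbitrary example).
SELECT * FROM demo.`table` ORDER BY a LIMIT 10;

-- List the profiled statements with their Query_ID and total duration.
SHOW PROFILES;

-- Break one statement down by execution stage.
-- Replace 1 with the Query_ID reported by SHOW PROFILES.
SHOW PROFILE FOR QUERY 1;

-- Optionally add CPU and block I/O figures for each stage.
SHOW PROFILE CPU, BLOCK IO FOR QUERY 1;

-- Turn profiling off again when done.
SET profiling = 0;
```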

How can paging queries be faster?

Paging queries use the limit keyword, but they can be optimized one step further.

The demo table above is too simple to show the difference, so I ran the experiment on a table from a project currently in use. The SQL statements are as follows:

EXPLAIN SELECT * FROM `te_paper_record` ORDER BY id LIMIT 10000, 20;
EXPLAIN SELECT * FROM `te_paper_record` WHERE id >= (SELECT id FROM `te_paper_record` ORDER BY id LIMIT 10000, 1) LIMIT 20;

The first statement does not use a subquery and took 0.033s. The second statement uses a subquery and took 0.007s, so the improvement is obvious.

Why is the query faster with the subquery? Without it, MySQL reads 10020 full rows and then throws away the first 10000.

Could it fetch just the 20 rows we need, so that nothing has to be discarded afterwards?

That is exactly what the subquery does. It only walks the id index to find the id where the page starts, and the outer query then reads 20 full rows from that point.

This is where the large improvement in query time comes from.
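The same comparison can be tried on the demo table. The sketch below assumes the table holds ids 1 to 100000 with no gaps. The last statement is a different technique, often called keyset pagination, shown as an alternative and not as the method used above.

```sql
-- Plain offset paging: reads 10020 full rows and discards the first 10000.
SELECT * FROM demo.`table` ORDER BY id LIMIT 10000, 20;

-- Subquery paging: the inner query walks only the primary key to find the
-- first id of the page; the outer query then reads 20 rows from there.
-- ORDER BY id is added to the outer query so the row order is guaranteed.
SELECT * FROM demo.`table`
WHERE id >= (SELECT id FROM demo.`table` ORDER BY id LIMIT 10000, 1)
ORDER BY id
LIMIT 20;

-- Alternative (keyset pagination): remember the last id of the previous page
-- and start right after it. 10000 is that remembered id under the assumption above.
SELECT * FROM demo.`table` WHERE id > 10000 ORDER BY id LIMIT 20;
```

Keyset pagination avoids the offset entirely, but it only supports moving to the next or previous page, not jumping to an arbitrary page number.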

The right way to write select

To save trouble, we sometimes write statements such as select * from table where id = 1. In some environments this style costs performance.

So the best select is one that queries only the fields you need.

Generally a query has conditions, and rows are looked up according to them.

What is the right way to write select in that case?

If the condition after where can use the primary key index, prefer the primary key index.

Why? You need to know how MySQL stores data.

The commonly used storage engines of MySQL are MyISAM and InnoDB. InnoDB builds a primary key index, and that index is a clustered index: it is a B+ tree whose leaf nodes store the full row data.

In other words, a query through the primary key index searches that B+ tree and finds the row data directly.

A query through any other index must first search that index's tree to get the primary key value, and then search the primary key B+ tree for the matching row. This second lookup is called "returning to the table".

Obviously, returning to the table costs extra time.

This is why the primary key index is recommended for the where condition whenever possible.
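The difference between a primary key lookup and "returning to the table" can be seen with EXPLAIN on the demo table. This is a sketch: the secondary index idx_a is a hypothetical addition, and the comments describe the plans MySQL is expected to choose.

```sql
-- Hypothetical secondary index on column a.
ALTER TABLE demo.`table` ADD INDEX idx_a (a);

-- Primary key lookup: one search of the clustered index finds the whole row.
-- Expect type = const, key = PRIMARY.
EXPLAIN SELECT * FROM demo.`table` WHERE id = 500;

-- Secondary index lookup: idx_a yields the primary key value, then the clustered
-- index is searched again to fetch column b (returning to the table).
-- Expect type = ref, key = idx_a.
EXPLAIN SELECT * FROM demo.`table` WHERE a = 500;

-- Selecting only columns stored in idx_a (a plus the primary key id) avoids the
-- second lookup. Expect Extra to contain "Using index".
EXPLAIN SELECT id, a FROM demo.`table` WHERE a = 500;
```

The third statement also shows why selecting only the needed fields matters: select * forces the extra lookup even when the index alone could have answered the query.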

Other tuning

After reading the above, you should have a good picture. SQL tuning is mainly about building indexes, avoiding lock waits, and writing appropriate SQL statements.

But what if you are asked whether there are any other tuning methods besides these?

What? There's more?!

Yes. For this you need to step back and not confine yourself to individual SQL statements. It has to be considered at the beginning, when the database is designed.

For example, we are often told to follow the three normal forms. In some business scenarios, however, a few redundant fields in the database can perform much better than a design that strictly follows the three normal forms.
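As an illustration of a redundant field, consider the sketch below. The tables demo.customer and demo.orders and all of their columns are hypothetical and stand in for whatever the business scenario actually needs.

```sql
-- Hypothetical schema, for illustration only.
CREATE DATABASE IF NOT EXISTS demo;
CREATE TABLE demo.customer (
  id   INT NOT NULL PRIMARY KEY,
  name VARCHAR(50) NOT NULL
) ENGINE = InnoDB;

CREATE TABLE demo.orders (
  id            INT NOT NULL PRIMARY KEY,
  customer_id   INT NOT NULL,
  amount        DECIMAL(10, 2) NOT NULL,
  customer_name VARCHAR(50) NOT NULL,   -- redundant copy of customer.name
  KEY idx_customer (customer_id)
) ENGINE = InnoDB;

-- Strictly normalized access: the customer name requires a join.
SELECT o.id, o.amount, c.name
FROM demo.orders AS o
JOIN demo.customer AS c ON c.id = o.customer_id
WHERE o.customer_id = 42;

-- With the redundant column, the same listing reads a single table.
SELECT id, amount, customer_name
FROM demo.orders
WHERE customer_id = 42;
```

The trade-off is that the redundant copy has to be kept in sync whenever the customer name changes, so this only pays off when reads greatly outnumber such updates.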

At this point you should have a deeper understanding of how to optimize SQL. The best next step is to try it out in practice.
