Network Security Internet Technology Development Database Servers Mobile Phone Android Software Apple Software Computer Software News IT Information

In addition to Weibo, there is also WeChat

Please pay attention

WeChat public account

Shulou

What are the effects of limit on the performance of query statements in MySQL

2025-04-03 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Development >

Share

Shulou(Shulou.com)06/03 Report--

This article mainly introduces the impact of limit on the performance of query sentences in MySQL. It is very detailed and has a certain reference value. Friends who are interested must read it!

First, preface

First of all, explain the version of MySQL:

Mysql > select version (); +-+ | version () | +-+ | 5.7.17 | +-+ 1 row in set (.00 sec)

Table structure:

Mysql > desc test +-+ | Field | Type | Null | Key | Default | Extra | +-+- -+ | id | bigint (20) unsigned | NO | PRI | NULL | auto_increment | | val | int (10) unsigned | NO | MUL | | source | int (10) unsigned | NO | | | +-+-+ 3 rows in set (.00 sec) |

Id is a self-increasing primary key and val is a non-unique index.

Inject a large amount of data, a total of 5 million:

Mysql > select count (*) from test;+-+ | count (*) | +-+ | 5242882 | +-+ 1 row in set (4.25sec)

We know that efficiency problems arise when the offset in limit offset rows is large:

Mysql > select * from test whereval=4 limit 300000 Jol 5 +-+ | id | val | source | +-+ | 3327622 | 4 | 4 | 3327632 | 4 | 4 | 3327642 | 4 | 4 | 3327652 | 4 | 4 | 3327662 | 4 | 4 | +-+ -+ 5 rows in set (15.98 sec)

In order to achieve the same goal, we usually rewrite the following sentence:

Mysql > select * from test an inner join (select id from test whereval=4 limit 300000 on a.id=b.id 5) b on a.id=b.id +-+ | id | val | source | id | +-+ | 3327622 | 4 | 4 | 3327622 | 3327632 | 4 | 4 | 3327632 | | 3327642 | 4 | 4 | 3327642 | | 3327652 | 4 | | 4 | 3327652 | | 3327662 | 4 | 4 | 3327662 | +-+ 5 rows in set (.38 sec) |

The time difference is obvious.

Why did the above results appear? Let's take a look at the query process of select * from test whereval=4 limit 300000 5;:

Query the index leaf node data.

All the field values needed for the query on the index are clustered according to the primary key values on the leaf node.

This is similar to the following picture:

Like the above, you need to query the index node 300005 times, query the data of the clustered index 300005 times, and then filter out the first 300000 items and take out the last 5 items. MySQL consumes a large amount of random Iamp O in querying the data of clustered index, while the data queried by 300000 random Imax O will not appear in the result set.

Someone is bound to ask: since the index is used in the first place, why not query the last five nodes along the index leaf node, and then query the actual data in the clustered index. This only takes 5 random Ithumb O, similar to the process in the following image:

Actually, I want to ask the same question.

Confirm

Let's confirm the above inference in practice:

To confirm that select * from test whereval=4 limit 300000 5 scans data nodes on 300005 index nodes and 300005 clustered indexes, we need to know if MySQL has a way to count the number of times data nodes are queried through index nodes in a sql. I tried the Handler_read_* series first, and it's a pity that none of the variables meet the criteria.

I can only prove it indirectly:

There is buffer pool in InnoDB. It contains recently visited data pages, including data pages and index pages. So we need to run two sql to compare the number of data pages in buffer pool. The predicted result is that after running select * from test an inner join (select id from test whereval=4 limit 300000 Magi 5), the number of data pages in buffer pool is much less than that of select * from test whereval=4 limit 300000 Magi 5; the corresponding number, because the previous sql only accesses data pages five times, while the latter sql accesses data pages 300005 times.

Mysql > select index_name,count (*) from information_schema.INNODB_BUFFER_PAGE where INDEX_NAME in ('val','primary') and TABLE_NAME like'% test%' group by index_name;Empty set (.04 sec)

As you can see, there is currently no data page about the test table in buffer pool.

Mysql > select * from test whereval=4 limit 300000 Jol 5 +-+ | id | val | source | +-+ | 3327622 | 4 | 4 | 3327632 | 4 | 4 | 3327642 | 4 | 4 | 3327652 | 4 | 4 | 3327662 | 4 | 4 | +-+ -+ 5 rows in set (26.19 sec) mysql > select index_name Count (*) from information_schema.INNODB_BUFFER_PAGE where INDEX_NAME in ('val','primary') and TABLE_NAME like'% test%' group by index_name +-+-+ | index_name | count (*) | +-+-+ | PRIMARY | 4098 | val | 208 | +-+-+ 2 rows in set (.04 sec)

As you can see, there are 4098 data pages and 208 index pages about the test table in buffer pool at this time.

Select * from test an inner join (select id from test whereval=4 limit 300000 from test an inner join 5) in order to prevent the impact of the last test, we need to clear buffer pool and restart mysql.

Mysqladmin shutdown/usr/local/bin/mysqld_safe & mysql > select index_name,count (*) from information_schema.INNODB_BUFFER_PAGE where INDEX_NAME in ('val','primary') and TABLE_NAME like'% test%' group by index_name;Empty set (0.03 sec)

Run SQL:

Mysql > select * from test an inner join (select id from test whereval=4 limit 300000 on a.id=b.id 5) b on a.id=b.id +-+ | id | val | source | id | +-+ | 3327622 | 4 | 4 | 3327622 | 3327632 | 4 | 4 | 3327632 | | 3327642 | 4 | 4 | 3327642 | | 3327652 | 4 | | 4 | 3327652 | | 3327662 | 4 | 4 | 3327662 | +-+ 5 rows in set (sec) mysql > select index_name | Count (*) from information_schema.INNODB_BUFFER_PAGE where INDEX_NAME in ('val','primary') and TABLE_NAME like'% test%' group by index_name +-+-+ | index_name | count (*) | +-+-+ | PRIMARY | 5 | | val | 390 | +-+-+ 2 rows in set (0.03 sec)

We can clearly see the difference between the two: the first sql loads 4098 data pages into buffer pool, while the second sql loads only five data pages into buffer pool. It's in line with our prediction. It also confirms why the first sql is slow: reading a large number of useless data rows (300000) and then discarding it.

And this will cause a problem: loading a lot of hot data pages that are not very high to buffer pool will cause buffer pool pollution and take up buffer pool space.

Problems encountered

To ensure that buffer pool is cleared every time you restart, we need to turn off innodb_buffer_pool_dump_at_shutdown and innodb_buffer_pool_load_at_startup, which control the data in dump out of buffer pool when the database is closed and the data loaded on disk to back up buffer pool when the database is turned on.

These are all the contents of this article entitled "what is the impact of limit on query performance in MySQL?" Thank you for reading! Hope to share the content to help you, more related knowledge, welcome to follow the industry information channel!

Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.

Views: 0

*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.

Share To

Development

Wechat

© 2024 shulou.com SLNews company. All rights reserved.

12
Report