When a table holds tens of thousands of records or more, querying all the results at once becomes very slow, and it only gets worse as the data grows, so a paged query is needed. There are many methods and optimization points for database paging queries; below I briefly describe some of the methods I know.
Preparatory work
To test the optimizations listed below, here is a description of an existing table.
Table name: orders_history
Description: the order history table of a certain business
Main fields: unsigned int id, tinyint(4) type
Fields: the table has 37 fields in total, none of them large types such as text; the largest is varchar(500); the id field is indexed and auto-incremented.
Row count: 5,709,294
MySQL version: 5.7.16
A multi-million-row test table is hard to come by in a development environment; if you need to test this yourself, you can write a shell script or SQL to insert test data, as sketched below.
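For instance, here is a minimal sketch of a stored procedure that bulk-inserts random test rows; the procedure name and the reduced column list are hypothetical and would have to match the real table definition:

-- a minimal sketch, assuming a simplified orders_history(id, type, ...) table
drop procedure if exists insert_test_data;
delimiter $$
create procedure insert_test_data(in total int)
begin
    declare i int default 0;
    start transaction;
    while i < total do
        insert into orders_history (type) values (floor(rand() * 10));
        set i = i + 1;
    end while;
    commit;   -- one transaction keeps the inserts reasonably fast
end $$
delimiter ;
call insert_test_data(1000000);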
All of the following SQL statements were executed in the same, unchanged environment. Here are the baseline test results:
select count(*) from orders_history;
Returned result: 5709294
The times for three runs of the query are as follows:
8903 ms, 8323 ms, 8401 ms
General paging query
A general paging query can be implemented using a simple limit clause. The limit clause is declared as follows:
SELECT * FROM table LIMIT [offset,] rows | rows OFFSET offset
The LIMIT clause can be used to specify the number of records returned by the SELECT statement. Pay attention to the following points:
The first parameter specifies the offset of the first returned record row
The second parameter specifies the maximum number of record rows returned
If only one parameter is given: it represents the maximum number of rows of records returned
To retrieve all rows from a given offset to the end of the record set, use some large number as the second parameter (MySQL does not accept -1 here)
The offset of the initial record row is 0 (not 1)
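To make the two declared forms concrete, the following statements are equivalent; both skip 1000 rows and return the next 10:

select * from orders_history limit 1000, 10;
select * from orders_history limit 10 offset 1000;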
Here is an example of an application:
select * from orders_history where type=8 limit 10000, 10;
This statement queries the 10 rows after row 10,000 from the table orders_history, that is, rows 10,001 through 10,010.
Records in a data table are sorted by the primary key (usually id) by default, so the above result is equivalent to:
select * from orders_history where type=8 order by id limit 10000, 10;
The times for three runs are as follows:
3040 ms, 3063 ms, 3018 ms
For this query method, the following tests the impact of the number of records fetched on query time:
select * from orders_history where type=8 limit 10000, 1;
select * from orders_history where type=8 limit 10000, 10;
select * from orders_history where type=8 limit 10000, 100;
select * from orders_history where type=8 limit 10000, 1000;
select * from orders_history where type=8 limit 10000, 10000;
The times for three runs of each are as follows:
Fetch 1 record: 3072 ms, 3092 ms, 3002 ms
Fetch 10 records: 3081 ms, 3077 ms, 3032 ms
Fetch 100 records: 3118 ms, 3200 ms, 3128 ms
Fetch 1000 records: 3412 ms, 3468 ms, 3394 ms
Fetch 10000 records: 3749 ms, 3802 ms, 3696 ms
In addition, I ran more than ten of these queries. Judging from the timings, it is fairly clear that when fewer than 100 records are fetched there is essentially no difference in query time; as the number of fetched records grows, the queries take more and more time.
Tests for query offsets:
select * from orders_history where type=8 limit 100, 100;
select * from orders_history where type=8 limit 1000, 100;
select * from orders_history where type=8 limit 10000, 100;
select * from orders_history where type=8 limit 100000, 100;
select * from orders_history where type=8 limit 1000000, 100;
The times for three runs of each are as follows:
Offset 100: 25 ms, 24 ms, 24 ms
Offset 1000: 78 ms, 76 ms, 77 ms
Offset 10000: 3092 ms, 3212 ms, 3128 ms
Offset 100000: 3878 ms, 3812 ms, 3798 ms
Offset 1000000: 14608 ms, 14062 ms, 14700 ms
As the offset grows, especially beyond 100,000, query time increases sharply.
This is because this kind of paging query scans forward from the first record, so the further back the page, the slower the query; and the more total data there is, the slower queries become overall.
Using subquery optimization
This method first locates the id at the offset position, then queries onward from that id; it is suitable when the id is auto-incremented:
select * from orders_history where type=8 limit 100000, 1;
select id from orders_history where type=8 limit 100000, 1;
select * from orders_history where type=8 and id >= (select id from orders_history where type=8 limit 100000, 1) limit 100;
select * from orders_history where type=8 limit 100000, 100;
The query time of the four statements is as follows:
Statement 1: 3674 ms
Statement 2: 1315 ms
Statement 3: 1327 ms
Statement 4: 3710 ms
Note, for the above queries:
Comparing the first statement with the second: using select id instead of select * makes the query about three times faster.
Comparing the second statement with the third: the difference is only tens of milliseconds.
Comparing the third statement with the fourth: thanks to the fast select id, the third statement is about three times faster.
This method is several times faster than the ordinary paging query.
Using id restriction optimization
This method assumes that the table's id increases continuously; we can then calculate the id range of the desired page from the page number and page size, and query with id between ... and ...:
select * from orders_history where type=2 and id between 1000000 and 1000100 limit 100;
Query times: 15 ms, 12 ms, 9 ms
This query method greatly improves speed and completes in tens of milliseconds. Its limitation is that it can only be used when the id range is clearly known; fortunately, a basic auto-increment id field is generally added when a table is created, which makes this kind of paging query much more convenient.
It can also be written as follows:
select * from orders_history where id >= 1000001 limit 100;
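To make the page arithmetic explicit, here is a small sketch using MySQL user variables; the @page and @page_size names are made up for illustration, and it assumes ids start at 1 with no gaps:

-- page math for the between query; assumes continuous ids starting at 1
set @page := 10001;                              -- 1-based page number
set @page_size := 100;
set @start_id := (@page - 1) * @page_size + 1;   -- here: 1000001
set @end_id := @page * @page_size;               -- here: 1000100
select * from orders_history where id between @start_id and @end_id;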
Of course, you can also query with in. This is often used when several tables are related, using the id set from a query on another table:
select * from orders_history where id in (select order_id from trade_2 where goods = 'pen') limit 100;
Note that some MySQL versions do not support using limit inside the subquery of an in clause; a workaround is sketched below.
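The workaround wraps the limited query in a derived table, which MySQL does accept; a sketch using the trade_2 table from above:

-- wrapping the LIMIT subquery in a derived table avoids the restriction
select * from orders_history
where id in (
    select order_id from (
        select order_id from trade_2 where goods = 'pen' limit 100
    ) t
);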
Using temporary table optimization
This approach does not belong to query optimization proper; it is mentioned here in passing.
The id restriction optimization above requires ids to increase continuously, but in some scenarios, such as history tables or tables with missing data, that does not hold. There you can consider using a temporary table to record the ids of a page, and then run an in query against that id set (a sketch follows below). This can greatly improve the speed of traditional paging queries, especially when the data volume runs to tens of millions.
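A minimal sketch of the idea; the temporary-table name tmp_page_ids is made up for illustration:

-- record one page of ids, then fetch the full rows through that set
create temporary table tmp_page_ids (id int unsigned not null primary key);
insert into tmp_page_ids (id)
    select id from orders_history where type = 8 order by id limit 100000, 100;
select * from orders_history where id in (select id from tmp_page_ids);
drop temporary table tmp_page_ids;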
About the id field of the data table
In general, when tables are created in a database, an auto-increment id field is added to every table; it makes querying easier.
If the data volume is very large, as with an order database, databases and tables are generally split (sharded). In that case it is not recommended to use the database id as the unique identifier; instead, a distributed, highly concurrent unique-id generator should be used, with an extra field in the data table to store that identifier.
Using a range query to locate the id (or index) first, and then using the index to locate the data, improves query speed several times: first select id, then select *, as in the sketch below.
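Written as a single statement, this "first select id, then select *" pattern is often expressed as a deferred join; a sketch against the same table:

-- locate the page's ids on the index first, then join back for the full rows
select h.*
from orders_history h
join (
    select id from orders_history where type = 8 order by id limit 100000, 100
) t on h.id = t.id;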