How to solve the problem of data duplication when using limit+order by in MySQL paging

2025-02-24 Update From: SLTechnology News&Howtos

Shulou (Shulou.com) 05/31

This article explains how to solve the problem of duplicated data when paging with LIMIT + ORDER BY in MySQL. It describes the symptom, analyzes the cause, and walks through the available fixes.

0 Problem description

In MySQL, we usually use LIMIT for paging. For example, LIMIT 0, 10 lists the 10 rows of the first page, and LIMIT 10, 10 lists the second page. However, when LIMIT is combined with ORDER BY, a record from the first page may show up again on the second page.

The details are as follows:

SELECT `post_title`, `post_date` FROM post WHERE `post_status` = 'publish' ORDER BY view_count DESC LIMIT 5, 5

This query is quite likely to return a record that also appeared in the LIMIT 0, 5 result. Written as follows, the duplicate does not appear:

SELECT * FROM post WHERE post_status = 'publish' ORDER BY view_count DESC LIMIT 5, 5

However, the post table has many fields, and I only want these two; in particular I do not want to fetch post_content. The fix is to give ORDER BY two sort criteria, as follows:

SELECT `post_title`, `post_date` FROM post WHERE `post_status` = 'publish' ORDER BY view_count DESC, ID ASC LIMIT 5, 5

One might expect MySQL to fall back on the primary key ID when view_count values are equal, which would make the extra ID ASC unnecessary. In fact MySQL makes no such guarantee: the order of rows with equal sort values is unspecified, and when ORDER BY is combined with LIMIT it can differ from one query to the next.
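
The two-criteria approach can be sketched for consecutive pages as follows. The table definition is a hypothetical minimal schema: only the column names come from the queries above, while the types and the assumption that ID is the unique primary key are illustrative.

```sql
-- Hypothetical minimal schema; column types are assumptions.
CREATE TABLE post (
  ID           INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  post_title   VARCHAR(200) NOT NULL,
  post_date    DATETIME     NOT NULL,
  post_status  VARCHAR(20)  NOT NULL,
  view_count   INT UNSIGNED NOT NULL DEFAULT 0,
  post_content LONGTEXT
);

-- Page 1: offset 0, page size 5.
-- Ties on view_count are broken by the unique ID, so the order is total.
SELECT `post_title`, `post_date`
FROM post
WHERE `post_status` = 'publish'
ORDER BY view_count DESC, ID ASC
LIMIT 0, 5;

-- Page 2: offset = (page - 1) * page size = 5.
SELECT `post_title`, `post_date`
FROM post
WHERE `post_status` = 'publish'
ORDER BY view_count DESC, ID ASC
LIMIT 5, 5;
```

Because the last sort column is unique, no two rows compare as equal, so both statements see the same overall order as long as the underlying data does not change between them.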

1 Analyzing the problem

In MySQL 5.6, the optimizer applies an optimization when it encounters an ORDER BY ... LIMIT statement: it uses a priority queue.

The priority queue is meant for the case where the sort cannot be satisfied by an index. If LIMIT n is used, only n records need to be kept during the sort. Every record still has to be read and compared, so the sorting work does not go away, but the sort can finish using only a small amount of memory in the sort buffer.

MySQL 5.6 returns duplicate data on the second page because the priority queue is implemented as a heap sort, and heap sort is not a stable sorting method: rows with the same sort value may come out in a different order from the one in which they were read.

MySQL 5.5 does not have this optimization, so the problem described in this article does not occur there. It appears only from version 5.6 onward.
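
One way to check whether the priority queue was actually chosen for a given statement is the optimizer trace, which is available from MySQL 5.6. The following is a sketch that reuses the query from the problem description; the exact layout of the trace output varies between versions.

```sql
-- Turn on optimizer tracing for the current session (MySQL 5.6 and later).
SET optimizer_trace = 'enabled=on';

-- Run the statement to be examined.
SELECT `post_title`, `post_date`
FROM post
WHERE `post_status` = 'publish'
ORDER BY view_count DESC
LIMIT 5, 5;

-- Read the trace of that statement. Look for the
-- "filesort_priority_queue_optimization" entry and its "chosen" flag.
SELECT TRACE FROM information_schema.OPTIMIZER_TRACE;

-- Turn tracing off again.
SET optimizer_trace = 'enabled=off';
```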

Next, look at the order in which the clauses of a SQL statement are written:

(1) SELECT

(2) DISTINCT

(3) FROM

(4) JOIN

(5) ON

(6) WHERE

(7) GROUP BY

(8) HAVING

(9) ORDER BY

(10) LIMIT

The logical order of execution is FROM ... WHERE ... SELECT ... ORDER BY ... LIMIT. Because of the priority queue described above, once the SELECT step has produced its rows they are passed through a heap sort, and ORDER BY only moves rows with larger view_count values toward the front.

However, because of LIMIT, only the leading records (offset plus page size) need to be retained during the sort, and view_count has no index to supply an order. Among rows with the same view_count, MySQL therefore keeps whichever ones it happens to reach, and rows with equal sort values are arranged arbitrarily. Running the same SQL a second time should give the same result as the first run, but the statements for page 1 and page 2 have different LIMIT values, so the tied rows can be arranged differently for each page and one row can land on both.

2 Solutions

1. Index the sort field

If the sort field has an index, rows are read in index order and paged directly from it, with no separate sort step, so the problem is avoided.
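
A minimal sketch of this approach, assuming the post table from the earlier queries. The index name idx_view_count is hypothetical.

```sql
-- Add an index on the sort column (index name is illustrative).
ALTER TABLE post ADD INDEX idx_view_count (view_count);

-- Inspect the plan. If the Extra column no longer shows "Using filesort",
-- rows are read in index order instead of being sorted.
EXPLAIN
SELECT `post_title`, `post_date`
FROM post
WHERE `post_status` = 'publish'
ORDER BY view_count DESC
LIMIT 5, 5;
```

Whether the optimizer actually picks the index depends on the data and on the WHERE clause, so the EXPLAIN check is worth running rather than assuming the index is used.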

2. Understand paging correctly

Paging is built on top of sorting: the sorted result is cut into ranges of a fixed size. Sorting is a function provided by the database, while paging is an application requirement derived from it.

The official documentation of MySQL and Oracle describes LIMIT n and rownum < n respectively, but neither gives a precise definition of paging.

Note also that although the solutions above ease the problem, they do not remove it from the user's point of view. For example, if rows are inserted into the table frequently, the first and second pages can still overlap when the user queries under the READ COMMITTED isolation level.

Paging has therefore always had this problem, and how much accuracy is required of paged data depends on the scenario; many scenarios do not need it to be very high.
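
One commonly used way to reduce overlap caused by concurrent inserts is keyset (seek) pagination. It is not part of the analysis above and is shown here only as a hedged sketch: instead of an offset, the query continues from the last row of the previous page. The user variables and their values are hypothetical placeholders that the application would fill in.

```sql
-- Sort values of the last row shown on the previous page (example values).
SET @last_view_count = 120, @last_id = 42;

-- Fetch the next page: rows that sort strictly after the last row seen
-- under ORDER BY view_count DESC, ID ASC.
SELECT ID, `post_title`, `post_date`, view_count
FROM post
WHERE `post_status` = 'publish'
  AND (view_count < @last_view_count
       OR (view_count = @last_view_count AND ID > @last_id))
ORDER BY view_count DESC, ID ASC
LIMIT 5;
```

This keeps newly inserted rows from shifting the page boundary, but it does not help if view_count itself changes between requests, and it only supports moving page by page rather than jumping to an arbitrary page number.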

3. Some common database sorting problems

Row order without ORDER BY

Users of Oracle and MySQL often find that MySQL results always look ordered while Oracle results look scrambled. This is mainly because Oracle tables are heap-organized, whereas MySQL tables are index-organized (clustered). Even so, when there is no ORDER BY, the database does not guarantee the order in which records are returned, nor that the order is the same every time.

Duplicates across pages

As described earlier, paging is an application requirement derived from the sorting function provided by the database, and the database does not guarantee that pages are free of duplicates.

NULL values and empty strings

Different databases understand and handle NULL values and empty strings differently. In Oracle, for example, NULL cannot be compared with any value, including another NULL: the result is neither equal nor unequal, but unknown. As for empty strings, on insert MySQL stores an empty string of length 0, while Oracle treats it as NULL.
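
The NULL and empty-string behavior on the MySQL side can be observed with a few expressions. This is a small sketch with the results noted in comments; the final, commented-out statement is Oracle syntax and is included only for contrast.

```sql
-- MySQL: comparing NULL with = or <> yields NULL (unknown), not true or false.
SELECT NULL = NULL;     -- NULL
SELECT NULL <> NULL;    -- NULL

-- MySQL: test for NULL with IS NULL, or use the NULL-safe operator <=>.
SELECT NULL IS NULL;    -- 1
SELECT NULL <=> NULL;   -- 1

-- MySQL: an empty string is a real value of length 0, not NULL.
SELECT '' IS NULL;      -- 0
SELECT LENGTH('');      -- 0

-- Oracle, for contrast: the empty string is treated as NULL, so this returns 'null'.
-- SELECT CASE WHEN '' IS NULL THEN 'null' ELSE 'not null' END FROM dual;
```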

That is how to solve the problem of data duplication when using LIMIT + ORDER BY in MySQL paging. If you run into similar doubts, the analysis above should help you understand them.
