2025-02-24 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)05/31 Report--
This article explains how to solve the problem of duplicated rows when paging with LIMIT and ORDER BY in MySQL.
0 Problem description
In MySQL, paging is usually done with LIMIT. For example, LIMIT 0, 10 returns the 10 rows of the first page and LIMIT 10, 10 returns the second page. However, when LIMIT is combined with ORDER BY, a row from the first page may show up again on the second page.
The details are as follows:
SELECT `post_title`, `post_date` FROM post WHERE `post_status` = 'publish' ORDER BY view_count DESC LIMIT 5, 5
With this query, the result very likely contains a row that already appeared in the result of LIMIT 0, 5. The following form did not produce duplicates:
SELECT * FROM post WHERE post_status = 'publish' ORDER BY view_count DESC LIMIT 5, 5
However, the post table has many columns. Only these two are needed, and there is no reason to fetch post_content. The fix is to give ORDER BY two sort criteria:
SELECT `post_title`, `post_date` FROM post WHERE `post_status` = 'publish' ORDER BY view_count DESC, ID ASC LIMIT 5, 5
One might expect MySQL to fall back on the primary key ID when view_count values are equal, which would make the extra ID ASC unnecessary. In practice, when ORDER BY is combined with LIMIT, the relative order of rows with equal sort values is not guaranteed.
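The effect of the second sort criterion can be checked with a small script. The sketch below is only an illustration: it uses Python's built-in sqlite3 module as a stand-in for MySQL, and the table contents, the helper names build_demo_db and fetch_page, and the page size are assumptions made for the example. Because the pair (view_count, id) is unique for every row, the ORDER BY allows exactly one order, so consecutive pages cannot share a row whichever sort algorithm the engine picks.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory table with many tied view_count values (made-up data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE post ("
        "id INTEGER PRIMARY KEY, post_title TEXT, "
        "post_status TEXT, view_count INTEGER)"
    )
    # Odd ids get 100 views, even ids get 50, so every value is heavily tied.
    rows = [(i, f"post {i}", "publish", 100 if i % 2 else 50) for i in range(1, 13)]
    conn.executemany("INSERT INTO post VALUES (?, ?, ?, ?)", rows)
    return conn


def fetch_page(conn, page, page_size=5):
    """Return one page, ordered by view_count with id as the tie-breaker."""
    cur = conn.execute(
        "SELECT id, post_title, view_count FROM post "
        "WHERE post_status = 'publish' "
        "ORDER BY view_count DESC, id ASC "
        "LIMIT ? OFFSET ?",
        (page_size, page * page_size),
    )
    return cur.fetchall()


conn = build_demo_db()
page1_ids = [row[0] for row in fetch_page(conn, 0)]
page2_ids = [row[0] for row in fetch_page(conn, 1)]
print(page1_ids, page2_ids)
```

With this data the first page holds ids 1, 3, 5, 7, 9 and the second page holds ids 11, 2, 4, 6, 8, with no id on both pages.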
1 Analyzing the problem
In MySQL 5.6, the optimizer applies an optimization to ORDER BY ... LIMIT statements: it uses a priority queue.
The priority queue is used when the sort cannot be satisfied from an index and the statement has LIMIT n. In that case only n rows need to be kept during the sort. This does not avoid examining every candidate row, but the sort needs only a small amount of memory in the sort buffer.
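The idea of keeping only n rows can be sketched with a bounded min-heap. This is a minimal illustration using Python's heapq module, not MySQL's implementation; the function name top_n and the sample view counts are assumptions made for the example.

```python
import heapq


def top_n(values, n):
    """Return the n largest values while holding at most n of them in memory."""
    heap = []  # min-heap: heap[0] is the smallest value currently kept
    for v in values:
        if len(heap) < n:
            heapq.heappush(heap, v)
        elif v > heap[0]:
            # v beats the weakest kept value, so it takes that value's place.
            heapq.heapreplace(heap, v)
    return sorted(heap, reverse=True)


view_counts = [12, 7, 30, 7, 18, 25, 3, 18]  # made-up sample data
top3 = top_n(view_counts, 3)
print(top3)  # [30, 25, 18]
```

Every value is still examined once, but the heap never grows beyond n entries, which is why a small sort buffer is enough.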
MySQL 5.6 can return duplicate rows on the second page because the priority queue is implemented with heap sort, and heap sort is not stable: rows with equal sort values may come out in a different order from the one in which they were read.
MySQL 5.5 does not have this optimization, so the problem described in this article does not occur there. It first appears in version 5.6.
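Why heap sort is unstable can be seen on three rows. The toy heap sort below is written for illustration only and is not MySQL's filesort code; the row layout (id, view_count), the sample rows, and the function name heap_sort_desc are assumptions made for the example.

```python
def heap_sort_desc(rows, key):
    """Sort rows by key, largest first, using a textbook in-place heap sort."""
    a = list(rows)

    def sift_down(i, size):
        # Push a[i] down until neither child has a smaller key (min-heap).
        while True:
            child = 2 * i + 1
            if child >= size:
                return
            right = child + 1
            if right < size and key(a[right]) < key(a[child]):
                child = right
            if key(a[child]) < key(a[i]):
                a[i], a[child] = a[child], a[i]
                i = child
            else:
                return

    n = len(a)
    for i in range(n // 2 - 1, -1, -1):  # build the min-heap
        sift_down(i, n)
    for end in range(n - 1, 0, -1):  # move the current minimum to the back
        a[0], a[end] = a[end], a[0]
        sift_down(0, end)
    return a


rows = [(1, 7), (2, 7), (3, 9)]  # (id, view_count); ids 1 and 2 are tied

unstable = [r[0] for r in heap_sort_desc(rows, key=lambda r: r[1])]
stable = [r[0] for r in sorted(rows, key=lambda r: r[1], reverse=True)]
# Tie-breaker: larger view_count first, then smaller id first.
with_tiebreak = [r[0] for r in heap_sort_desc(rows, key=lambda r: (r[1], -r[0]))]
print(unstable, stable, with_tiebreak)
```

On these rows the heap sort returns ids 3, 2, 1, while Python's stable sorted returns 3, 1, 2: the two tied rows have swapped places. Once id is part of the key there are no ties left, and the heap sort returns 3, 1, 2 as well.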
Next, look at the order in which the clauses of a SQL statement are written:
(1) SELECT
(2) DISTINCT
(3) FROM
(4) JOIN
(5) ON
(6) WHERE
(7) GROUP BY
(8) HAVING
(9) ORDER BY
(10) LIMIT
The order of execution is FROM ... WHERE ... SELECT ... ORDER BY ... LIMIT. Because of the priority queue described above, once the SELECT step is complete the rows are sorted with heap sort, and ORDER BY only moves the rows with larger view_count values toward the front.
Because of LIMIT, only 5 rows need to be kept during the sort, and view_count has no index that would supply an order. When the second page is requested, MySQL therefore keeps whichever tied rows it happens to encounter. Rows with equal sort values are placed arbitrarily, so running the statement again for the next page can return some of the rows that the first page already showed.
2 Solutions
1. Index the sort field
If the sort field has an index, rows are read and paged directly in index order, which avoids the problem.
2. Understand paging correctly
Paging is built on top of sorting: the sorted result is cut into ranges of a fixed size. Sorting is a function the database provides, while paging is an application requirement derived from it.
The official MySQL and Oracle documentation describe LIMIT n and rownum < n, but neither clearly defines the concept of paging.
Note also that although the solutions above reduce the problem, they do not fully match what users expect from paging. For example, if rows are inserted into the table frequently, the first and second pages can still overlap under the READ COMMITTED isolation level.
Paging has therefore always had this limitation, and how much accuracy it needs depends on the scenario.
3 Some common database sorting problems
Ordering without ORDER BY
Users of Oracle and MySQL often notice that MySQL results always seem to be in order while Oracle results look unordered. This is mainly because Oracle stores rows in heap tables, while MySQL stores them in index-organized (clustered) tables. Even so, without ORDER BY the database does not guarantee the order in which rows are returned, nor that the order is the same on every execution.
Duplicate rows across pages
As described earlier, paging is an application requirement built on the sorting function that the database provides, and the database does not guarantee that pages are free of duplicates.
NULL values and empty strings
Databases differ in how they treat NULL values and empty strings. In Oracle, NULL cannot be compared with NULL: the result is neither equal nor unequal, but unknown. When an empty string is inserted, MySQL stores a string of length 0, while Oracle treats it as NULL.
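The comparison rule for NULL can be observed directly. This sketch uses Python's built-in sqlite3 module as a stand-in, which is an assumption of the example. For these four expressions SQLite applies the same three-valued logic as MySQL, and it does not reproduce Oracle's treatment of an empty string as NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# NULL = NULL is unknown (returned as None), while an empty string is an
# ordinary value: it equals itself, has length 0, and is not NULL.
null_eq, empty_eq, empty_len, empty_is_null = conn.execute(
    "SELECT NULL = NULL, '' = '', length(''), '' IS NULL"
).fetchone()

print(null_eq, empty_eq, empty_len, empty_is_null)  # None 1 0 0
```

A query that needs to match NULL values therefore has to use IS NULL rather than the = operator.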
That is how to solve the problem of duplicated data when paging with LIMIT and ORDER BY in MySQL. If you run into a similar problem, the analysis above should help you understand it.