Network Security Internet Technology Development Database Servers Mobile Phone Android Software Apple Software Computer Software News IT Information

In addition to Weibo, there is also WeChat

Please pay attention

WeChat public account

Shulou

How to query a record randomly from MySQL with high efficiency

2025-04-16 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Database >

Share

Shulou(Shulou.com)05/31 Report--

This article mainly explains "how to query a record randomly from MySQL with high efficiency". Interested friends may wish to have a look at it. The method introduced in this paper is simple, fast and practical. Let's let the editor take you to learn how to query a record randomly from MySQL with high efficiency.

Write at the front

MySQL database is widely used in the Internet industry, and some partners may think that MySQL database is too small to store a lot of data. In fact, these little buddies really don't know MySQL. The small size of MySQL does not mean that it uses MySQL to store less data, but that it is small and lightweight. MySQL can be used to store hundreds of billions of levels of data. I will share with my friends how to use MySQL to store hundreds of billions of levels of data in a later article. Or my friends can book my new book "MySQL Technology: development, Optimization and Operation and maintenance" in advance. All right, with all that said, today I'd like to share with you a classic interview question about MySQL: how to query a record randomly from MySQL with the highest efficiency?

Interview questions

How to query a random record from a data table in MySQL while ensuring the highest efficiency.

From this point of view, it actually contains two requirements, the first of which is to query a random record from the MySQL data table. The second requirement is to ensure the highest efficiency.

Next, let's try a variety of ways to query data from the MySQL data table.

Method one

This is the most primitive and intuitive syntax, as follows:

SELECT * FROM foo ORDER BY RAND () LIMIT 1

This method is feasible when the amount of data in the data table is small. However, when the amount of data reaches a certain level, such as 1 million data or more, there are great performance problems. If you analyze this statement through EXPLAIN, you will find that although MySQL sorts by creating a temporary table, due to the characteristics of ORDER BY and LIMIT, we still can not get the required records through LIMIT until the sorting is completed. In other words, how many records you have, you must first sort the data.

Method two

It seems that for random data extraction with a large amount of data, the crux of the performance lies in ORDER BY, so how to avoid it? Method 2 provides a solution.

First, get the number of all records in the data table:

SELECT count (*) AS num_rows FROM foo

Then, record the total number of records (assumed to be num_rows) through the corresponding daemon.

Then execute:

SELECT * FROM foo LIMIT [a random number between 0 and num_rows], 1

The above random number can be obtained through the background program. The premise of this approach is that the ID of the table is continuous or self-growing.

This method has successfully avoided the generation of ORDER BY.

Method three

Is it possible to implement method two with a SQL statement instead of ORDER BY? Yes, that's using JOIN.

SELECT * FROM Bar B JOIN (SELECT CEIL (MAX (ID) * RAND () AS ID FROM Bar) AS m ON B.ID > = m.ID LIMIT 1

This method achieves our goal, and at the same time, in the case of a large amount of data, it also avoids the sorting process of all records caused by ORDER BY, because the SELECT statement in JOIN is actually executed only once, not N times (N equals num_rows in method 2). Moreover, we can add a "greater than" symbol to the filter statement and avoid the phenomenon that the record is empty because of the discontinuity of ID.

To query 5 non-duplicated data in MySQL, use the following:

SELECT * FROM `table` ORDER BY RAND () LIMIT 5

Just do it. But after a real test, it was found that the efficiency was very low. For a library of more than 150000 items, it takes more than 8 seconds to query 5 pieces of data.

Search Google, the Internet is basically to query max (id) * rand () to get data at random.

SELECT * FROM `table` AS T1 JOIN (SELECT ROUND (RAND () * (SELECT MAX (id) FROM `table`) AS id) AS T2 WHERE t1.id > = t2.id ORDER BY t1.id ASC LIMIT 5

But this will result in five consecutive records. The solution can only be queried one at a time, 5 times. Even so, it is worth it, because of the 150000 tables, the query takes less than 0.01s.

The above sentence is used by someone on the forum of JOIN,mysql.

SELECT * FROM `table` WHERE id > = (SELECT FLOOR (MAX (id) * RAND ()) FROM `table`) ORDER BY id LIMIT 1

I tested it, and it took 0.5 seconds, and the speed was good, but there was still a big gap with the above statement. I always feel that there is something wrong.

So I rewrote the sentence.

SELECT * FROM `table` WHERE id > = (SELECT floor (RAND () * (SELECT MAX (id) FROM `table`) ORDER BY id LIMIT 1

In this case, the efficiency has been improved again, and the query time is only 0.01 seconds.

Finally, refine the sentence and add the judgment of MIN (id). At the beginning of the test, because I didn't add the MIN (id) judgment, I always queried the first few rows in the table half the time.

The complete query statement is:

SELECT * FROM `table` WHERE id > = (SELECT floor (RAND () * ((SELECT MAX (id) FROM `table`)-(SELECT MIN (id) FROM `table`)) + (SELECT MIN (id) FROM `table`)) ORDER BY id LIMIT 1; SELECT * FROM `table`AS T1 JOIN (SELECT ROUND (RAND () * (SELECT MAX (id) FROM `table`)-(SELECT MIN (id) FROM `table`) + (SELECT MIN (id) FROM `table`) AS id) AS T2 WHERE t1.id > = t2.id ORDER BY t1.id LIMIT 1

Finally, the two statements are queried 10 times respectively.

The former takes 0.147433 seconds and the latter 0.015130 seconds.

It seems that using the syntax of JOIN is much more efficient than using functions directly in WHERE.

At this point, I believe you have a deeper understanding of "how to query a record randomly from MySQL with high efficiency". You might as well do it in practice. Here is the website, more related content can enter the relevant channels to inquire, follow us, continue to learn!

Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.

Views: 0

*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.

Share To

Database

Wechat

© 2024 shulou.com SLNews company. All rights reserved.

12
Report