Network Security Internet Technology Development Database Servers Mobile Phone Android Software Apple Software Computer Software News IT Information

In addition to Weibo, there is also WeChat

Please pay attention

WeChat public account

Shulou

Talking about the distinct statement and group by,order by of Mysql

2025-03-30 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Database >

Share

Shulou(Shulou.com)06/01 Report--

Recently, when I was working on a project, I found that the data obtained was quite different from what I expected. After a careful study, I found that the problem lies in the distinct statement and groupy by,order by.

First, the distinct statement to get a non-repetitive (unique) line note.

Grouy by is grouping and order by is sorting.

Just look at my example.

Suppose I have a table f_job with fields:

Select job_id, com_id,job_time from f_job order by job_time desc limit 10; T e:webwebphpfhrtee.txt

+-+

| | job_id | com_id | job_time | |

+-+

| | 5060 | 2205 | 2006-09-29 16:30:11 |

| | 4707 | 19084 | 2006-09-29 16:27:55 |

| | 4708 | 19084 | 2006-09-29 16:27:55 |

| | 4709 | 19084 | 2006-09-29 16:27:55 |

| | 4710 | 19084 | 2006-09-29 16:27:55 |

| | 4711 | 19084 | 2006-09-29 16:27:55 |

| | 4859 | 19084 | 2006-09-29 16:27:55 |

| | 4918 | 19084 | 2006-09-29 16:27:55 |

| | 5059 | 2205 | 2006-09-29 16:27:22 |

| | 4078 | 2715 | 2006-09-29 16:18:36 |

+-+

10 rows in set (0.03 sec)

There are other fields that cannot affect the result. It is not listed here.

Job_id is primary key. Com_id is a foreign key, and I need to sort it by time. So you must use order by!

You can see that com_id is not unique in the results. Yes, all I need is to extract the only records of the last 10 com_id of com_id.

Based on my previous MSSQL experience, I will write the following statement to execute:

Mysql > select distinct (com_id) from f_job order by job_time desc limit 10; T e:webwebphpfhrtee.txt

+-+

| | com_id |

+-+

| | 19084 |

| | 2197 |

| | 19917 |

| | 19580 |

| | 19520 |

| | 19664 |

| | 19397 |

| | 19900 |

| | 1176 |

| | 19449 |

+-+

Yes, the result this time is the only one, but this is not right. Obviously, if we lose the record that the com_id record number is 2005, the result must be wrong. I immediately analyzed all the results and found that the ignored com_id did not disappear, but was placed at the back.

I sort by time, the first result that should appear is 2205, why can it be in the back? The result is seen from two suspicious records:

Original record:

| | 5058 | 19580 | 2006-09-29 15:23:58 |

| | 5057 | 19917 | 2006-09-29 15:14:16 |

| | 4973 | 19580 | 2006-09-29 15:13:49 |

| | 5011 | 19580 | 2006-09-29 15:13:49 |

Order after distinct:

| | 19917 |

| | 19580 |

This shows that for interlaced records that are not together, if they happen to be interlaced, they can also be compared by order by, otherwise the comparison is no longer true, because the temporary value of the last cached order by is no longer useful objectively! In another case, if the records are connected, they can also be compared.

First of all, I understand MySql's retard.

The operation is performed immediately, and the result is as follows: mysql > select distinct (com_id), job_time from f_job order by job_time desc limit 10

+-+ +

| | com_id | job_time |

+-+ +

| | 2205 | 2006-09-29 16:30:11 |

| | 19084 | 2006-09-29 16:27:55 |

| | 2205 | 2006-09-29 16:27:22 |

| | 2715 | 2006-09-29 16:18:36 |

| | 2197 | 2006-09-29 16:03:16 |

| | 19580 | 2006-09-29 15:23:58 |

| | 19917 | 2006-09-29 15:14:16 |

| | 19580 | 2006-09-29 15:13:49 |

| | 19520 | 2006-09-29 10:29:41 |

| | 19900 | 2006-09-29 10:16:48 |

+-+ +

10 rows in set (0.10 sec)

Compare it with this first:

Mysql > select com_id,job_time from f_job order by job_time desc limit 10; T e:webwebphpfhrtee.txt

+-+ +

| | com_id | job_time |

+-+ +

| | 2205 | 2006-09-29 16:30:11 |

| | 19084 | 2006-09-29 16:27:55 |

| | 19084 | 2006-09-29 16:27:55 |

| | 19084 | 2006-09-29 16:27:55 |

| | 19084 | 2006-09-29 16:27:55 |

| | 19084 | 2006-09-29 16:27:55 |

| | 19084 | 2006-09-29 16:27:55 |

| | 19084 | 2006-09-29 16:27:55 |

| | 2205 | 2006-09-29 16:27:22 |

| | 2715 | 2006-09-29 16:18:36 |

+-+ +

10 rows in set (0.06 sec)

It turns out that distinct happens to process our information contrary to the results of the first test. He treats interlaced results as different output, which leads to the reappearance of 2205 of the records.

Then I used the group by sentence, which should be OK, but it also made me. :

Mysql > select com_id from f_job group by com_id order by job_time desc limit 10

+-+

| | com_id |

+-+

| | 19084 |

| | 2197 |

| | 19917 |

| | 19580 |

| | 19520 |

| | 19664 |

| | 19397 |

| | 19900 |

| | 1176 |

| | 19449 |

+-+

10 rows in set (0.03 sec)

This is the same as the result of distinct, which is not surprising, and then I join the group job_time and take a look:

Mysql > select com_id from f_job group by com_id,job_time order by job_time desc limit 10

+-+

| | com_id |

+-+

| | 2205 |

| | 19084 |

| | 2205 |

| | 2715 |

| | 2197 |

| | 19580 |

| | 19917 |

| | 19580 |

| | 19520 |

| | 19900 |

+-+

10 rows in set (0.03 sec)

This result is very close, and then I distinct (com_id) a bit, should be OK! Take a look:

Mysql > select distinct (com_id) from f_job group by com_id,job_time order by job_time desc limit 10; T e:webwebphpfhrtee.txt

+-+

| | com_id |

+-+

| | 19084 |

| | 2197 |

| | 19917 |

| | 19580 |

| | 19520 |

| | 19664 |

| | 19397 |

| | 19900 |

| | 1176 |

| | 19449 |

+-+

10 rows in set (0.04 sec)

Khan, there is no difference between this and no group by, how so weak! No way, what to do, hesitant.

The above question thoroughly illustrates the fact that distinct can only return its target field, not other fields.

Khan, there is no difference between this and no group by, how so weak! No way, what to do, hesitant.

-

Think of a very coquettish person-- phzzy, this jh,qq said, sure enough, he played php, immediately asked for help, after more than an hour of hard YY, finally this bird man ahead of me to give a sentence:

Select (`job_ id`), max (`job_ time`) from `f_ job`GROUP BY `com_ id` order by max (`job_ time`) limit 10

Mdgb, I finally got it. I got it as soon as I got this sentence.

I tmd know that this is twice sort, md,group by is once, and then no matter what, it is impossible to sort twice, because the second time must rely on the internal agglomeration function. Why didn't I think of max? it pisses me off [@ more@]

Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.

Views: 237

*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.

Share To

Database

Wechat

© 2024 shulou.com SLNews company. All rights reserved.

12
Report