How to analyze problems caused by DISTINCT

2025-07-02 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)05/31 Report--

This article looks at a problem that comes up when DISTINCT is combined with ORDER BY, and works through the analysis and a solution in detail, in the hope that it offers a simple and workable approach to anyone facing the same problem.

Someone raised the following question, and I have written it up here for reference.

Suppose there is a table named Sample with two columns, Name and DepartmentId.

The data has the following characteristic: one DepartmentId may be associated with several Name values, and one Name with several DepartmentId values. In other words, Name and DepartmentId are in a many-to-many relationship.

Now we want a query that returns the distinct values of the Name column in the order in which they first appear once the rows are sorted by DepartmentId. That is, sort by DepartmentId first (step 1), then remove the duplicate names while keeping the relative order from step 1. For the sample data, the query should return three values: A, C, B.
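To follow along, the table can be populated with a few rows. The rows below are hypothetical (the article does not list its data) and were chosen only so that the requirement above yields A, C, B. The column types are likewise assumptions.

```sql
-- Hypothetical sample data; the types and rows are assumptions,
-- not the article's own table contents.
CREATE TABLE Sample (
    Name         varchar(10) NOT NULL,
    DepartmentId int         NOT NULL
);

INSERT INTO Sample (Name, DepartmentId) VALUES ('A', 1);
INSERT INTO Sample (Name, DepartmentId) VALUES ('C', 2);
INSERT INTO Sample (Name, DepartmentId) VALUES ('B', 3);
INSERT INTO Sample (Name, DepartmentId) VALUES ('A', 3);
INSERT INTO Sample (Name, DepartmentId) VALUES ('C', 4);
```

With these rows, sorting by DepartmentId makes the names first appear in the order A (department 1), C (department 2), B (department 3).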

The first query that comes to mind is:

SELECT DISTINCT Name FROM Sample ORDER BY DepartmentId

Semantically, this reads naturally. Unfortunately, SQL Server refuses to execute it. The error (Msg 145) reads: ORDER BY items must appear in the select list if SELECT DISTINCT is specified.

In other words, when DISTINCT is used, every column in ORDER BY must also appear in the SELECT list. But if we add DepartmentId to the SELECT list, as in the statement below, DISTINCT applies to the (Name, DepartmentId) pair, so repeated Name values are no longer removed and the result is wrong.

SELECT DISTINCT Name, DepartmentId FROM Sample ORDER BY DepartmentId

So, since DISTINCT and ORDER BY cannot be combined directly, can we rework the query, for example like this:

SELECT DISTINCT a.Name FROM (SELECT TOP 100 PERCENT Name FROM Sample ORDER BY DepartmentId) a

Compared with the previous attempt, this one uses a subquery. Semantically it is still intuitive and clear: sort by DepartmentId first, then remove the duplicate values. But the result is not what we want.

The duplicate values are indeed removed, but the names come back in the wrong order. We want to sort by DepartmentId, then remove the duplicates, and keep the relative order produced by the sort.

Why does this happen? Because DISTINCT is itself carried out with a sort on the selected column (as the execution plan for this query shows), so the ORDER BY in the subquery loses its meaning. More generally, SQL Server only guarantees the order of a result set when the outermost query has an ORDER BY; an ORDER BY inside a derived table, even with TOP 100 PERCENT, does not determine the final order. [In fact, if you look at a similar query generated by an ORM tool such as ADO.NET Entity Framework, it automatically discards the ORDER BY setting.]

So is the requirement impossible to meet? It is admittedly a rare requirement, and most of the time it is reasonable for DISTINCT to sort as its last step.

I wondered whether this built-in behavior of DISTINCT could be bypassed. The solution I settled on was to number the rows of each Name: if there are two A's, the first A is numbered 1, the second A is numbered 2, and so on. Then, when querying, I sort first and keep only the rows numbered 1, which in effect removes the duplicate values.

SQL Server 2005 introduced the ROW_NUMBER function, and with it I wrote the following query:

SELECT a.Name
FROM (
    SELECT TOP 100 PERCENT Name, DepartmentId,
           ROW_NUMBER() OVER (PARTITION BY Name ORDER BY DepartmentId) AS row
    FROM Sample
    ORDER BY DepartmentId
) a
WHERE a.row = 1
ORDER BY a.DepartmentId

This returned the result I had reasoned out, which matches the requirement stated earlier.
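To see why keeping only row number 1 removes the duplicates, it helps to look at the numbering before the filter is applied. The sketch below uses hypothetical inline rows (a CTE named Sample that stands in for the real table) and the alias rn; both are assumptions made for illustration.

```sql
-- Hypothetical rows standing in for the Sample table (assumption).
WITH Sample (Name, DepartmentId) AS (
    SELECT 'A', 1 UNION ALL
    SELECT 'C', 2 UNION ALL
    SELECT 'B', 3 UNION ALL
    SELECT 'A', 3 UNION ALL
    SELECT 'C', 4
)
-- Number the rows of each Name, starting from its smallest DepartmentId.
SELECT Name,
       DepartmentId,
       ROW_NUMBER() OVER (PARTITION BY Name ORDER BY DepartmentId) AS rn
FROM Sample
ORDER BY DepartmentId, Name;

-- Name  DepartmentId  rn
-- A     1             1
-- C     2             1
-- A     3             2
-- B     3             1
-- C     4             2
```

Keeping only the rows with rn = 1 leaves A (1), C (2) and B (3), and ordering those by DepartmentId gives A, C, B.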

By comparison, this query is less efficient, which is to be expected (the execution plan hints at this). But if the requirement is firm, sacrificing some performance is not surprising. Of course, it is worth studying further to see whether there is a better way to write it. In any case, implementations that rely on built-in standard features are usually relatively fast.
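One candidate for a simpler formulation, offered as a sketch rather than as the author's method, is to group by Name and order the groups by the smallest DepartmentId in which each name appears. It assumes the same Sample table with Name and DepartmentId columns.

```sql
-- Alternative sketch (not from the article): one row per Name,
-- ordered by the smallest DepartmentId in which that Name appears.
SELECT Name
FROM Sample
GROUP BY Name
ORDER BY MIN(DepartmentId);
```

If two names share the same smallest DepartmentId, their relative order is unspecified unless a tie-breaker such as Name is added to the ORDER BY. Whether this is faster than the ROW_NUMBER version depends on the data and indexes, so it is worth comparing the execution plans before relying on it.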

