How to check and remove the weight at one time of SQL 04/18 Update SLTechnology News&Howtos

How to check and remove the weight at one time of SQL

2025-04-18 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Development >

Shulou(Shulou.com)06/01 Report--

This article mainly shows you "how to check and remove duplicates in a SQL". The content is simple and clear. I hope it can help you solve your doubts. Let me lead you to study and learn this article "how to check duplicates and remove duplicates in a SQL".

Preface

When using SQL to retrieve data, we often encounter the need to remove duplicates when there are duplicate values in the table. This paper classifies the common methods.

⛳️1.distinct

Topic: now the operator needs to check which schools the users come from. Please take out the school's deduplicated data from the user information table.

Example: user_profile

Mysql > SELECT DISTINCT university FROM user_profile

? According to the example, the query returns the following results

? Tips:

The keyword distinct in SQL is deduplicated:

Distinct stands for a unique meaning in English.

He means to remove weight in SQL: for example, the column "university" appears twice in Peking University.

After using distinct for de-re-query, Peking University appears only once.

Distinct is generally inefficient

When distinct is in use, put it after select, and remove the duplicate values of all the subsequent fields.

? Expand:

Topic: now the operator needs to check the total number of users.

Select count (distinct university) from user_profile

⛳️2.group by

? For example, Chestnut, there is such a table task

Note:

Task_id: task id

Order_id: order id

Start_time: start Tim

Note: one task corresponds to multiple orders

Topic: list the total number of tasks

? According to the example, the query method is as follows:

Step 1: list all unique values of task_id (deduplicated record, null is also a value)

Select task_idfrom Taskgroup by task_id

Step 2: total tasks

Select count (task_id) task_numfrom (select task_idfrom Taskgroup by task_id) tmp; ⛳️3.row_number window function

? For example, Chestnut, there is such a table task

Note:

Task_id: task id

Order_id: order id

Start_time: start Tim

Note: one task corresponds to multiple orders

Topic: query the duplicate data of the whole table

? According to the example, the query method is as follows:

-used in sql that supports window functions

Select count (case when rn=1 then task_id else null end) task_numfrom (select task_id, row_number () over (partition by task_id order by start_time) rnfrom Task) tmp

? Tips:

Sorting can be realized by using three window functions ROW_NUMBER (), DENSE_RANK () and RANK () in MySQL8.0.

One thing to note is the alias after as. Do not repeat the alias with the previous function name, or you will report an error.

The following are examples of ranking achieved by these three functions:

-three sentences for the above three rankings

Select xuehao,score, ROW_NUMBER () OVER (order by score desc) as row_r from scores_tb;select xuehao,score, DENSE_RANK () OVER (order by score desc) as dense_r from scores_tb;select xuehao,score, RANK () over (order by score desc) as r from scores_tb

-one sentence can also query different rankings.

SELECT xuehao,score,ROW_NUMBER () OVER w AS 'row_r',DENSE_RANK () OVER w AS' dense_r',RANK () OVER w AS 'r'FROM scores_tbWINDOW w AS (ORDER BY score desc); delete duplicate data

Create test data

We create a personnel information table and insert some duplicate data into it.

CREATE TABLE Person (id int auto_increment primary key comment 'primary key', Name VARCHAR (20) NULL,Age INT NULL,Address VARCHAR (20) NULL,Sex CHAR (2) NULL) INSERT INTO Person (ID,Name,Age,Address,Sex) VALUES (1, 'Zhang San', 18, 'Beijing Road 18', 'male'), (2,'Li Si', 19, 'Beijing Road 29', 'male'), (3, 'Wang Wu', 19, 'Nanjing Road 11', 'female'), (4, 'Zhang San', 18, 'Beijing Road 18', 'male'), (5,'Li Si') 19, 29 Beijing Road, male), (6, Zhang San, 18, 18 Beijing Road, male), (7, Wang Wu, 19, 11 Nanjing Road, female), (8, Ma Liu, 18, 19 Nanjing Road, female)

Topic: there are duplicate records in the database. Delete and retain one of them.

We found that apart from the self-increasing ID, there are several pieces of data that are duplicated in other fields.

? Step 1: find the duplicate data mysql > SELECT MAX (ID) ID,Name,Age,Address,SexFROM PersonGROUP BY Name,Age,Address,SexHAVING COUNT (1) > 1

? Tips:

HAVING will group the number of data rows greater than 1, which is the duplicate data we are looking for.

Either the Max function or the Min function can be used above, just to ensure the uniqueness of the extracted data.

? Step 2: delete duplicate data

In fact, the last result to be retained in our database is the data queried in the second step.

We can just delete the other data.

How do I delete it? We use ID to rule out.

DELETE FROM PersonWHERE EXISTS (SELECT * FROM (SELECTMAX (ID) ID,Name,Age,Address,SexFROM PersonGROUP BY Name,Age,Address,SexHAVING COUNT (1) > 1) TWHERE Person.Name=T.NameAND Person.Age=T.AgeAND Person.Address=T.AddressAND Person.Sex=T.SexAND Person.ID

Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.

*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.