
How to deal with duplicate data in MySQL database

2025-07-19 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)05/31 Report--

This article explains how to deal with duplicate data in a MySQL database: preventing it when the table is designed, ignoring or replacing it at insert time, filtering it out when reading, and deleting it after the fact.

Method 1: prevent duplicate data

When designing the table, define a UNIQUE index on the columns that must not repeat, so that MySQL rejects a duplicate at insert time. You can also declare those columns as the PRIMARY KEY; the effect on duplicates is the same, although primary key columns must additionally be NOT NULL. Let's look at a case: the following table has no index and no primary key, so it allows multiple duplicate records.

CREATE TABLE student (first_name CHAR(20), last_name CHAR(20), sex CHAR(10));

Currently, the same first_name and last_name combination can appear more than once. If you don't want that, here are two solutions:

1. Use a composite primary key

CREATE TABLE student (first_name CHAR(20) NOT NULL, last_name CHAR(20) NOT NULL, sex CHAR(10), PRIMARY KEY (last_name, first_name));

Duplicate data cannot be inserted now.

2. Add a unique index

CREATE TABLE student (first_name CHAR(20) NOT NULL, last_name CHAR(20) NOT NULL, sex CHAR(10), UNIQUE (last_name, first_name));

The two forms look a little different, but both enforce uniqueness on (last_name, first_name). If we now try to insert the same pair twice, the second insert fails with a duplicate-entry error. We can also verify the result in the database:

SELECT COUNT(*) AS repetitions, last_name, first_name FROM student GROUP BY last_name, first_name HAVING repetitions > 1;

Here we count how many times each (last_name, first_name) pair occurs and keep only the pairs that occur more than once. With either of the two constraints above in place, the query returns no rows.
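The behaviour described above can be tried without a MySQL server. The following is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in, which is an assumption of this example and not part of the original article. The UNIQUE constraint and the GROUP BY check are written the same way in both databases, but the exception type (sqlite3.IntegrityError) is specific to SQLite; a MySQL client would report a duplicate-entry error instead.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student ("
    " first_name CHAR(20) NOT NULL,"
    " last_name CHAR(20) NOT NULL,"
    " sex CHAR(10),"
    " UNIQUE (last_name, first_name))"
)

insert_sql = "INSERT INTO student (last_name, first_name) VALUES (?, ?)"
conn.execute(insert_sql, ("Zhang San", "Li Si"))

# The second, identical insert violates the UNIQUE constraint.
try:
    conn.execute(insert_sql, ("Zhang San", "Li Si"))
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# Verification query: list every name pair stored more than once.
duplicate_groups = conn.execute(
    "SELECT COUNT(*) AS repetitions, last_name, first_name "
    "FROM student GROUP BY last_name, first_name "
    "HAVING COUNT(*) > 1"
).fetchall()

print(duplicate_rejected, duplicate_groups)
```

The verification query comes back empty because the constraint never let the second copy in.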

Method 2: specify whether duplicate data can be inserted at insert time

Here we compare the INSERT IGNORE INTO and INSERT INTO statements.

(1) INSERT IGNORE INTO skips a row that would duplicate an existing PRIMARY KEY or UNIQUE value and inserts it only if no such row exists. The data already in the table is kept unchanged, and only the missing rows are filled in.

(2) INSERT INTO, on the other hand, inserts the row without this check: if the table has no unique constraint the duplicate is stored, and if it has one the statement fails with an error.

Let's give an example:

INSERT IGNORE INTO student (last_name, first_name) VALUES ('Zhang San', 'Li Si');
-- result: Query OK, 1 row affected (0.00 sec)
INSERT IGNORE INTO student (last_name, first_name) VALUES ('Zhang San', 'Li Si');
-- result: Query OK, 0 rows affected (0.00 sec)

Assuming the unique key from Method 1 is in place, the first statement inserts a new record, so one row is affected. By the time the second statement runs, the table already contains the same entry, so nothing is inserted and 0 rows are affected.

Another statement with a similar purpose is REPLACE INTO. If a record with the same PRIMARY KEY or UNIQUE value already exists, it is deleted first and the new record is then inserted, so the new data replaces the old instead of being skipped.
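The difference between ignoring and replacing is easiest to see when the duplicate row carries a different value in a non-key column. The sketch below again uses Python's sqlite3 module as a stand-in for MySQL, which is an assumption of this example. SQLite spells the first statement INSERT OR IGNORE INTO where MySQL uses INSERT IGNORE INTO, while REPLACE INTO is accepted by both. The sex values are hypothetical sample data.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student ("
    " first_name CHAR(20) NOT NULL,"
    " last_name CHAR(20) NOT NULL,"
    " sex CHAR(10),"
    " PRIMARY KEY (last_name, first_name))"
)

def row_count():
    """Return the number of rows currently in student."""
    return conn.execute("SELECT COUNT(*) FROM student").fetchone()[0]

# SQLite spelling; the MySQL equivalent is INSERT IGNORE INTO.
insert_ignore = (
    "INSERT OR IGNORE INTO student (last_name, first_name, sex) "
    "VALUES (?, ?, ?)"
)

conn.execute(insert_ignore, ("Zhang San", "Li Si", "male"))
count_after_first = row_count()

# Same key, different sex: the row is skipped and the old data is kept.
conn.execute(insert_ignore, ("Zhang San", "Li Si", "female"))
count_after_second = row_count()
sex_after_ignore = conn.execute("SELECT sex FROM student").fetchone()[0]

# REPLACE INTO removes the conflicting row and inserts the new one.
conn.execute(
    "REPLACE INTO student (last_name, first_name, sex) VALUES (?, ?, ?)",
    ("Zhang San", "Li Si", "female"),
)
rows_after_replace = conn.execute(
    "SELECT last_name, first_name, sex FROM student"
).fetchall()

print(count_after_first, count_after_second, sex_after_ignore, rows_after_replace)
```

In both cases the table ends up with a single row for the key; only the choice of which version survives differs.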

Method 3: filter duplicate data

If you need to read non-duplicated data, you can use the DISTINCT keyword in the SELECT statement to filter the duplicate data.

SELECT DISTINCT last_name, first_name FROM student ORDER BY last_name;

You can also use GROUP BY to read non-duplicated data from the table:

SELECT last_name, first_name FROM student GROUP BY last_name, first_name;
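As a quick check that both forms return one row per name pair, here is a small sketch run against Python's sqlite3 module, used as a stand-in for MySQL (an assumption of this example). The sample names are hypothetical, and an ORDER BY is added to the GROUP BY query only so that the two results can be compared directly.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
# No key and no unique index, so duplicates are allowed.
conn.execute(
    "CREATE TABLE student (first_name CHAR(20), last_name CHAR(20), sex CHAR(10))"
)
conn.executemany(
    "INSERT INTO student (last_name, first_name) VALUES (?, ?)",
    [("Zhang", "San"), ("Zhang", "San"), ("Li", "Si")],  # hypothetical rows
)

distinct_rows = conn.execute(
    "SELECT DISTINCT last_name, first_name FROM student ORDER BY last_name"
).fetchall()

grouped_rows = conn.execute(
    "SELECT last_name, first_name FROM student "
    "GROUP BY last_name, first_name ORDER BY last_name"
).fetchall()

print(distinct_rows)
print(grouped_rows)
```

Neither query changes the table: the three stored rows are still there, and only the result set is de-duplicated.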

Method 4: delete duplicate data

This is a fix after the fact: the duplicates are already in the table and have to be cleaned out. Look at the following SQL statements:

-- create a temporary table based on student, using GROUP BY to filter duplicate data
-- (MIN(sex) keeps the query valid when ONLY_FULL_GROUP_BY is enabled)
CREATE TABLE tmp SELECT last_name, first_name, MIN(sex) AS sex FROM student GROUP BY last_name, first_name;
-- delete the original student table
DROP TABLE student;
-- rename the temporary table so that it takes the place of the original
ALTER TABLE tmp RENAME TO student;
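The copy, drop and rename sequence can be rehearsed on throwaway data before it is run against a real table. The sketch below uses Python's sqlite3 module as a stand-in for MySQL, which is an assumption of this example. SQLite requires the AS keyword in CREATE TABLE ... AS SELECT, where MySQL allows it to be omitted. Giving the copy the original name student, and collapsing sex with MIN, are also choices made for this illustration; the sample rows are hypothetical.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (first_name CHAR(20), last_name CHAR(20), sex CHAR(10))"
)
conn.executemany(
    "INSERT INTO student (last_name, first_name, sex) VALUES (?, ?, ?)",
    [
        ("Zhang", "San", "male"),
        ("Zhang", "San", "male"),  # duplicate of the row above
        ("Li", "Si", "female"),
    ],
)

# 1. Copy one row per (last_name, first_name) into a temporary table.
conn.execute(
    "CREATE TABLE tmp AS "
    "SELECT last_name, first_name, MIN(sex) AS sex "
    "FROM student GROUP BY last_name, first_name"
)
# 2. Drop the original table.
conn.execute("DROP TABLE student")
# 3. Give the de-duplicated copy the original name.
conn.execute("ALTER TABLE tmp RENAME TO student")

deduplicated = conn.execute(
    "SELECT last_name, first_name, sex FROM student ORDER BY last_name"
).fetchall()
print(deduplicated)
```

A table created this way does not inherit indexes or keys from the original, so a unique constraint as in Method 1 still has to be added afterwards to keep new duplicates out.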

You can also remove duplicate records by adding a UNIQUE index or PRIMARY KEY to the table with the IGNORE modifier, which discards the rows that would violate the new key:

ALTER IGNORE TABLE student ADD PRIMARY KEY (last_name, first_name);

Note that the IGNORE clause of ALTER TABLE was removed in MySQL 5.7.4, so this statement only works on older versions.

That is all on how to deal with duplicate data in a MySQL database. I hope the content above is helpful.
