Network Security Internet Technology Development Database Servers Mobile Phone Android Software Apple Software Computer Software News IT Information

In addition to Weibo, there is also WeChat

Please pay attention

WeChat public account

Shulou

How to delete duplicate records in a table by Oracle

2025-03-27 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Database >

Share

Shulou(Shulou.com)06/01 Report--

1 introduction

In the process of operating on the database, we may encounter this situation, the data in the table may be repeated, making it inconvenient for us to read in the operation of the database, so how to delete these repeated useless data?

In normal work, you may encounter when trying to create a unique index on a column or columns in the library table, the system prompts ORA-01452: cannot create a unique index, and duplicate records are found.

2 treatment process

There may be two cases of duplicate data: the first is that only some fields in the table are the same, and the second is that the two rows of records are exactly the same. There are also two kinds of results after deleting duplicate records, the first is to delete all the duplicate records, the second is to keep only the latest record in the duplicate records, and the second is more common in general business.

2.1 the principle of deleting duplicate records

(1) in Oracle, each record has a rowid, rowid is unique in the whole database, and rowid determines which data file, block, line each record is in Oracle.

(2) in duplicate records, the contents of all columns may be the same, but the rowid will not be the same, so just identify those with the largest rowid in the duplicate records and delete all the rest.

2.2 Delete some fields duplicate data 2.2.1 duplicate records delete all

If you want to delete duplicate data in some fields, you can delete it using the following statement, which deletes duplicate data in fields 1 and 2 in the table:

DELETE FROM table name a

WHERE (Field 1, Field 2)

IN (SELECT field 1, field 2

FROM table name

GROUP BY Field 1

Field 2

HAVING COUNT (1) > 1)

The above statement is very simple, that is, delete the queried data. However, this deletion is very inefficient and may hang the database for large amounts of data. Therefore, it is recommended that the duplicate data from the query be inserted into a temporary table and then deleted, so that there is no need to make another query when performing the deletion. As follows:

CREATE TABLE temporary table AS (select field 1, field 2, count (*) from table name group by field 1, field 2 having count (*) > 1)

The above sentence is to create a temporary table and insert the queried data into it. The following delete operation can be done:

Delete from table name a where field 1, field 2 in (select field 1, field 2 from temporary table)

This operation of building a temporary table before deleting is much more efficient than deleting it directly with a single statement.

Example:

DELETE FROM tmp_lhr t

WHERE (t.accesscode, t.lastserviceordercode, t.serviceinstancecode) IN

(SELECT a.accesscode, a.lastserviceordercode, a.serviceinstancecode

FROM tmp_lhr a

GROUP BY a.accesscode

A.lastserviceordercode

A.serviceinstancecode

HAVING COUNT (1) > 1)

2.2.2 keep the latest record

If you want to keep the latest record in the duplicate data! What are we going to do? In oracle, there is a hidden automatic rowid that gives each record a unique rowid. If we want to keep the latest record, we can use this field to keep the record with the largest rowid in the duplicated data.

First, how to find duplicate records?

SELECT *

FROM TABLE_NAME A

WHERE ROWID NOT IN (SELECT MAX (ROWID))

FROM TABLE_NAME D

WHERE A. Col1 = D.COL1

AND A. Col2 = D.COL2)

How to delete duplicate records? 1. Method 1

DELETE FROM TABLE_NAME

WHERE ROWID NOT IN (SELECT MAX (ROWID))

FROM TABLE_NAME D

Group by d.col1, d.col2)

This method is the easiest!

2. Method 2

DELETE FROM TABLE_NAME A

WHERE ROWID NOT IN (SELECT MAX (ROWID))

FROM TABLE_NAME D

WHERE A. Col1 = D.COL1

AND A. Col2 = D.COL2)

3. Method 3 temporary table

Therefore, if we want to delete the duplicate data and keep only the latest piece of data, we can write like this:

Create table temporary table as select a. Field 1, a. Field 2, MAX (a.ROWID) dataid from formal table a GROUP BY a. Field 1, a. Field 2

DELETE FROM official form a

Where a.rowid NOT IN (SELECT b.dataid

FROM temporary table b

WHERE a. Field 1 = b. Field 1

And a. Field 2 = b. Field 2)

Commit

Example:

DELETE FROM tmp_lhr t

WHERE t.rowid not in (SELECT MAX (ROWID))

FROM tmp_lhr a

GROUP BY a.accesscode

A.lastserviceordercode

A.serviceinstancecode)

DELETE FROM tmp_lhr t

WHERE t.rowid! =

(SELECT MAX (ROWID)

FROM tmp_lhr a

WHERE a.accesscode = t.accesscode

AND a.lastserviceordercode = t.lastserviceordercode

AND a.serviceinstancecode = t.serviceinstancecode)

2.2.3 Delete a record based on a field

-keep a record at will

DELETE FROM ods_entity_info_full_lhr_01 T

WHERE T.ROWID NOT IN (SELECT MAX (A.ROWID))

FROM ods_entity_info_full_lhr_01 A

GROUP BY entity_code

Entity_type)

-keep the largest record in entity_id

DELETE FROM ods_entity_info_full_lhr_01 a

WHERE a.rowid NOT IN

(SELECT t.rowid

FROM ods_entity_info_full_lhr_01 t

WHERE (t.entity_code, t.entity_type, t.entity_id) IN

(SELECT entity_code

Entity_type

MAX (entity_id)

FROM ods_entity_info_full_lhr_01

GROUP BY entity_code

Entity_type))

2.3 Delete a complete duplicate record

If the two rows of records in the table are exactly the same, you can get the deduplicated records in the following three ways:

1. Select distinct * from table name

2. Select * from table name group by column name 1, column name 2. Having count (*) > 1

3. Select * from table name a where rowid (SELECT MIN (X.ROWID))

FROM xr_maintainsite X

WHERE X.Maintainid = E.Maintainid

AND x.siteid = e.siteid);-- here all fields in the updated table need to be fully written.

2.4 use the row_number analysis function to retrieve duplicate records and delete records with serial numbers greater than 1

Give an example:

Delete from aa where rowid in (select rid from (select rowid rid,row_number () over (partition by name order by id) as seq from aa) where seq > 1)

3 Test cases

SYS@raclhr1 > CREATE TABLE T_ROWS_LHR_20160809 AS SELECT * FROM SCOTT.EMP

Table created.

SYS@raclhr1 > INSERT INTO T_ROWS_LHR_20160809 SELECT * FROM T_ROWS_LHR_20160809

14 rows created.

SYS@raclhr1 > COMMIT

Commit complete.

SYS@raclhr1 > INSERT INTO T_ROWS_LHR_20160809 SELECT * FROM T_ROWS_LHR_20160809

28 rows created.

SYS@raclhr1 > COMMIT

Commit complete.

SYS@raclhr1 > SELECT COUNT (1) FROM T_ROWS_LHR_20160809

COUNT (1)

-

fifty-six

SYS@raclhr1 > DELETE FROM T_ROWS_LHR_20160809

2 WHERE ROWID NOT IN (SELECT MAX (ROWID))

3 FROM T_ROWS_LHR_20160809 D

4 group by D. EMPNOCORIN D.ENAMERED. Job MagneD. MGR D.DEPTNO)

42 rows deleted.

SYS@raclhr1 > SELECT COUNT (1) FROM T_ROWS_LHR_20160809

COUNT (1)

-

fourteen

SYS@raclhr1 > COMMIT

Commit complete.

Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.

Views: 293

*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.

Share To

Database

Wechat

© 2024 shulou.com SLNews company. All rights reserved.

12
Report