Solving the problem of data loss caused by Replace Operation in MySQL 07/03 Update SLTechnology News&Howtos

Solving the problem of data loss caused by Replace Operation in MySQL

2025-07-03 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Database >

Shulou(Shulou.com)06/01 Report--

Preface

The developers of the company use the replace into statement when updating the data, which leads to a large amount of data loss due to improper use. This paper analyzes how it leads to the data loss.

I. explanation of the problem

The developers of the company use the replace into statement when updating the data, which leads to a large amount of data loss due to improper use. The analysis of how the data loss is caused is as follows.

Second, problem analysis

A. REPLACE principle

The official explanation of the REPLACE INTO principle is:

REPLACE works exactly like INSERT, except that if an old row in the table has the same value as a new row for a PRIMARY KEY or a UNIQUE index, the old row is deleted before the new row is inserted.

If the primary or unique key of the newly inserted row already exists in the table, the original record is deleted and the new row is inserted; if it does not exist in the table, insert it directly

Address: https://dev.mysql.com/doc/refman/5.6/en/replace.html

b. Problem phenomenon

The table structure for missing data is as follows:

CREATE TABLE `active_ items` (

The following replace statements are executed (multiple):

REPLACE INTO active_items (ad_id,score) VALUES ('XXXXXXX', 1800)

Find the execution record by querying binlog, in part as follows:

# UPDATE `items`.`active _ items` # @ 21300 / * TINYINT meta=0 nullable=0 is_null=0 * /

The ad_id of the operation already exists, so delete it first and then insert it. You can see that except for the specified ad_id,score, all other fields are changed to default values, resulting in the loss of the original data (although transferred to update in the log)

c. Comparative test

Next, I conducted the following tests:

When MySQL carries out Replace operation, it causes data loss by using REPLACE statement on the left and DELETE + INSERT statement on the right. The final result is exactly the same. The row with the original primary key id 1 is deleted, the newly inserted row primary key id is updated to 4, the field c with no specified content is inserted, and the default value is inserted to update a row of data with REPLACE. MySQL prompts that the number of affected rows is 2 rows.

To sum up, the description does delete a row and insert a row

III. Data recovery

After data loss or data error, you can recover it in the following ways:

The business side writes a script to restore the misoperation sql through the binlog of MySQL, and generates a reverse sql for data recovery (suitable for the case of a small amount of sql data) to restore the data state to the moment before the misoperation through historical backup files + incremental binlog.

IV. Problem expansion

From the above analysis, we can find that REPLACE deletes the old row and inserts the new row, but the record in binlog is in the form of update, which creates another problem:

The self-growth value of the slave library is less than that of the master library.

1. test

a. The principal and subordinate are consistent:

Main library:

Mysql > show create table tG

From the library:

Mysql > show create table tG

b. Main library REPLACE:

Main library:

Mysql > replace into t (a meme b) values (1m 7)

From the library:

Mysql > show create table tG

Notice that the AUTO_ index values of the master and slave tables are different at this time.

c. Simulate the slave master and INSERT in the slave library:

Mysql > insert into t (a dint bjorn c) values (4pm 4pm 4)

An error will be reported when inserting from the library, and the primary key is duplicated. After the error is reported, AUTO_ contains will be + 1, so the insert can be successfully executed again.

two。 Conclusion

This problem will not have the slightest impact in normal times, but:

If the master database usually uses a large number of REPLACE statements, the AUTO_INCREMENT value of the slave database lags behind that of the master database. When the master-slave switch occurs, when the new master database inserts data again, a large number of primary keys will repeat errors, resulting in data cannot be inserted.

Summary

The above is the whole content of this article. I hope the content of this article has a certain reference and learning value for everyone's study or work. Thank you for your support.

Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.

*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.