Today, the editor will share some knowledge about how to solve the problem of pt-archiver and auto-increment primary keys. The content is detailed and the logic is clear. Since most people know too little about this topic, this article is shared for your reference; I hope you get something out of it.
Preface
pt-archiver is a commonly used tool for cleaning up or archiving table data.
Before dropping a large table in MySQL, you can use pt-archiver to delete its records in batches. This helps avoid problems that may otherwise occur on the server, such as disk I/O saturation causing the database to hang or normal SQL queries becoming slow.
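For example, a minimal sketch of such a batch deletion might look like the following. The connection details (h=localhost, D=test, t=big_table), the WHERE condition, and the --limit/--txn-size/--sleep values are placeholders chosen for illustration, not values from the original case:
# Hypothetical sketch: purge old rows in small transactions to limit I/O pressure
# (all names and numbers below are placeholders)
[root@centos_2 ~]# pt-archiver \
    --source=h=localhost,D=test,t=big_table \
    --purge \
    --where "created_at < '2020-01-01'" \
    --limit 1000 --txn-size 1000 \
    --sleep 1 \
    --statistics
Running the deletion in small transactions with pauses keeps the purge from monopolizing I/O, so the eventual DROP TABLE has far less data to clean up.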
Question "when you delete data using pt-archiver, the last row of data is not deleted. Is this bug?"
Analysis.
Before answering the customer's question, we need to explain why we use pt-archiver before dropping a large table. When MySQL drops a table, it does the following:
Deletes the table data/index (.ibd) and definition (.frm) files.
Deletes the associated triggers.
Removes the table being dropped from the table definition cache.
Scans the InnoDB buffer pool for pages belonging to the table and invalidates them (for a table with many pages cached in memory, this step can cause the system to hang).
Note that DROP is a DDL statement that must hold a metadata lock (MDL) until it completes, which makes all other threads touching the table wait for the DDL to finish, and cleaning up a large number of data pages belonging to the table puts extra pressure on the buffer pool.
Finally, removing the entry from the table definition cache requires the LOCK_open mutex, which also causes other threads to wait until the drop completes.
To reduce the impact of this operation, we can use pt-archiver to delete the data in batches first, significantly shrinking the table. Once most of the records have been deleted from the large table, the DROP operation completes quickly and without affecting system performance.
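As a side note added here for illustration (not from the original tests), you can roughly estimate how many InnoDB buffer pool pages a table occupies before dropping it by querying information_schema. The table name pattern below is a simplification, and the query itself can be expensive on a very large buffer pool:
# Illustrative check: how many buffer pool pages currently belong to table tt1?
# (querying innodb_buffer_page can itself be costly on a large buffer pool)
[root@centos_2 ~]# mysql -e "SELECT COUNT(*) AS pages_in_buffer_pool FROM information_schema.innodb_buffer_page WHERE TABLE_NAME LIKE '%tt1%';"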
Community users have noticed this behavior: after pt-archiver finishes, one row is still left in the table.
# Create the table
mysql> CREATE TABLE `tt1` (
         `id` int(11) NOT NULL AUTO_INCREMENT,
         `a` char(5) DEFAULT NULL,
         PRIMARY KEY (`id`)
       ) ENGINE=InnoDB;

# Load random test data
mysql> call populate('test','tt1',10000,'N');

# Purge the data with pt-archiver
[root@centos_2 ~]# pt-archiver --source=h=localhost,D=test,t=tt1 --purge --where "1=1"

# Verify the count (expected 0, got 1)
mysql> select count(*) from test.tt1;
+----------+
| count(*) |
+----------+
|        1 |
+----------+
1 row in set (0.00 sec)
The same thing happens when we use pt-archiver with the --no-delete parameter for archiving: pt-archiver does not copy the row with the maximum id to the target table.
# Archive rows from tt1 to tt2
[root@centos_2 ~]# pt-archiver --source=h=localhost,D=test,t=tt1 --dest=h=localhost,D=test,t=tt2 --no-delete --where "1=1"

mysql> select count(*) from tt2;
+----------+
| count(*) |
+----------+
|     5008 |
+----------+
1 row in set (0.00 sec)

mysql> select count(*) from tt1;
+----------+
| count(*) |
+----------+
|     5009 |
+----------+
1 row in set (0.00 sec)

Resolution
Reading through the pt-archiver documentation, there is an option --[no]safe-auto-increment whose description reads: "Do not archive row with max AUTO_INCREMENT."
This means that --safe-auto-increment (enabled by default) adds an extra WHERE clause to prevent pt-archiver from deleting the row with the largest value of a single-column AUTO_INCREMENT key, as shown in the following section of its source code:
https://github.com/percona/percona-toolkit/blob/3.x/bin/pt-archiver#L6449

if ( $o->get('safe-auto-increment')
   && $sel_stmt->{index}
   && scalar(@{$src->{info}->{keys}->{$sel_stmt->{index}}->{cols}}) == 1
   && $src->{info}->{is_autoinc}->{$src->{info}->{keys}->{$sel_stmt->{index}}->{cols}->[0]}
) {
   my $col = $q->quote($sel_stmt->{scols}->[0]);
   my ($val) = $dbh->selectrow_array("SELECT MAX($col) FROM $src->{db_tbl}");
   $first_sql .= " AND ($col < " . $q->quote_val($val) . ")";
}
Let's look at the difference between the two commands by doing a dry run:
# With --no-safe-auto-increment
[root@centos_2 ~]# pt-archiver --source=h=localhost,D=test,t=tt1 --dest=h=localhost,D=test,t=tt2 --no-delete --where "1=1" --no-safe-auto-increment --dry-run
SELECT /*!40001 SQL_NO_CACHE */ `id`,`a` FROM `test`.`tt1` FORCE INDEX(`PRIMARY`) WHERE (1=1) ORDER BY `id` LIMIT 1
SELECT /*!40001 SQL_NO_CACHE */ `id`,`a` FROM `test`.`tt1` FORCE INDEX(`PRIMARY`) WHERE (1=1) AND ((`id` > ?)) ORDER BY `id` LIMIT 1
INSERT INTO `test`.`tt2`(`id`,`a`) VALUES (?,?)

# Without --no-safe-auto-increment (the default safe-auto-increment behavior)
[root@centos_2 ~]# pt-archiver --source=h=localhost,D=test,t=tt1 --dest=h=localhost,D=test,t=tt2 --no-delete --where "1=1" --dry-run
SELECT /*!40001 SQL_NO_CACHE */ `id`,`a` FROM `test`.`tt1` FORCE INDEX(`PRIMARY`) WHERE (1=1) AND (`id` < '5009') ORDER BY `id` LIMIT 1
SELECT /*!40001 SQL_NO_CACHE */ `id`,`a` FROM `test`.`tt1` FORCE INDEX(`PRIMARY`) WHERE (1=1) AND (`id` < '5009') AND ((`id` > ?)) ORDER BY `id` LIMIT 1
INSERT INTO `test`.`tt2`(`id`,`a`) VALUES (?,?)
Notice the additional clause "AND (`id` < '5009')" above? This is what the default safe-auto-increment behavior adds: it guards against reusing AUTO_INCREMENT values if the server restarts. Note that the extra WHERE clause uses the maximum value of the auto-increment column as of the start of the archive or purge job; if new rows are inserted while pt-archiver is running, pt-archiver will not see them.
OK, now we know the "reason" why the table was not cleaned out completely. But why? What exactly is the AUTO_INCREMENT safety concern?
Before MySQL 8.0, the AUTO_INCREMENT counter is kept only in memory. When the server restarts (after a crash or otherwise), the counter is reinitialized from the current maximum value in the table. If that happens while the table is accepting writes, AUTO_INCREMENT values can end up being reused.

# Delete everything from the table
mysql> delete from tt1;
...
mysql> show table status like 'tt1'\G
*************************** 1. row ***************************
           Name: tt1
         Engine: InnoDB
...
 Auto_increment: 10019
...

# Restart MySQL
[root@centos_2 ~]# systemctl restart mysql

# Verify the auto-increment counter
[root@centos_2 ~]# mysql test -e "show table status like 'tt1'\G"
*************************** 1. row ***************************
           Name: tt1
         Engine: InnoDB
...
 Auto_increment: 1
...
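To make the risk concrete, here is a hedged sketch (added for illustration, not part of the original tests) of what could follow on a pre-8.0 server after the restart above: a new insert would reuse id values that earlier, already archived or deleted rows had, which would cause conflicts if that data were ever restored or merged back:
# Illustrative only: after the restart the counter is back to 1,
# so a new row reuses an id that earlier rows already used.
mysql> insert into tt1 (a) values ('new');
mysql> select id, a from tt1;
+----+------+
| id | a    |
+----+------+
|  1 | new  |
+----+------+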
The test results above show that the problem is not really with pt-archiver itself but with its option defaults. When using pt-archiver on tables with an AUTO_INCREMENT column, it is important to understand the --[no]safe-auto-increment option.
Let's verify it with our lab data.
# Verifying the usage of the --no-safe-auto-increment option
[root@centos_2 ~]# pt-archiver --source=h=localhost,D=test,t=tt1 --purge --where "1=1" --no-safe-auto-increment

mysql> select count(*) from test.tt1;
+----------+
| count(*) |
+----------+
|        0 |
+----------+
1 row in set (0.00 sec)
The same is true for the copy-only archiving operation that uses the --no-delete option.
[root@centos_2 ~]# pt-archiver --source=h=localhost,D=test,t=tt1 --dest=h=localhost,D=test,t=tt2 --no-delete --where "1=1" --no-safe-auto-increment

mysql> select count(*) from tt1; select count(*) from tt2;
+----------+
| count(*) |
+----------+
|     5009 |
+----------+
1 row in set (0.00 sec)

+----------+
| count(*) |
+----------+
|     5009 |
+----------+
1 row in set (0.00 sec)
From the code and the tests above, we now understand how pt-archiver's --[no]safe-auto-increment option works and why it exists. Before concluding that everything is fine, let's think a little more about what the option implies.
Arguably, the --no-delete operation should imply --no-safe-auto-increment. At present, safe-auto-increment is the default behavior even for --no-delete runs, yet with --no-delete nothing is deleted, so reusing AUTO_INCREMENT values is not a concern in that case.
For MySQL 8.0, the safe-auto-increment option is no longer necessary. Since MySQL 8.0, the AUTO_INCREMENT counter is persistent: the current maximum value survives an instance restart or crash.
Because MySQL 8.0 persists the AUTO_INCREMENT counter via the redo log, there is no value reuse for pt-archiver to guard against, so the safe-auto-increment protection is not needed at all.
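As a hedged illustration (not part of the original tests), on a MySQL 8.0 instance you could repeat the earlier experiment and observe that the counter survives a restart; the host and table names are the same placeholders used above:
# On MySQL 8.0 the counter should NOT reset after a restart (illustrative sketch)
mysql> delete from tt1;
mysql> show table status like 'tt1'\G
# Auto_increment: 10019

[root@centos_2 ~]# systemctl restart mysql
[root@centos_2 ~]# mysql test -e "show table status like 'tt1'\G"
# Expected on 8.0: Auto_increment is still 10019, not reset to 1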
That is all the content of this article on how to solve the problem of pt-archiver and auto-increment primary keys. Thank you for reading; I hope you gained something from it.