Summary of backup principle of mysql innobackupex 07/01 Update SLTechnology News&Howtos

Summary of backup principle of mysql innobackupex

2025-07-01 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Database >

Shulou(Shulou.com)05/31 Report--

This article mainly explains "the summary of the backup principle of mysql innobackupex", interested friends may wish to have a look. The method introduced in this paper is simple, fast and practical. Next, let the editor take you to learn the "summary of backup principles of mysql innobackupex"!

The official download address of xtrabackup is

Http://www.percona.com/software/percona-xtrabackup .

Xtrabackup contains two main tools, xtrabackup and innobackupex, which differ as follows:

1 xtrabackup can only back up the tables of innodb and xtradb engines, but not the tables of myisam engine. 2 innobackupex is a Perl script that encapsulates xtrabackup, which supports backing up innodb and myisam at the same time, but a global read lock is needed when backing up myisam. Also, myisam does not support incremental backups.

The backup process principle of innobackupex tool:!!

1: the backup process of the previous version of innobackupex is shown as follows:

The FTWRL (flush tables with read lock) here holds the lock for a long time, which is mainly related to the amount of data in the non-innodb table. If the non-innodb table has a large amount of data and the backup is slow, then the lock will be held for a long time. Even if they are all innodb tables, they will be locked for a certain amount of time because of the existence of mysql library system tables. In order to solve this problem, Percona has improved the Server layer of Mysql by introducing BACKUP LOCK (backup lock), specifically, obtaining consistency data (including non-innodb tables) through the "LOCK TABLES FOR BACKUP" command, and obtaining consistency points through "LOCK BINLOG FOR BACKUP" to minimize service disruption caused by database backup!

Https://www.percona.com/doc/percona-server/5.6/management/backup_locks.html#interaction-with-other-global-locks-the location of the official document

Second, the advantages of introducing backup locks

LOCK TABLES FOR BACKUP: block only non-transactional table operations! Does not block the dml of innodb tables

Function: to obtain consistent data

A) prohibit non-transactional engine (non-InnoDB) table writes (that is, DML).

B) disable DDL for all tables.

Advantages:

A) will not be blocked by large queries.

B) it is very important that the reading and updating of the innodb table will not be blocked. If the business tables are all innodb, the DML will not be damaged at all during the backup.

2.2 、 LOCK BINLOG FOR BACKUP:

Function: to obtain the consistency site.

A) prohibit the operation of binlog sites (DML, DDL are not allowed)

Advantages:

A) the time is short and the impact on db is very small.

Third, the specific innobackupex backup process:

3.1. the process of the lower version of innobackupex (= 2.2.0 version): (backup lock has been added and FLUSH TABLES WITH READ LOCK is no longer used)

1.get Redo LSN

2.copy system tablespace + transaction engine table data file + backstage child process (IBACKUP) copy Redo

3.LOCK TABLES FOR BACKUP (redo has been copied at this time)

4.copy all * .frm files, non-transactional engine table (MyISAM, ARCHIVE, etc.) data + index files (redo has been copied at this time)

5.LOCK BINLOG FOR BACKUP (in order to get consistent binlog points, all operations that need to write binlog cannot be performed)

6.finalize the background copy of REDO log (including flush redo bufer to disk)

7.get the binary log coordinates (coordinates / site)

8.UNLOCK BINLOG

9.UNLOCK TABLES

Note:

1): avoid performing backup tasks during peak business hours:

If, after the completion of the fourth step, you start to execute LOCK BINLOG FOR BACKUP, get the binlog lock, and then SHOW MASTER STATUS to obtain the consistent binlog, and then start redo in flush redo buffer to disk (specific command: FLUSH NO_WRITE_TO_BINLOG ENGINE LOGS), in order to copy all redo, there will be a problem.

If your redo buffer is relatively large, and there are a lot of redo generated during the fourth step of copying non-transactional tables, this will result in a longer time in step 6, which in turn leads to a longer time to hold binlog backup locks, resulting in longer blocking time in the database and longer business impact time, so you need to perform backup tasks when the business is less!

2): under mysql RR and RC isolation levels, LOCK table tablename WRITE will block the execution of LOCK TABLES FOR BACKUP, but neither lock table tablename read nor LOCK TABLES FOR BACKUP will block the execution of LOCK TABLES FOR BACKUP!

The specific test process:

Session 1 executes lock table t write!

Root@localhost: liuwenhe 20:48:15 > LOCK table t WRITE

Query OK, 0 rows affected (0.04 sec)

Then another window performs the backup: it is found that the card is in Executing LOCK TABLES FOR BACKUP., and has been reading redo with the same lsn number (56989476423)!

[root@beijing-fuli-hadoop-04 ~] # innobackupex-uroot-paired Venture 1qaz' / data/backup/

200310 21:00:17 [01] Copying. / sys/sys_config.ibd to / data/backup/2020-03-10 hours 21-00-09/sys/sys_config.ibd

200310 21:00:17 [01]... done

200310 21:00:17 Executing LOCK TABLES FOR BACKUP...

200310 21:00:17 > > log scanned up to (56989476423)

200310 21:00:18 > > log scanned up to (56989476423)

200310 21:00:19 > > log scanned up to (56989476423)

200310 21:00:20 > > log scanned up to (56989476423)

At this point, I checked the process and found that LOCK TABLES FOR BACKUP was waiting for Waiting for backup lock.

Root@localhost: (none) 17:33:17 > show processlist

+-+-- + -- +

+-+-- + -- +

| | 3 | system user | | NULL | Connect | 100359 | Waiting for master to send event | NULL | 0 | 0 |

| | 4 | system user | | NULL | Connect | 75664 | Slave has read all relay log; waiting for more updates | NULL | 0 | 0 |

| | 17 | root | localhost | liuwenhe | Sleep | 0 | | NULL | 0 | 0 |

+-+-- + -- +

5 rows in set (0.00 sec)

The dml operation will not be committed and will not block, in fact, it is because the redo generated by transactions without commit does not need to be backed up, because the redo is not needed for recovery!

The specific test process:

Session1: performing dml operations does not commit

Root@localhost: liuwenhe 17:34:44 > start transaction

Query OK, 0 rows affected (0.00 sec)

Root@localhost: liuwenhe 17:35:59 > delete from liu where id=100

Query OK, 1 row affected (0.07 sec)

Then start to perform a backup and find that it can be performed! Indicates that individual dml operations are neither committed nor blocked

[root@beijing-fuli-hadoop-04 ~] # innobackupex-uroot-paired Venture 1qaz' / data/backup/

Xtrabackup: Transaction log of lsn (53883827670) to (53883827695) was copied.

200310 17:40:24 completed OK!

Conclusion: LOCK table tablename WRITE will block the execution of LOCK TABLES FOR BACKUP, but

Neither lock table tablename read nor LOCK TABLES FOR BACKUP will block the execution of LOCK TABLES FOR BACKUP! Be sure to note that when the data file backed up by your mysqldump contains lock table name write, be careful not to conflict with the physical backup, causing the physical backup to be blocked and the execution time is too long

Four: open genary log before performing innobackupex backup

See the following specific process:

2020-03-05T12:20:40.187735Z 221 Connect root@localhost on using Socket

2020-03-05T12:20:40.188456Z 221 Query set autocommit=1

2020-03-05T12:20:40.194768Z 221 Query SET SESSION wait_timeout=2147483

2020-03-05T12:20:40.224959Z 221 Query SELECT CONCAT (@ @ hostname, @ @ port)

2020-03-05T12:20:40.245466Z 221 Quit

2020-03-05T12:20:40.257504Z 222Connect root@localhost on using Socket

2020-03-05T12:20:40.257854Z 222Query SET SESSION wait_timeout=2147483

2020-03-05T12:20:40.258527Z 222Query SET SESSION autocommit=1

2020-03-05T12:20:40.258759Z 222Query SET NAMES utf8

2020-03-05T12:20:40.258983Z 222Query SHOW VARIABLES

2020-03-05T12:20:40.280937Z 222Query SHOW ENGINE INNODB STATUS

2020-03-05T12:20:40.465253Z 222Query SELECT PLUGIN_NAME, PLUGIN_LIBRARY FROM information_schema.plugins WHERE PLUGIN_STATUS = 'ACTIVE' AND PLUGIN_TYPE =' KEYRING'

2020-03-05T12:20:40.467169Z 222Query SELECT

CONCAT (table_schema,'/', table_name), engine

FROM information_schema.tables

WHERE engine NOT IN (

'MyISAM', 'InnoDB',' CSV', 'MRG_MYISAM'

)

AND table_schema NOT IN (

'performance_schema', 'information_schema',' mysql'

)

2020-03-05T12:20:51.604076Z 222Query SET SESSION lock_wait_timeout=31536000

2020-03-05T12:20:51.604317Z 222Query LOCK TABLES FOR BACKUP

2020-03-05T12:20:54.525210Z 222Query LOCK BINLOG FOR BACKUP

2020-03-05T12:20:54.525306Z 222Query SHOW MASTER STATUS

2020-03-05T12:20:54.525417Z 222Query SHOW VARIABLES

2020-03-05T12:20:54.759369Z 222Query FLUSH NO_WRITE_TO_BINLOG ENGINE LOGS

2020-03-05T12:20:54.968260Z 222Query UNLOCK BINLOG

2020-03-05T12:20:54.968348Z 222Query UNLOCK TABLES

2020-03-05T12:20:55.003416Z 222Query SELECT UUID ()

2020-03-05T12:20:55.017367Z 222Query SELECT VERSION ()

2020-03-05T12:20:55.245540Z 222Quit

Note:

1): first, modify the lock_wait_timeout parameter at the session level (the default is 31536000 seconds) to ensure that the transaction will not timeout due to the long waiting time for the metadata lock after the lock binlog for backup is executed. Gets the timeout of the metadata lock, such as alter table. This is suitable for all tables except system tables (except the mysql library)

2): the function of FLUSH NO_WRITE_TO_BINLOG ENGINE LOGS is to persist the redo log of the innodb layer to disk, but does not trigger binlog buffer to rinse to disk; then there will be related processes to copy redo.

To put it bluntly, after all the transactional and non-transactional table backups are completed, the global read lock is obtained, and the pos of binlog is obtained by using the show master status statement, the log in redo log buffer is refreshed to disk, and then the redo log copy thread copies the final redo log log data. Because when performing LOCK BINLOG FOR BACKUP to get the binlog backup lock until the unlock BINLOG release lock, there will be no more requests (all operations to write binlog can not be carried out), ensuring that the binlog can not be changed, thus ensuring a consistent binlog point!

3): about whether to UNLOCK BINLOG first or UNLOCK TABLES first?

What the official document sees is: unlock binlog before unlock tables.

But what you see in genary log is: unlock binlog first and then unlock tables:

2020-03-05T12:20:51.604076Z 222Query SET SESSION lock_wait_timeout=31536000

2020-03-05T12:20:51.604317Z 222Query LOCK TABLES FOR BACKUP

2020-03-05T12:20:54.525210Z 222Query LOCK BINLOG FOR BACKUP

2020-03-05T12:20:54.525306Z 222Query SHOW MASTER STATUS

2020-03-05T12:20:54.525417Z 222Query SHOW VARIABLES

2020-03-05T12:20:54.759369Z 222Query FLUSH NO_WRITE_TO_BINLOG ENGINE LOGS

2020-03-05T12:20:54.968260Z 222Query UNLOCK BINLOG

2020-03-05T12:20:54.968348Z 222Query UNLOCK TABLES

2020-03-05T12:20:55.003416Z 222Query SELECT UUID ()

2020-03-05T12:20:55.017367Z 222Query SELECT VERSION ()

2020-03-05T12:20:55.245540Z 222Quit

So who gets to the ground first and who comes after?

First of all, their respective goals:

LOCK TABLES FOR BACKUP: to get consistent data, block modifications to non-innodb tables, it does not block

Innodb table modification, by backing up the innodb table modification generated redo log to achieve consistency!

LOCK BINLOG FOR BACKUP: in order to get a consistent gtid point, the modification operations of all tables (including the innodb table) are blocked, that is, all operations to write binlog are blocked, so that the binlog does not change, and then a consistent binlog point can be obtained!

As you can see, the scope of binlog backup locks (blocking all table modifications) is wider than that of lock tables for backup (blocking only non-innodb table modifications), so I think that when LOCK BINLOG FOR BACKUP is executed, in principle, the lock LOCK TABLES FOR BACKUP can be released (because lock binlog can play the role of lock tables for backup). If there is no unlock binlog, even if you unlock tables, then the whole instance still cannot do any write binlog operation. So if this order is not taken into account when mysql is designed, then there is no definite order, that is, it can be implemented first.

If so, unlock binlog waits for show master status to complete, and flush redo log completes and copies redo, while unlock tables waits for copying all * .frm files, non-transactional engine table (MyISAM, ARCHIVE, etc.) data + index files, and LOCK BINLOG FOR BACKUP before unlock tables, if there are very few database operations during backup Then show master status and flush redo log complete and copy redo very quickly, so unlock binlog may finish before unlock tables. If the database is busy, then unlock tables first and then unlock binlog!

Unfortunately, when I simulate a large number of insert operations and back up at the same time, it turns out that the general log is still unlock binlog before unlock tables, so I am very confused about the accuracy of official document writing!

At this point, I believe that you have a deeper understanding of the "summary of backup principles of mysql innobackupex", you might as well come to the actual operation! Here is the website, more related content can enter the relevant channels to inquire, follow us, continue to learn!

Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.

*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.