How to realize binlog Optimization in MYSQL 04/27 Update SLTechnology News&Howtos

How to realize binlog Optimization in MYSQL

2025-04-27 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Database >

Shulou(Shulou.com)06/01 Report--

This article mainly explains how to achieve binlog optimization in MYSQL, the content is clear, interested friends can learn, I believe it will be helpful after reading.

problem

Question 1: how to solve the performance loss caused by flush redo log when a transaction is committed

WAL is a common technology to achieve transaction persistence (D). The basic principle is to record the changes of the transaction redo log. Redo log sequentially appends writes. When a transaction is committed, it is only necessary to ensure that the redo log of the transaction is down, and the performance of the database system is improved through the sequential writing of redo log instead of the random write of the page. However, this scheme must require each transaction to flush its generated redo log once when it is committed, which is not efficient.

Problem 2:binlog and the order of engine layer transaction commit

For a single transaction, the order in which logs are written is redo log and then binlog, and correctness can be maintained as long as the order is maintained. But for a highly concurrent database system, there may be a large number of concurrent transactions all the time. We also need some means to maintain the order consistency of transaction commit between the Server layer binlog and the engine layer.

The purpose of maintaining this sequential consistency is to ensure the correctness of the backup tool Xtrabackup.

When binlog acts as the coordinator, there may be holes in the location where the backup set is obtained by the backup tool (Innodb Hot Backup) if the order of transactions recorded in it is different from the order in which the engine layer records are stored. Because the backup tool copies the redo log, the redo header records the binlog point corresponding to the last committed transaction, and after the backup set is established, it continues to dump binlog from the master library.

If there are three transactions, T1 and T3, which have been fsync into the binlog file, the location of the three transactions in the file is 100, respectively, but only T1 and T3 in the engine layer have completed the commit and recorded in redo, and the last transaction T3 of commit is 300. At this point, the data obtained through the backup tool is like this. When the backup set starts, it goes through the process of crash recovery, and the prepare transaction is rolled back (the backup set does not back up the binlog file, corresponding to the empty xid set in the previous section). Since point 300, it continues to synchronize binlog and apply from the main database, resulting in T2 being lost in the standby database.

Therefore, we must design a mechanism to ensure that the binlog write order of the Server layer is consistent with the transaction commit order of the storage engine layer.

Problem 3: performance degradation caused by writing redo and binlog at the same time

It is mentioned in question 1 that each transaction commit causes a performance problem, which becomes more serious with the introduction of binlog. Each transaction commit adds a file IO and needs to be flushed. If the system concurrency is high, then these IO will become a bottleneck that slows down the overall performance.

Solution

Problem 1:Redo log group submission technology

The technical idea of redo group commit is simple: by merging the brushing actions of multiple transaction redo log, the number of flushing times can be reduced. In Innodb's logging system, each redo log has a LSN (Log Sequence Number). When a transaction copies the log to redo log buffer, it gets the current maximum LSN, and the LSN is monotonously increasing, so it is guaranteed that the LSN of different transactions will not be duplicated. Then suppose that the maximum LSN of the logs of the three transactions Trx1, Trx2 and Trx3 are LSN1, LSN2 and LSN3 (LSN1 < LSN2 < LSN3), respectively, and they are committed at the same time. If trx3 takes the lead in executing the commit, it will ask to flush the disk to LSN3, so by the way, the redo log of Trx1 and Trx2 will also be brushed. Trx1 and Trx2 will determine that their LSN is less than the maximum LSN that has been removed, so there is no need to flush the disk again.

Question 2: internal XA transactions

When binlog is turned on, internal XA transactions are introduced to coordinate the upper layer and the storage engine layer, specifically, two phases are introduced when the transaction commits:

Prepare: refresh redo log to ensure that updates to data and undo pages have been flushed to disk, and set transaction status to PREPARE status

Commit:1). Write binlog and swipe disk, 2). Invokes the engine layer transaction commit interface. Set the transaction state to COMMIT.

The main purpose of such two-phase commit is to ensure the correctness of the database crash. Because once binlog is closed, it may be consumed by downstream nodes. Such transactions must be restarted by commit instead of rollback. On the other hand, for transactions for which the binlog is not down, it will be rolled back directly when the crash resumes.

Specifically, when the fault is restored, scan the last binlog file (in the flush phase, if the binlog size exceeds the threshold, the rotate binlog file will ensure that the last transaction recorded in the file must be committed), and extract the xid. Redo the redo log after the checkpoint, read the undo segment information of the transaction, collect the list of transactions in the prepare phase, compare the xid of the transaction with the xid recorded in the binlog, and commit if it exists, otherwise it will be rolled back.

Before MySQL5.6, in order to ensure that the write order of the database binlog was consistent with the transaction commit order of the InnoDB layer, prepare_commit_mutex locks were used inside the MySQL database.

Specifically, lock is added when the engine layer prepare is submitted in two phases, and the lock is released after the engine layer commit:

Innobase_xa_prepare () write () and fsync () binary loginnobase_commit ()

This does ensure that the transaction order of binlog and innodb is the same, but this lock causes all transactions to be serialized, and fsync is called at least multiple times per commit, which is inefficient. This is also a problem that needs to be discussed and solved next.

Question 4

Referring to the redo log optimization technology, the group commit technology is introduced to optimize the write performance of binlog.

Consider the transaction commit process when not optimized:

Prepare: this stage brushes the redo log of the engine layer (innodb) and sets the transaction status to PREPARED (update the transaction status on undo page). This stage does not involve binlog.

Commit: write binlog log and refresh disk, release lock and rollback segment at engine layer, set transaction status to COMMITTED, etc.

The so-called group submission technology is essentially a finer-grained split of time-consuming commit steps, specifically:

Divide the commit of step 2 into three phases:

Flush: write binlog, but not sync

Sync: call the fsync operation to drop the file to disk

Commit: call the storage engine API to submit the transaction

Fsync here is a time-consuming operation, so we want to save enough writes before making a fsync call, using batch technology here. The principle is that each stage in the above steps has a corresponding task linked list, and each thread entering this stage will add its own task to the linked list, and the linked list will be locked to ensure correctness. The first thread to join the list becomes Leader, and the subsequent thread becomes Follower. All the tasks in the linked list form a Batch, which is performed by Leader, while Follower waits for its task to be completed.

Once the linked list tasks of a certain stage are completed, these tasks will move on to the next stage, and they will also be added to the task list of that stage to repeat the above execution flow.

This design has the following advantages:

Using Leader execution instead of each thread's own execution can effectively reduce the number of calls such as write/fsync, and improve efficiency to ensure that transactions write binlog and engine layer commit in the same order that multiple transactions can be executed concurrently, without the need to be forced serialization by prepare_commit_mutex locks.

In addition, MYSQL also further optimizes the prepare stage brush redo log. The original design is multi-transaction can be concurrent to brush redo log, the same efficiency is not high enough. The redo log flushing disk of the prepare phase can be executed in the Flush phase of the commit phase. But there is a small problem to be explained is: before optimization, each thread is responsible for the release of its own redo log, and knows that the lsn of the redo log of flush is needed. If it is changed to the Flush stage by its Leader thread, it does not understand the lsn of each thread's redo log, so it simply and rudely flush to the maximum lsn of log_sys, which ensures that the redo log that commits the transaction must be removed.

After reading the above content, do you have a further understanding of how to optimize binlog in MYSQL? if you want to learn more, you are welcome to follow the industry information channel.

Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.

*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.