2025-01-17 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/01 Report--
Overview: group commit is an optimization in the way MySQL handles its logs, mainly intended to solve the problem of frequent disk flushes when writing logs. Group commit has been improved continuously as MySQL has developed, from initially supporting only redo log group commit to the official 5.6 release supporting group commit for both the redo log and the binlog. Group commit greatly improves MySQL's transaction performance.

Before MySQL 5.6, only redo log group commit was supported. When the binlog is enabled, two-phase commit is used to keep the redo log and the binlog consistent: the redo log is flushed first, then the binlog, and whether the binlog flush succeeded decides whether the transaction succeeded. To keep the binlog order consistent with the commit order, binlog flushes were serialized, so the binlog could not be group committed. MySQL 5.6 optimizes this step by introducing queues that keep the binlog in order, so that the binlog can also be committed in groups. The rest of this article covers three topics: 1: redo group commit; 2: two-phase commit; 3: binlog group commit in 5.6.

1: Redo group commit

WAL (Write-Ahead Logging) is a common technique for achieving transaction durability. The basic principle is that, when a transaction commits, only the transaction's redo log has to be written to disk rather than the modified data pages. Sequential writes of the redo log replace random writes of the pages, which guarantees durability and improves the performance of the database system. Although WAL replaces random writes with sequential writes, every transaction commit still requires a log flush, which is limited by disk I/O and remains the bottleneck for concurrent transactions. The idea of redo group commit is to merge the redo log flushes of multiple transactions in order to reduce the number of disk writes.
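The merging idea can be sketched in a few lines of Python. This is only an illustrative single-threaded model, not InnoDB's implementation: the class `RedoLog`, its methods and the counter `fsync_calls` are hypothetical names, each log record is assumed to receive a monotonically increasing sequence number, and the mutex that protects the real log buffer is left out.

```python
class RedoLog:
    """Toy redo log: one flush covers every record buffered so far."""

    def __init__(self):
        self.buffer = []       # (lsn, record) pairs not yet on disk
        self.next_lsn = 1      # sequence number for the next record
        self.flushed_lsn = 0   # every record with lsn <= flushed_lsn is durable
        self.fsync_calls = 0   # number of simulated disk flushes

    def append(self, record):
        """Copy a record into the log buffer and return its sequence number."""
        lsn = self.next_lsn
        self.next_lsn += 1
        self.buffer.append((lsn, record))
        return lsn

    def commit(self, lsn):
        """Make the log durable up to lsn. Returns True if this call flushed."""
        if self.flushed_lsn >= lsn:
            return False       # an earlier flush already covered this record
        # Flush the whole buffer, not just this transaction's records.
        self.flushed_lsn = self.buffer[-1][0]
        self.buffer.clear()
        self.fsync_calls += 1
        return True


log = RedoLog()
lsn1 = log.append("trx1 update")
lsn2 = log.append("trx2 update")
lsn3 = log.append("trx3 update")
# Trx3 happens to commit first and flushes the records of all three.
print(log.commit(lsn3), log.commit(lsn1), log.commit(lsn2), log.fsync_calls)
# prints: True False False 1
```

Three commits cost a single simulated flush here, because the first committer flushes everything in the buffer and the others find their records already durable.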
In InnoDB's log system, each redo log record has an LSN (Log Sequence Number), and the LSN increases monotonically. Each update operation performed by a transaction generates one or more redo log records. When a transaction copies its log to log_sys_buffer (which is protected by log_mutex), it obtains the current maximum LSN, so the LSNs of different transactions are guaranteed not to overlap. Suppose the maximum LSNs of the logs of three transactions Trx1, Trx2 and Trx3 are LSN1, LSN2 and LSN3 (LSN1 < LSN2 < LSN3) [...] it must be guaranteed that the redo log of any transaction whose binlog is about to be flushed has already been flushed to disk. Delaying the redo write in this way achieves redo log group commit and reduces contention on log_sys->mutex. This strategy was introduced in official MySQL 5.7.6.

2: Two-phase commit

On a single instance, redo log group commit solves the log flushing problem well. So when the binlog is enabled, can the binlog be group committed like the redo log? Once the binlog is enabled, the first problem to solve is how to keep the binlog and the redo log consistent. The binlog is the bridge between master and slave, so if the two logs do not agree on order, master and slave may become inconsistent. MySQL solves this problem with two-phase commit.

Prepare phase: InnoDB flushes the redo log and sets the rollback segment to the Prepared state; the binlog does nothing.

Commit phase: the binlog is written and flushed, and InnoDB releases its locks, releases the rollback segment and sets the commit state.

When a failure occurs and crash recovery is needed, a transaction found in the Prepared state is committed if its binlog exists and rolled back otherwise. Two-phase commit therefore keeps the redo log and the binlog consistent in every case.
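The recovery rule just stated can be written as a small decision function. This is a minimal sketch, assuming each transaction is identified by a hypothetical integer id; the function name `recover` and its arguments are made up for illustration and do not correspond to any MySQL API.

```python
def recover(prepared_ids, binlog_ids):
    """Decide the fate of transactions left in the Prepared state after a crash.

    prepared_ids: ids of transactions the storage engine finds in Prepared state.
    binlog_ids:   ids of transactions whose binlog entry reached the binlog.
    Returns (to_commit, to_rollback), both in the order of prepared_ids.
    """
    in_binlog = set(binlog_ids)
    to_commit = [t for t in prepared_ids if t in in_binlog]
    to_rollback = [t for t in prepared_ids if t not in in_binlog]
    return to_commit, to_rollback


# Transactions 10 and 12 reached the binlog, 11 did not.
print(recover([10, 11, 12], [9, 10, 12]))
# prints: ([10, 12], [11])
```

The binlog acts as the single source of truth: a prepared transaction survives only if its binlog entry does.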
With this problem in mind, consider the InnoDB transaction commit process in more detail. To keep master and slave data consistent, MySQL must keep the binlog and the InnoDB redo log consistent. The slave replays the transactions committed on the master through the binary log, and the master writes the binlog before the engine commit. If the master crashed right after writing the binlog and the transaction were rolled back on restart while the slave had already executed it, master and slave data would be inconsistent. So, with the binlog enabled, how are the binlog and the InnoDB redo log kept consistent? MySQL uses two-phase commit (2PC) and automatically treats an ordinary transaction as an internal XA transaction (an internal distributed transaction). The key point of two-phase commit is that the binlog may be written only after the redo log has been flushed.

Before MySQL 5.6, to ensure that the write order of the binary log at the server layer matched the transaction commit order in the InnoDB layer, MySQL used the prepare_commit_mutex lock internally. Holding this lock, however, defeats group commit. This brings back the question from the previous section: with the binlog enabled, how can group commit be implemented while keeping the redo log and the binlog consistent? Before 5.6, MySQL could not group commit with the binlog enabled. Serializing the binlog through the notorious prepare_commit_mutex kept the redo log and the binlog consistent, but at the cost of performance.
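Why a single lock prevents grouping can be seen in a toy model. The sketch below is an assumption-laden illustration rather than MySQL code: only the name `prepare_commit_mutex` comes from the text, while the class `SerializedCommit`, the helper `run` and the counters are hypothetical, and redo flushing is not modelled.

```python
import threading


class SerializedCommit:
    """Toy model of a commit path guarded by one global lock."""

    def __init__(self):
        self.prepare_commit_mutex = threading.Lock()
        self.binlog = []          # transaction ids in binlog order
        self.engine_commits = []  # transaction ids in engine commit order
        self.binlog_syncs = 0     # number of simulated binlog syncs

    def commit(self, xid):
        # The lock is held from prepare until the engine commit finishes,
        # so no other transaction can write the binlog in between.
        with self.prepare_commit_mutex:
            self.binlog.append(xid)          # write this transaction's binlog
            self.binlog_syncs += 1           # one sync per transaction
            self.engine_commits.append(xid)  # engine commit in the same order


def run(n):
    """Commit n transactions from n threads and return the model."""
    model = SerializedCommit()
    threads = [threading.Thread(target=model.commit, args=(i,)) for i in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return model


model = run(8)
print(model.binlog == model.engine_commits, model.binlog_syncs)
# prints: True 8
```

The two orders always agree, which is the point of the lock, but eight transactions pay for eight syncs because nothing can be batched while the lock is held.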
This situation was clearly intolerable, so various MySQL branches such as MariaDB, Facebook and Percona released patches to improve it, and official MySQL 5.6 finally solved the problem. Since the solutions of the different branches are similar, the implementation is explained here by analyzing that of 5.6.

3: Binlog group commit in 5.6

The basic idea of binlog group commit is to introduce a queue mechanism that keeps the InnoDB commit order consistent with the binlog order, to group transactions, and to hand the binlog flush for the whole group to a single transaction. Binlog group commit divides the commit into three stages: FLUSH, SYNC and COMMIT. Each stage has a queue, and each queue is protected by a mutex. By convention, the first thread to enter a queue is the leader and the other threads are followers. The leader does all the work and, once it has finished, notifies the followers that the flush is complete.

Before MySQL 5.6, the commit had two phases.

First phase (prepare): InnoDB prepare, which acquires prepare_commit_mutex, writes and syncs the redo log, and sets the rollback segment to the Prepared state; the binlog does nothing.

Second phase (commit), in two steps:
1) Write and sync the binlog (this can only be done serially, in the same order as the redo log).
2) InnoDB commit (write the COMMIT flag, then release prepare_commit_mutex, release the rollback segment and set the commit state).

Whether the binlog has been written is the sign that a transaction committed successfully; the InnoDB commit flag is not. This follows from the crash recovery process:
1) During crash recovery, scan the last binlog file and extract the xids in it.
2) InnoDB maintains a list of the transactions in the Prepared state and compares their xids with the xids recorded in the binlog. A transaction whose xid exists in the binlog is committed; otherwise it is rolled back.

This keeps the transaction state consistent between the InnoDB redo log and the binlog. If the crash happens while the InnoDB commit flag is being written, the commit flag is rewritten from the redo log during recovery. If the crash happens in the prepare phase, the xid is not in the binlog and the transaction is rolled back using undo. If the crash happens during the binlog write/sync, before the transaction's binlog entry is complete, the transaction is likewise rolled back using undo.

Before MySQL 5.6, prepare_commit_mutex and sync_log were used to keep the order of the binary log and the storage engine consistent when the binary log was enabled. The prepare_commit_mutex mechanism performed very poorly when transactions were committed with high concurrency, and the binary log could not be group committed.

MySQL 5.6 and later versions focus on solving binlog group commit. The two-phase commit process described above is the pre-5.6 implementation and has a serious defect: when sync_binlog=1, the binlog write/sync in the second phase becomes a bottleneck while a global lock is still held (prepare_commit_mutex: prepare and commit share one lock), which causes a sharp drop in performance. The fix is binlog group commit in MySQL 5.6: as outlined above, transactions are queued and grouped, and the commit phase is split into the FLUSH, SYNC and COMMIT stages.
There is a queue for each stage. The first transaction in a queue is called the leader and the other transactions are called followers; the leader controls the behavior of the followers.

The InnoDB transaction commit process from MySQL 5.6 on is as follows.

First phase (prepare): InnoDB prepare, which acquires prepare_commit_mutex, writes and syncs the redo log, sets the rollback segment to the Prepared state, and releases prepare_commit_mutex when done; the binlog does nothing.

Second phase (commit): the commit phase is split into three stages, and the work of each stage is assigned to one dedicated thread (the leader).

FLUSH stage:
1) Acquire the Lock_log mutex [the leader holds it, followers wait].
2) Fetch a group of binlogs (all transactions in the queue).
3) Write the binlog buffer to the binlog I/O cache.
4) Notify the dump thread to dump the binlog.

SYNC stage (its behavior depends on the sync_binlog parameter):
1) Release the Lock_log mutex and acquire the Lock_sync mutex [the leader holds it, followers wait].
2) Sync the group of binlogs to disk (the sync is the most time-consuming step, assuming sync_binlog is 1).

COMMIT stage (no redo log write is needed here; it was already written in the prepare phase):
1) Release the Lock_sync mutex and acquire the Lock_commit mutex [the leader holds it, followers wait].
2) Iterate over the transactions in the queue and perform the InnoDB commit for each.
3) Release the Lock_commit mutex.
4) Wake up the threads waiting in the queue.

To sum up:
FLUSH stage: write the binlog from the binlog buffer to the binlog I/O cache.
SYNC stage: sync the binlog from the binlog I/O cache to the underlying disk.
COMMIT stage: InnoDB commit, which clears the undo information.

Each stage has its own queue, each queue is protected by its own mutex, and the queues are ordered: a transaction can enter the SYNC queue only after its flush has completed, and the COMMIT queue only after its sync has completed.
However, the work of these three stages can run concurrently: while one group of transactions is in the COMMIT stage, new transactions can already be in the FLUSH stage, which is what achieves true group commit.
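The three-stage pipeline can be modelled with one queue per stage. The following is a simplified, single-threaded sketch under stated assumptions: the classes `StageQueue` and `BinlogGroupCommit` and all method names are hypothetical, the per-stage mutexes and thread wake-ups are omitted, and each stage method stands for the work its leader would do for the whole group.

```python
from collections import deque


class StageQueue:
    """Queue for one stage; the first entrant into an empty queue is the leader."""

    def __init__(self):
        self._items = deque()

    def enroll(self, xid):
        self._items.append(xid)
        return len(self._items) == 1  # True means this transaction is the leader

    def fetch_group(self):
        group = list(self._items)
        self._items.clear()
        return group


class BinlogGroupCommit:
    def __init__(self):
        self.flush_q = StageQueue()
        self.sync_q = StageQueue()
        self.commit_q = StageQueue()
        self.io_cache = []      # binlog written but not yet synced
        self.disk = []          # binlog made durable, in write order
        self.sync_calls = 0     # number of simulated syncs
        self.commit_order = []  # order of simulated engine commits

    def begin_commit(self, xid):
        """A prepared transaction enters the FLUSH queue."""
        return self.flush_q.enroll(xid)

    def flush_stage(self):
        """Write the binlog of every queued transaction to the I/O cache."""
        for xid in self.flush_q.fetch_group():
            self.io_cache.append(xid)
            self.sync_q.enroll(xid)

    def sync_stage(self):
        """Sync the I/O cache to disk once for the whole group."""
        group = self.sync_q.fetch_group()
        if not group:
            return
        self.disk.extend(self.io_cache)
        self.io_cache.clear()
        self.sync_calls += 1
        for xid in group:
            self.commit_q.enroll(xid)

    def commit_stage(self):
        """Commit each transaction of the group in queue order."""
        for xid in self.commit_q.fetch_group():
            self.commit_order.append(xid)


gc = BinlogGroupCommit()
roles = [gc.begin_commit(x) for x in (1, 2, 3)]  # first entrant leads
gc.flush_stage()
gc.sync_stage()
roles += [gc.begin_commit(x) for x in (4, 5)]
gc.flush_stage()   # second group flushes while the first waits to commit
gc.commit_stage()  # first group commits
gc.sync_stage()
gc.commit_stage()
print(roles, gc.sync_calls, gc.disk == gc.commit_order)
# prints: [True, False, False, True, False] 2 True
```

Five transactions need only two simulated syncs, and the binlog order matches the commit order without any lock spanning prepare and commit. A real implementation runs the stages on separate threads and protects each queue with its own mutex.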