Detailed introduction of three replication mechanisms in MySQL: asynchronous replication, semi-synchronous replication and parallel replication


2025-07-01 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)06/01 Report--

Next, let's look at the three replication mechanisms in MySQL: asynchronous replication, semi-synchronous replication and parallel replication. The article is short, and I hope you come away with a clear picture of how each of the three mechanisms works.

Asynchronous replication

Asynchronous replication is the original replication method built into MySQL. Once the replication relationship between the master and the slave is established, an IO thread on the slave pulls the binlog from the master and writes it locally as the relay log. A separate SQL thread on the slave then reads and replays the relay log, which keeps the slave's data in sync with the master's.
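The flow above can be sketched as a toy model. This is only an illustration under simplifying assumptions: the class and method names are hypothetical, the "threads" are plain method calls, and the binlog is a Python list rather than MySQL's real binary log format.

```python
from dataclasses import dataclass, field


@dataclass
class Master:
    """Toy master: every committed write is appended to the binlog."""
    data: dict = field(default_factory=dict)
    binlog: list = field(default_factory=list)

    def commit(self, key, value):
        # Asynchronous replication: apply, log, and return to the client
        # without checking on any slave.
        self.data[key] = value
        self.binlog.append((key, value))


@dataclass
class Slave:
    """Toy slave: a relay log plus the two replication threads as methods."""
    data: dict = field(default_factory=dict)
    relay_log: list = field(default_factory=list)
    applied: int = 0  # number of relay log events already replayed

    def io_thread_step(self, master):
        # IO thread: pull the binlog events not copied yet and append
        # them to the local relay log.
        new_events = master.binlog[len(self.relay_log):]
        self.relay_log.extend(new_events)
        return len(new_events)

    def sql_thread_step(self):
        # SQL thread: replay the pending relay log events in order.
        pending = self.relay_log[self.applied:]
        for key, value in pending:
            self.data[key] = value
        self.applied = len(self.relay_log)
        return len(pending)


master, slave = Master(), Slave()
master.commit("a", 1)
master.commit("b", 2)
lagging = dict(slave.data)   # snapshot before the slave has pulled anything
slave.io_thread_step(master)
slave.sql_thread_step()
```

Between the commits and the two slave steps, `lagging` is still empty, which is exactly the window in which the slave trails the master.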

In general, a slave is read-only and can take on part of the read traffic. One or more slaves can be added as needed, which relieves some of the read pressure on the master. In addition, if the master fails (crash, hardware failure, etc.) and can no longer serve requests, a slave can take over the master's role and remove the single point of failure. Replication therefore serves both disaster recovery and performance.

Semi-synchronous replication

1. Concept

In most cases asynchronous replication is sufficient. But because it is asynchronous, the slave may lag behind the master, and in extreme cases we cannot guarantee that master and slave data are strictly consistent (even if the observed value of Seconds_Behind_Master is 0). For example, when a user issues a commit, the master does not care about the state of the slave and returns to the user as soon as the commit succeeds locally. Suppose the master reports success to the user and then crashes before the transaction's binlog has been sent to the slave. The slave is now one transaction short, and master and slave are inconsistent. This is unacceptable for businesses that require strong consistency, and semi-synchronous replication was created to solve this data consistency problem.

Why is it called semi-synchronous replication? Start with synchronous replication: a transaction is reported to the user as successful only after it has been executed on both master and slave. The core requirement is that master and slave either both execute the transaction or neither does, which involves 2PC (two-phase commit). MySQL implements 2PC only between the local redo log and the binlog, not between master and slave, so it does not provide strictly synchronous replication.

MySQL semi-synchronous replication does not require the slave to execute the transaction. The slave only notifies the master that it has received the log, and the master can then return. The distinction is when the slave acknowledges: if it notifies the master after executing the log, that is synchronous replication; if it notifies the master as soon as the log is received, that is semi-synchronous replication.

How is semi-synchronous replication implemented? The key is the master's special handling of the transaction commit process. There are currently two modes, AFTER_SYNC and AFTER_COMMIT. The main difference is whether the master waits for the slave's ACK after the storage engine commits (AFTER_COMMIT) or before it (AFTER_SYNC).

2. AFTER_COMMIT mode

First, the AFTER_COMMIT mode. Start and End are the moments when the user issues the commit and when the master returns to the user; everything in between is what master and slave do during the commit. When the master commits, it first flushes the transaction's redo log to disk (this actually involves two-phase commit), and then enters the InnoDB commit process, which mainly releases locks and marks the transaction as committed, so other users can now see its updates. After that, the master waits for the slave's ACK, and only when the slave responds does the master return success to the user. This wait is the master-slave synchronization logic, and it is what guarantees master-slave consistency.

3. AFTER_SYNC mode

In AFTER_SYNC mode, the master starts waiting for the slave right after it fsyncs the binlog, and performs the InnoDB commit only after the ACK arrives. By the time other transactions can see the update, the slave has already received the binlog. Even if a failover happens, the slave has the same data as the master, and there is no "phantom read": in AFTER_COMMIT mode, by contrast, other sessions can see the update while the master is still waiting for the ACK, so data that was read before a failover may be missing afterwards.

But if the master crashes before the binlog reaches the slave, the result is the same as in AFTER_COMMIT mode. So in extreme cases a transaction can still leave a semi-synchronous master and slave inconsistent. For the user, however, this is acceptable: the transaction was never reported as successful, so whether it was committed or not, the user has to query or retry to find out whether the update took effect. The same is true on a single machine: if the network goes down just as a successful result is being returned, the user faces the same uncertainty. This is therefore not a problem specific to semi-synchronous replication. For transactions whose commit returned successfully, semi-synchronous replication ensures that master and slave are consistent. From this point of view, semi-synchronous replication does not lose data and can guarantee master-slave strong consistency.
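The difference between the two wait points can be made concrete with a small sketch. It is a simplified model, not MySQL's actual commit code: the function names and step labels are hypothetical, and a missing ACK is modelled as the master stopping at the wait (as if it crashed there) rather than timing out. The two mode names correspond to the values of MySQL's `rpl_semi_sync_master_wait_point` system variable.

```python
VISIBLE = "engine commit (visible to other sessions)"


def semi_sync_commit(wait_point, slave_acks=True):
    """Return the ordered steps a toy master performs for one commit.

    wait_point is "AFTER_SYNC" or "AFTER_COMMIT". If slave_acks is False,
    the master never gets past the wait, and the returned list shows
    everything that had already happened by then.
    """
    if wait_point not in ("AFTER_SYNC", "AFTER_COMMIT"):
        raise ValueError(f"unknown wait point: {wait_point}")

    steps = ["flush redo log", "fsync binlog"]
    if wait_point == "AFTER_COMMIT":
        # Engine commit happens BEFORE waiting for the slave.
        steps.append(VISIBLE)
    steps.append("wait for slave ACK")
    if not slave_acks:
        return steps
    if wait_point == "AFTER_SYNC":
        # Engine commit happens only AFTER the slave has the binlog.
        steps.append(VISIBLE)
    steps.append("return OK to client")
    return steps


def visible_before_ack(wait_point):
    """True if other sessions could see the update before the slave ACKs."""
    return VISIBLE in semi_sync_commit(wait_point, slave_acks=False)


after_commit_leaks = visible_before_ack("AFTER_COMMIT")
after_sync_leaks = visible_before_ack("AFTER_SYNC")
```

In this model `after_commit_leaks` is true and `after_sync_leaks` is false, which is the "phantom read" difference discussed above. In both modes the client is told OK only after the ACK.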

Parallel replication

Semi-synchronous replication addresses master-slave strong consistency, but what about performance? Replication involves two threads on the slave: the IO thread pulls the binlog and the SQL thread replays it. On the slave, pulling and parsing the binlog are both serial, whereas the master processes user requests concurrently. Under high load, the master can generate binlog faster than the slave consumes it, and the slave falls behind. In other words, the pipeline between users and the master is much wider than the one between master and slave.
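A back-of-the-envelope calculation shows how quickly that gap turns into delay. The rates below are made-up numbers for illustration only, and the helper names are hypothetical; real replication lag also depends on transaction size, IO and locking.

```python
def relay_backlog(produce_rate, consume_rate, seconds):
    """Events waiting on the slave after `seconds` of steady load.

    Rates are in events per second. The backlog only grows when the
    master produces faster than the slave consumes.
    """
    growth_per_second = max(produce_rate - consume_rate, 0)
    return growth_per_second * seconds


def catch_up_seconds(backlog, produce_rate, consume_rate):
    """Seconds needed to drain `backlog`, or None if the slave never catches up."""
    if consume_rate <= produce_rate:
        return None
    return backlog / (consume_rate - produce_rate)


# Assumed example: master writes 1000 events/s, serial slave replays 800 events/s.
backlog = relay_backlog(1000, 800, 60)

# Assumed example: parallel replay raises the slave to 1500 events/s.
drain_time = catch_up_seconds(backlog, 1000, 1500)
```

With these assumed rates the slave is 12000 events behind after one minute, and it can only catch up once its replay rate exceeds the master's write rate.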

So how do we parallelize: parallel IO threads or parallel SQL threads? Both are possible, but parallel SQL threads pay off more because the SQL thread does more work (parsing and executing). Parallel IO could be split into two threads, one pulling from the master and one writing the relay log, while parallel SQL threads can work at library (database) level, table level or transaction level as needed. Library-level parallelism has been implemented in the official MySQL release since version 5.6. The parallel replication framework consists of one coordinator thread and several worker threads. The coordinator is responsible for distributing transactions and resolving conflicts, while the workers are only responsible for execution. Transactions on DB1, DB2 and DB3 can then be executed concurrently, which improves replication performance. Sometimes library-level concurrency is not enough, and table-level or even finer-grained transaction-level concurrency is needed.

How does parallel replication handle conflicts?

Concurrency is attractive, but it cannot be a free-for-all, or the data will be corrupted. The master uses its locking mechanism to keep concurrent transactions in order. What about parallel replication? The slave must replay conflicting transactions in the same order as they ran on the master, so as long as the binlog is read sequentially, transactions that do not conflict can be executed concurrently. For library-level concurrency, the coordinator ensures that transactions on the same library are executed by the same worker thread. For table-level concurrency, the coordinator ensures that transactions on the same table are executed serially. For transaction-level concurrency, it ensures that transactions operating on the same row are executed serially.

Is finer granularity always better for performance?

Not necessarily. Compared with serial replication, parallel replication adds a coordinator thread, and one of its important jobs is resolving conflicts. The finer the granularity, the more conflicts there may be to check for. The transactions may end up being executed serially anyway, after a lot of effort has been spent on conflict detection.
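The coordinator's dispatch rule at the three granularities can be sketched as follows. This is a minimal illustration, not MySQL's implementation: the function names and the dictionary layout of a transaction are hypothetical, each transaction is assumed to touch a single row, and new conflict keys are assigned to workers round-robin.

```python
from collections import defaultdict


def conflict_key(txn, granularity):
    """Key that two transactions share when they must run serially."""
    if granularity == "database":
        return (txn["db"],)
    if granularity == "table":
        return (txn["db"], txn["table"])
    if granularity == "row":
        return (txn["db"], txn["table"], txn["row"])
    raise ValueError(f"unknown granularity: {granularity}")


def dispatch(binlog, granularity, num_workers):
    """Coordinator: read the binlog in order and queue each transaction.

    Transactions with the same conflict key always go to the same worker
    queue, so their relative binlog order is preserved.
    """
    queues = defaultdict(list)
    owner = {}  # conflict key -> worker index
    for txn in binlog:
        key = conflict_key(txn, granularity)
        if key not in owner:
            # Assumption: the first time a key is seen, pick a worker round-robin.
            owner[key] = len(owner) % num_workers
        queues[owner[key]].append(txn["id"])
    return dict(queues)


binlog = [
    {"id": 1, "db": "DB1", "table": "t1", "row": 1},
    {"id": 2, "db": "DB2", "table": "t1", "row": 1},
    {"id": 3, "db": "DB1", "table": "t2", "row": 7},
    {"id": 4, "db": "DB3", "table": "t1", "row": 1},
    {"id": 5, "db": "DB1", "table": "t1", "row": 1},
]

by_database = dispatch(binlog, "database", num_workers=3)
by_table = dispatch(binlog, "table", num_workers=3)
```

At database granularity all three DB1 transactions share one worker. At table granularity transaction 3 (DB1.t2) moves to its own worker, but the coordinator now tracks four keys instead of three, which is the extra bookkeeping that finer granularity brings.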

