Explain the parallel replication of MySQL in depth and simplicity 07/03 Update SLTechnology News&Howtos

Explain the parallel replication of MySQL in depth and simplicity

2025-07-03 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Database >

Shulou(Shulou.com)06/01 Report--

I. the background of parallel replication

First of all, why is there the concept of parallel replication?

1. DBA should know that MySQL replication is based on binlog.

2. MySQL replication consists of two parts, the IO thread and the SQL thread.

3. The IO thread is mainly used to pull the binlog passed by Master and write it to relay log.

4. The SQL thread is mainly responsible for parsing relay log and applying it to slave

5. In any case, IO and SQL threads are single-threaded, while master is multithreaded, so there will inevitably be delays. In order to solve this problem, multithreading came into being.

6. IO multithreading?

6.1 IO there is no need for multithreading, because IO threads are not the bottleneck.

7. SQL multithreading?

7.1 that's right, the latest 5.6 slave 5.7 8.0 all implement multithreading on SQL threads to improve SQL concurrency.

Next, let's take a look at MySQL's efforts and achievements in parallel replication.

2. Key points

Whether it can be parallel or not, the key is whether there are lock conflicts between multiple transactions, which is the key. The following principle of parallel replication is to see how to avoid lock conflicts

3. Parallel replication of MySQL5.6 based on schema

Slave-parallel-type=DATABASE (transactions in different libraries, no lock conflicts)

As mentioned before, the purpose of parallel replication is to make slave run as multithreaded as possible. Of course, multithreading based on library level is also a way (transactions of different libraries, no lock conflicts).

First of all, let's talk about the advantages: it is relatively easy to implement and easy to use for users.

Let's talk about the disadvantages: because it is based on the library, the parallel granularity is very coarse, and now the architecture of many companies is one library and one instance. For such an architecture, 5.6 parallel replication is powerless. Of course, there is the order of master-slave transactions, which is also a big problem for 5.6.

Don't say much. Let's take a picture.

4. Parallel replication of MySQL5.7 based on group commit

Slave-parallel-type=LOGICAL_CLOCK: Commit-Parent-Based mode (the same set of transactions [last-commit], no lock conflicts. In the same group, there must be no conflict, otherwise there is no way to become the same group)

Slave-parallel-type=LOGICAL_CLOCK: Lock-Based mode (even if it is not the same set of transactions, as long as there are no lock conflicts between transactions [prepare phase], they can be concurrency. Not in the same group, as long as N transaction prepare phases can overlap, which means there is no lock conflict)

Group commit, which has been described in detail in the previous article, is not explained here. MySQL5.7 also marks the transactions of each group when the group commits, so now think about it for the convenience of MTS.

Let's look at a group of binlog first.

Last_committed=0 sequence_number=1last_committed=1 sequence_number=2last_committed=2 sequence_number=3last_committed=3 sequence_number=4last_committed=4 sequence_number=5last_committed=4 sequence_number=6last_committed=4 sequence_number=7last_committed=6 sequence_number=8last_committed=6 sequence_number=9last_committed=9 sequence_number=10

4.1 Commit-Parent-Based mode

4.2 Lock-Based mode

5. Parallel replication of MySQL8.0 based on write-set

Conflict detection based on primary key (binlog_transaction_depandency_tracking = COMMIT_ORDERE | WRITESET | WRITESET_SESSION. If there is no conflict between the primary key or non-null unique key of the modified row, it can run in parallel.

5.7.22 also supports the write-set mechanism

Transaction dependency: binlog_transaction_depandency_tracking = COMMIT_ORDERE | WRITESET | WRITESET_SESSION

COMMIT_ORDERE: continue to submit based on group

WRITESET: determining transaction dependencies based on write sets

WRITESET_SESSION: based on write sets, but transactions in the same session will not have the same last_committed

Transaction detection algorithm: transaction_write_set_extraction = OFF | XXHASH64 | MURMUR32

MySQL will have a variable to store the committed transaction Hash value, and the value of the primary key (or unique key) modified by the committed transaction will be compared with the collection of that variable after hash to determine whether the line change conflicts with it, and to determine the dependency.

For the variables mentioned here, you can set the size: binlog_transaction_dependency_history_size.

This granularity reaches the row level. At this time, the granularity of parallelism is finer and the speed of parallelism will be faster. In some cases, it is not too much to say that the parallelism of slave exceeds that of master (master is a single-threaded write, and slave can also be played back in parallel).

6. How to make the parallel replication of slave coincide with the execution order of master transactions?

After 5.7.19, you can set slave_preserve_commit_order = 1

Official explanation:

For multithreaded slaves, enabling this variable ensures that transactions are externalized on the slave in the same order as they appear in the slave's relay log.

Setting this variable has no effect on slaves for which multithreading is not enabled.

All replication threads (for all replication channels if you are using multiple replication channels) must be stopped before changing this variable.

-log-bin and-log-slave-updates must be enabled on the slave.

In addition-- slave-parallel-type must be set to LOGICAL_CLOCK.

Once a multithreaded slave has been started, transactions can begin to execute in parallel.

With slave_preserve_commit_order enabled, the executing thread waits until all previous transactions are committed before committing.

While the slave thread is waiting for other workers to commit their transactions it reports its status as Waiting for preceding transaction to commit.

The general principle of implementation is that the excecution phase can be executed in parallel, and when binlog flush, it is carried out sequentially. When the engine layer submits, it is completed according to the queuing order of binlog_order_commit.

In other words, if this parameter is set, how master is parallel, what is slave doing in parallel?

Summary

The above is the whole content of this article, I hope that the content of this article has a certain reference and learning value for your study or work, if you have any questions, you can leave a message and exchange, thank you for your support.

Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.

*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.