In addition to Weibo, there is also WeChat
Please pay attention
WeChat public account
Shulou
2025-04-07 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Database >
Share
Shulou(Shulou.com)06/01 Report--
I. the background of parallel replication
First of all, why is there the concept of parallel replication?
1. DBA should know that MySQL replication is based on binlog.
2. MySQL replication consists of two parts, the IO thread and the SQL thread.
3. The IO thread is mainly used to pull the binlog passed by Master and write it to relay log.
4. The SQL thread is mainly responsible for parsing relay log and applying it to slave
5. In any case, IO and SQL threads are single-threaded, while master is multithreaded, so there will inevitably be delays. In order to solve this problem, multithreading came into being.
6. IO multithreading?
6.1 IO there is no need for multithreading, because IO threads are not the bottleneck.
7. SQL multithreading?
7.1 that's right, the latest 5.6 slave 5.7 8.0 all implement multithreading on SQL threads to improve SQL concurrency.
Next, let's take a look at MySQL's efforts and achievements in parallel replication.
2. Key points
Whether it can be parallel or not, the key is whether there are lock conflicts between multiple transactions, which is the key. The following principle of parallel replication is to see how to avoid lock conflicts
3. Parallel replication of MySQL5.6 based on schema
Slave-parallel-type=DATABASE (transactions in different libraries, no lock conflicts)
As mentioned before, the purpose of parallel replication is to make slave run as multithreaded as possible. Of course, multithreading based on library level is also a way (transactions of different libraries, no lock conflicts).
First of all, let's talk about the advantages: it is relatively easy to implement and easy to use for users.
Let's talk about the disadvantages: because it is based on the library, the parallel granularity is very coarse, and now the architecture of many companies is one library and one instance. For such an architecture, 5.6 parallel replication is powerless. Of course, there is the order of master-slave transactions, which is also a big problem for 5.6.
Don't say much. Let's take a picture.
4. Parallel replication of MySQL5.7 based on group commit
Slave-parallel-type=LOGICAL_CLOCK: Commit-Parent-Based mode (the same set of transactions [last-commit], no lock conflicts. In the same group, there must be no conflict, otherwise there is no way to become the same group)
Slave-parallel-type=LOGICAL_CLOCK: Lock-Based mode (even if it is not the same set of transactions, as long as there are no lock conflicts between transactions [prepare phase], they can be concurrency. Not in the same group, as long as N transaction prepare phases can overlap, which means there is no lock conflict)
Group commit, which has been described in detail in the previous article, is not explained here. MySQL5.7 also marks the transactions of each group when the group commits, so now think about it for the convenience of MTS.
Let's look at a group of binlog first.
Last_committed=0 sequence_number=1last_committed=1 sequence_number=2last_committed=2 sequence_number=3last_committed=3 sequence_number=4last_committed=4 sequence_number=5last_committed=4 sequence_number=6last_committed=4 sequence_number=7last_committed=6 sequence_number=8last_committed=6 sequence_number=9last_committed=9 sequence_number=10
4.1 Commit-Parent-Based mode
4.2 Lock-Based mode
5. Parallel replication of MySQL8.0 based on write-set
Conflict detection based on primary key (binlog_transaction_depandency_tracking = COMMIT_ORDERE | WRITESET | WRITESET_SESSION. If there is no conflict between the primary key or non-null unique key of the modified row, it can run in parallel.
5.7.22 also supports the write-set mechanism
Transaction dependency: binlog_transaction_depandency_tracking = COMMIT_ORDERE | WRITESET | WRITESET_SESSION
COMMIT_ORDERE: continue to submit based on group
WRITESET: determining transaction dependencies based on write sets
WRITESET_SESSION: based on write sets, but transactions in the same session will not have the same last_committed
Transaction detection algorithm: transaction_write_set_extraction = OFF | XXHASH64 | MURMUR32
MySQL will have a variable to store the committed transaction Hash value, and the value of the primary key (or unique key) modified by the committed transaction will be compared with the collection of that variable after hash to determine whether the line change conflicts with it, and to determine the dependency.
For the variables mentioned here, you can set the size: binlog_transaction_dependency_history_size.
This granularity reaches the row level. At this time, the granularity of parallelism is finer and the speed of parallelism will be faster. In some cases, it is not too much to say that the parallelism of slave exceeds that of master (master is a single-threaded write, and slave can also be played back in parallel).
6. How to make the parallel replication of slave coincide with the execution order of master transactions?
After 5.7.19, you can set slave_preserve_commit_order = 1
Official explanation:
For multithreaded slaves, enabling this variable ensures that transactions are externalized on the slave in the same order as they appear in the slave's relay log.
Setting this variable has no effect on slaves for which multithreading is not enabled.
All replication threads (for all replication channels if you are using multiple replication channels) must be stopped before changing this variable.
-log-bin and-log-slave-updates must be enabled on the slave.
In addition-- slave-parallel-type must be set to LOGICAL_CLOCK.
Once a multithreaded slave has been started, transactions can begin to execute in parallel.
With slave_preserve_commit_order enabled, the executing thread waits until all previous transactions are committed before committing.
While the slave thread is waiting for other workers to commit their transactions it reports its status as Waiting for preceding transaction to commit.
The general principle of implementation is that the excecution phase can be executed in parallel, and when binlog flush, it is carried out sequentially. When the engine layer submits, it is completed according to the queuing order of binlog_order_commit.
In other words, if this parameter is set, how master is parallel, what is slave doing in parallel?
Summary
The above is the whole content of this article, I hope that the content of this article has a certain reference and learning value for your study or work, if you have any questions, you can leave a message and exchange, thank you for your support.
Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.
Views: 293
*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.
Continue with the installation of the previous hadoop.First, install zookooper1. Decompress zookoope
"Every 5-10 years, there's a rare product, a really special, very unusual product that's the most un
© 2024 shulou.com SLNews company. All rights reserved.