2025-01-17 Update From: SLTechnology News&Howtos
First: an initial understanding of MySQL 5.7 parallel replication

MySQL 5.7 parallel replication introduces two values, last_committed and sequence_number. last_committed records, at the moment a transaction commits, the number of the last transaction that had already committed; transactions that commit together in the same group on the primary are given the same last_committed. Transactions with the same last_committed therefore belong to one group and can be replayed in parallel on the replica. In a decoded binlog you can see, for example, several transactions sharing last_committed=9892, each carrying its own sequence_number, the per-transaction counter used for parallel replication. sequence_number restarts at 1 in each new binlog file and has no relation to the transaction's GTID or XID. To inspect a group:

cat binlog.log | grep GTID | grep last_committed | grep 19408

Common reasons for delay: the table has no primary key, transactions are too large, parallelism is insufficient, parameters are set inappropriately, or the replica hardware is poor.
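To make the grouping concrete, here is a small sketch. The sample lines below imitate decoded GTID event headers (the timestamps, positions, and values are made up for illustration), and the pipeline counts how many transactions share each last_committed value; the larger the count, the more transactions the replica can apply in parallel.

```shell
# Fabricated sample of decoded GTID event lines (illustrative only).
cat > /tmp/binlog.txt <<'EOF'
#170117 10:00:01 server id 1  end_log_pos 259 GTID last_committed=9892 sequence_number=9893
#170117 10:00:01 server id 1  end_log_pos 518 GTID last_committed=9892 sequence_number=9894
#170117 10:00:02 server id 1  end_log_pos 777 GTID last_committed=9894 sequence_number=9895
EOF

# Count transactions per last_committed value (group size = parallelism potential).
grep -o 'last_committed=[0-9]*' /tmp/binlog.txt | sort | uniq -c
```

Here the first two transactions form one group (same last_committed=9892) and could be replayed concurrently, while the third depends on the second.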
Second: parameters related to MySQL 5.7 parallel replication

When master-slave delay is excessive, the cause is usually a missing primary key or unreasonable configuration. The relevant parameters, introduced in turn below, are:

binlog_group_commit_sync_delay
binlog_group_commit_sync_no_delay_count
max_allowed_packet
slave_max_allowed_packet
slave_preserve_commit_order
slave_pending_jobs_size_max

1: binlog_group_commit_sync_delay

Command-line format: --binlog-group-commit-sync-delay=#
Scope: Global | Dynamic: Yes | Type: Integer
Default: 0 | Minimum: 0 | Maximum: 1000000

1) This parameter controls how many microseconds MySQL waits before performing a binlog group commit. Waiting lets more transactions accumulate in each commit group (because the server waited those n microseconds), and since the transactions in one group can be replayed in parallel on the replica, setting this parameter greater than 0 helps the replica apply logs faster and reduces delay: it increases the number of transactions the replica can commit in parallel, and therefore the replica's capacity for parallel execution. To benefit from this effect, the replica must set slave_parallel_type=LOGICAL_CLOCK, and the effect is more pronounced when binlog_transaction_dependency_tracking=COMMIT_ORDER is also set. When tuning binlog_group_commit_sync_delay, you must weigh the throughput of both the primary and the replica.

2) However, setting binlog_group_commit_sync_delay increases transaction latency on the primary, which may affect client applications.
In addition, under highly concurrent workloads the delay can increase contention, reducing throughput and overall database performance; you may see a lot of RECORD LOCK information in the error log.

3) With sync_binlog=0 or sync_binlog=1, the delay specified by binlog_group_commit_sync_delay applies to every group commit. With sync_binlog set to a value n greater than 1, the delay applies once every n binary-log commit groups.

Suggestion: set this parameter according to your actual situation. Evaluate MySQL's concurrency, calculate the transaction volume per unit of time, then set binlog_group_commit_sync_delay and binlog_group_commit_sync_no_delay_count (the cap on transactions per group) together, and always cap the group size. If you do not cap it and your real concurrency is high, each group can grow very large. Group commit proceeds in three stages, each with its own queue; when a group holds too many transactions, the in-memory queues run out of room, temporary files are used, commit performance drops, and delay grows. So use binlog_group_commit_sync_no_delay_count to limit the transactions per group: make groups as large as reasonably possible, since those transactions can be executed concurrently on the replica and, without the delay, many of them would be applied serially, but keep the size within a sensible range, because oversized groups degrade performance and add delay.

Extended parameter binlog_transaction_dependency_tracking: controls how MySQL decides which transactions can be replicated in parallel.
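As a sketch of the suggestion above, a my.cnf fragment for the primary; the numbers are illustrative assumptions, not recommendations, and should be derived from your own measured concurrency.

```ini
[mysqld]
# Wait up to 100 microseconds before each binlog group commit so more
# transactions land in the same group (same last_committed).
binlog_group_commit_sync_delay = 100
# ...but stop waiting early once 20 transactions have queued, so a
# group never grows unboundedly under high concurrency.
binlog_group_commit_sync_no_delay_count = 20
```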
Conflict detection can be based on primary keys: rows whose primary key or non-null unique key values do not overlap are considered non-conflicting. Starting with 5.7.22, MySQL also supports write-set-based transaction dependencies:

binlog_transaction_dependency_tracking = COMMIT_ORDER | WRITESET | WRITESET_SESSION

COMMIT_ORDER: dependencies continue to be based on group commit.
WRITESET: dependencies are determined from write sets (problem!).
WRITESET_SESSION: based on write sets, but transactions in the same session never receive the same last_committed.

Transaction detection algorithm: transaction_write_set_extraction = OFF | XXHASH64 | MURMUR32. MySQL keeps, in a variable-sized history, the hash values of committed transactions: the primary key (or unique key) values modified by each committed transaction are hashed and compared against that set to determine whether a row change conflicts, and thus which transactions depend on which. The size of that history is controlled by binlog_transaction_dependency_history_size. With write sets, the granularity of parallelism reaches the row level; the parallelism is finer and faster, and in some cases it is no exaggeration to say the replica's parallelism exceeds the primary's (the primary writes single-threaded per session, while the replica can replay in parallel).
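A my.cnf sketch for write-set tracking, assuming MySQL 5.7.22+; the history size shown is the server default, included only to make the knob visible.

```ini
[mysqld]
# Track dependencies by the rows each transaction writes,
# not just by commit grouping, for row-level parallelism.
binlog_transaction_dependency_tracking = WRITESET
# Hash function used to fingerprint modified primary/unique key values.
transaction_write_set_extraction = XXHASH64
# How many row hashes from recent transactions to keep for conflict checks.
binlog_transaction_dependency_history_size = 25000
```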
2: binlog_group_commit_sync_no_delay_count

Command-line format: --binlog-group-commit-sync-no-delay-count=#
Scope: Global | Dynamic: Yes | Type: Integer
Default: 0 | Minimum: 0 | Maximum: 1000000

This parameter is only meaningful together with binlog_group_commit_sync_delay. When the number of transactions already waiting reaches the count set here, the wait ends early instead of running out the full microsecond delay configured in binlog_group_commit_sync_delay. In other words, this parameter guarantees that no single commit group grows too large.
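The interaction of the two parameters can be sketched with back-of-the-envelope arithmetic; the workload numbers below are assumptions, not measurements.

```shell
# Assumed workload: 5000 transactions/second arriving on the primary.
tps=5000
delay_us=1000          # binlog_group_commit_sync_delay (microseconds)
no_delay_count=10      # binlog_group_commit_sync_no_delay_count

# Transactions expected to arrive during one delay window.
arrivals=$(( tps * delay_us / 1000000 ))

# The group commits as soon as either limit is hit, so the expected
# group size is the smaller of the two values.
if [ "$arrivals" -lt "$no_delay_count" ]; then
  group=$arrivals
else
  group=$no_delay_count
fi
echo "expected transactions per group: $group"
```

With these assumed numbers, about 5 transactions arrive per window, so the delay expires before the count cap is reached; at double the load, the cap of 10 would cut the wait short instead.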
3: slave_preserve_commit_order

Command-line format: --slave-preserve-commit-order[={OFF|ON}]
Scope: Global | Dynamic: Yes | Type: Boolean
Default: OFF

1) On a multithreaded replica, setting this variable to 1 ensures that transactions commit on the replica in the same order in which they appear in its relay log, and prevents gaps in the sequence of executed transactions. It has no effect on replicas that do not enable multithreading. Note that slave_preserve_commit_order=1 does not preserve the order of non-transactional DML updates, so these may commit before transactions that precede them in the relay log, which can itself cause gaps.

2) slave_preserve_commit_order=1 requires --log-bin and --log-slave-updates to be enabled on the replica, and slave_parallel_type set to LOGICAL_CLOCK. All replication threads must be stopped before changing this variable (on all channels, if multiple replication channels are used).

3) With slave_preserve_commit_order enabled, a worker thread waits for all preceding transactions to commit before committing its own; while waiting for another worker to commit, it reports its state as "Waiting for preceding transaction to commit". Enabling this parameter may therefore worsen master-slave delay.

4) With slave_preserve_commit_order=0, transactions applied in parallel on the replica may commit out of order. Checking the most recently executed transaction therefore does not guarantee that all earlier transactions from the source have been executed on the replica; what you see on the replica may be an inconsistent state caused by the different commit order, not by delay. There may also be gaps in the sequence of transactions executed from the replica's relay log.
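A replica-side my.cnf sketch of the prerequisites listed above; the log basename and worker count are illustrative assumptions.

```ini
[mysqld]
log-bin                     = mysql-bin     # required by slave_preserve_commit_order
log-slave-updates           = ON            # also required
slave-parallel-type         = LOGICAL_CLOCK
slave-parallel-workers      = 8             # illustrative; >1 makes the replica multithreaded
slave-preserve-commit-order = ON
```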
This affects logging and crash recovery on replicas that use multithreaded appliers.

4: binlog_order_commits

Command-line format: --binlog-order-commits[={OFF|ON}]
Scope: Global | Dynamic: Yes | Type: Boolean
Default: ON

When this variable is enabled on a server that writes a binary log (the default), commit instructions are issued to the storage engine from a single thread, serially, so transactions always commit in the same order in which they are written to the binary log. Disabling it allows multiple threads to issue commit instructions; used together with binary log group commit, this prevents the commit rate of a single transaction from becoming a throughput bottleneck and may improve performance. A transaction is written to the binary log once all involved storage engines have confirmed it is ready to commit; the group commit logic then commits the whole group of transactions after the binary log write. With binlog_order_commits disabled, transactions within a commit group may commit in a different order than they appear in the binary log, because the process uses multiple threads (transactions from a single client still commit in chronological order). In many cases this does not matter: transactions in one group commit can be replicated in parallel precisely because executing them separately, in any order, produces a consistent result. It is suggested that on some unimportant replicas you can turn this parameter off to improve replication performance.
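Following that suggestion, a minimal my.cnf sketch for a replica whose binlog ordering nothing depends on; whether this is safe is an assumption you must verify (it is not, for example, if the replica's binlog feeds backups or further replicas).

```ini
[mysqld]
# Allow multiple threads to issue engine commits; engine commit order
# may then diverge from binlog write order within a group.
binlog_order_commits = OFF
```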
Fifth: about packet sizes

1. slave_pending_jobs_size_max: in multithreaded replication, the maximum memory occupied by pending (waiting) events in the worker queues; the default is 16M. If memory is plentiful or the delay is large, it can be raised appropriately. Note that this value should be larger than the primary's max_allowed_packet, and changing it requires restarting replication (stop slave; start slave) to take effect. Check the primary's max_allowed_packet:

mysql> show variables like 'max_allowed_packet';
+--------------------+-----------+
| Variable_name      | Value     |
+--------------------+-----------+
| max_allowed_packet | 134217728 |
+--------------------+-----------+
-- 134217728 bytes = 128M

2. max_allowed_packet

Command-line format: --max-allowed-packet=#
Scope: Global, Session | Dynamic: Yes | Type: Integer
Default: 4194304 (4M) | Minimum: 1024 | Maximum: 1073741824 (1G)

This parameter limits the size of packets the server accepts. Note that it should be set as a positive integer multiple of 1024, because on some servers MySQL may not be able to convert M to K correctly. Setting it too high may increase master-slave delay when a transaction is too large. For example:

set global max_allowed_packet = 2*1024*1024*10;  -- 20M

3. slave_max_allowed_packet: this variable sets the maximum packet size for the replica's SQL and I/O threads, so that large row-based replication updates do not fail for exceeding the maximum allowed packet size. Setting this variable takes effect immediately on all replication channels, including running ones.
The value of this global variable is always a positive integer multiple of 1024; if it is set to something that is not, the value is rounded down to the nearest multiple of 1024 before being stored and used. Setting slave_max_allowed_packet to 0 results in 1024 being used. (In all these cases, a truncation warning is issued.) The default and maximum values are 1073741824 (1 GB); the minimum is 1024.

Command-line format: --slave-max-allowed-packet=#
Scope: Global | Dynamic: Yes | Type: Integer
Default: 1073741824 | Minimum: 1024 | Maximum: 1073741824
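The rounding rule above can be sketched as a small shell function; the function name is made up for illustration, and the arithmetic simply mirrors the documented round-down-to-1024 behavior.

```shell
# Sketch of the documented rounding for slave_max_allowed_packet:
# values round DOWN to a multiple of 1024, and anything below 1024
# (including 0) falls back to the 1024 minimum.
effective_packet() {
  v=$1
  v=$(( v / 1024 * 1024 ))        # truncate to a multiple of 1024
  if [ "$v" -lt 1024 ]; then      # enforce the 1024 minimum (covers the 0 case)
    v=1024
  fi
  echo "$v"
}

effective_packet 0          # -> 1024
effective_packet 5000       # -> 4096
effective_packet 20971520   # -> 20971520 (already 20*1024*1024)
```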