Analysis of replication blocking when the original master rejoins a semi-sync cluster

2025-01-15 Update From: SLTechnology News&Howtos

Woqu Technology Peng Xusheng

Some time ago, while helping a customer troubleshoot, we ran into a problem after a master-slave switchover under semi-sync replication: when the original master rejoined the cluster as a slave, replication was blocked and data stopped synchronizing. The case is a useful reference, so it is written up here.

problem symptoms

The customer performed a manual switchover in a semi-synchronous replication environment with one master and two slaves, and then tried to add the original master back into the cluster as a slave. Data written to the new cluster was not synchronized to this slave (the original master). Checking its replication status showed that both the IO thread and the SQL thread were in the Yes state, but Seconds_Behind_Master was greater than 0.

mysql> show slave status\G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
..............................................
              Master_Log_File: mysql-bin.000007
          Read_Master_Log_Pos: 540
               Relay_Log_File: mysql-relay-bin.000006
                Relay_Log_Pos: 367
        Relay_Master_Log_File: mysql-bin.000007
..............................................
      Slave_SQL_Running_State: Waiting for semi-sync ACK from slave
1 row in set (0.00 sec)

show processlist confirms that the SQL thread is stuck in the state Waiting for semi-sync ACK from slave. But this slave (the original master) has no slaves of its own, so why is it waiting for an ACK from a slave?

mysql> show processlist;
+----+-------------+-----------+------+---------+------+--------------------------------------+------------------+
| Id | User        | Host      | db   | Command | Time | State                                | Info             |
+----+-------------+-----------+------+---------+------+--------------------------------------+------------------+
|  1 | system user |           | NULL | Connect |  540 | Waiting for semi-sync ACK from slave | NULL             |
|  2 | system user |           | NULL | Connect | 2191 | Waiting for master to send event     | NULL             |
|  4 | root        | localhost | test | Query   |    0 | starting                             | show processlist |
+----+-------------+-----------+------+---------+------+--------------------------------------+------------------+
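To pick out only the threads stuck in this state, the process list can also be queried as a table. This is a minimal sketch, assuming MySQL 5.7, where information_schema.PROCESSLIST is available; the 'Waiting for semi-sync ACK%' pattern is taken from the SQL thread state reported earlier.

```sql
-- List threads that are blocked waiting for a semi-sync ACK,
-- longest-waiting first.
SELECT ID, USER, COMMAND, TIME, STATE
FROM information_schema.PROCESSLIST
WHERE STATE LIKE 'Waiting for semi-sync ACK%'
ORDER BY TIME DESC;
```

A row with USER = 'system user' would be a replication thread rather than a client session.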

Since the SQL thread is waiting on semi-sync, we start by checking the semi-sync status and settings.

mysql> show global status like 'rpl_semi_sync%';
+--------------------------------------------+-------+
| Variable_name                              | Value |
+--------------------------------------------+-------+
| Rpl_semi_sync_master_clients               | 0     |
| Rpl_semi_sync_master_net_avg_wait_time     | 0     |
| Rpl_semi_sync_master_net_wait_time         | 0     |
| Rpl_semi_sync_master_net_waits             | 0     |
| Rpl_semi_sync_master_no_times              | 0     |
| Rpl_semi_sync_master_no_tx                 | 0     |
| Rpl_semi_sync_master_status                | ON    |
| Rpl_semi_sync_master_timefunc_failures     | 0     |
| Rpl_semi_sync_master_tx_avg_wait_time      | 0     |
| Rpl_semi_sync_master_tx_wait_time          | 0     |
| Rpl_semi_sync_master_tx_waits              | 0     |
| Rpl_semi_sync_master_wait_pos_backtraverse | 0     |
| Rpl_semi_sync_master_wait_sessions         | 0     |
| Rpl_semi_sync_master_yes_tx                | 0     |
| Rpl_semi_sync_slave_status                 | ON    |
+--------------------------------------------+-------+
15 rows in set (0.00 sec)

The status variables above show Rpl_semi_sync_master_status=ON and Rpl_semi_sync_slave_status=ON. The strange part is Rpl_semi_sync_master_status=ON: this host is now a slave, yet it still reports itself as an active semi-sync master.
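The values that matter for this diagnosis can be pulled out in one statement. This is a sketch, assuming both semi-sync plugins are installed on the host being checked (if they are not, these status variables do not exist and no rows are returned).

```sql
-- A host that reports master status ON with 0 clients is acting as a
-- semi-sync master with no slave available to acknowledge its transactions.
SHOW GLOBAL STATUS
WHERE Variable_name IN ('Rpl_semi_sync_master_status',
                        'Rpl_semi_sync_master_clients',
                        'Rpl_semi_sync_slave_status');
```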

Recall how semi-synchronous replication works: when data changes, the master writes the binlog, waits for a slave to receive the event and return an ACK, and only then commits at the storage engine layer. Because the SQL thread on the slave (the original master) is stuck waiting for that ACK, the changes made on the new master never become visible on the slave (the original master). By default, semi-synchronous replication falls back to asynchronous replication only after the wait for an ACK exceeds rpl_semi_sync_master_timeout.

Here the slave still considers itself a semi-sync master, but no slave is connected to it, so it keeps waiting for an ACK that never arrives. Next we check the value of rpl_semi_sync_master_timeout.

mysql> show global variables like 'rpl_semi_sync%';
+-------------------------------------------+------------+
| Variable_name                             | Value      |
+-------------------------------------------+------------+
| rpl_semi_sync_master_enabled              | ON         |
| rpl_semi_sync_master_timeout              | 1000000    |
| rpl_semi_sync_master_trace_level          | 32         |
| rpl_semi_sync_master_wait_for_slave_count | 1          |
| rpl_semi_sync_master_wait_no_slave        | ON         |
| rpl_semi_sync_master_wait_point           | AFTER_SYNC |
| rpl_semi_sync_slave_enabled               | ON         |
| rpl_semi_sync_slave_trace_level           | 32         |
+-------------------------------------------+------------+
8 rows in set (0.00 sec)

The semi-sync parameters show that rpl_semi_sync_master_enabled=ON and rpl_semi_sync_slave_enabled=ON are both set on the slave (the original master), and that rpl_semi_sync_master_timeout=1000000. The variable is expressed in milliseconds, so this is 1,000 seconds; the default is 10000 ms (10 seconds).

Surprisingly, the customer had set rpl_semi_sync_master_timeout to 1,000 seconds. After rejoining the cluster, the original master therefore has to wait 1,000 seconds for the timeout to expire before it falls back to asynchronous replication and resumes applying data from the cluster. This is the root cause of the slave (the original master) being unable to continue synchronizing data from the cluster.
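Because rpl_semi_sync_master_timeout is expressed in milliseconds, it helps to convert it before judging how long a blocked thread will wait. A small sketch, assuming the master-side semi-sync plugin is installed so that the variable exists:

```sql
-- Show the configured timeout in milliseconds and in seconds.
SELECT @@global.rpl_semi_sync_master_timeout        AS timeout_ms,
       @@global.rpl_semi_sync_master_timeout / 1000 AS timeout_seconds;
```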

After talking with the customer, we learned that the large rpl_semi_sync_master_timeout value was deliberate. They want strong data consistency and do not want semi-synchronous replication to fall back to asynchronous replication under any circumstances.

principle analysis

The discussion above involves several semi-sync replication parameters that some readers may not be familiar with. The following is a brief explanation of how MySQL semi-sync replication is installed and configured, and how it works.

To enable semi-synchronous replication in MySQL version 5.7, install the semisync_master.so library on the master side and configure my.cnf

mysql> INSTALL PLUGIN rpl_semi_sync_master SONAME 'semisync_master.so';
mysql> set global rpl_semi_sync_master_enabled=ON;

Install the semisync_slave.so library on the slave side and configure my.cnf

mysql> INSTALL PLUGIN rpl_semi_sync_slave SONAME 'semisync_slave.so';
mysql> set global rpl_semi_sync_slave_enabled=ON;

When the Rpl_semi_sync_master_status status variable on the master side shows ON, the master has entered semi-synchronous replication mode.

mysql> show global status like 'rpl_semi_sync_master_status';
+-----------------------------+-------+
| Variable_name               | Value |
+-----------------------------+-------+
| Rpl_semi_sync_master_status | ON    |
+-----------------------------+-------+

At the same time, Rpl_semi_sync_slave_status shows ON on the slave side, which means the slave has entered semi-synchronous replication mode.

mysql> show global status like 'rpl_semi_sync_slave_status';
+-----------------------------+-------+
| Variable_name               | Value |
+-----------------------------+-------+
| Rpl_semi_sync_slave_status  | ON    |
+-----------------------------+-------+
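Before relying on the status variables, it can be useful to confirm which semi-sync plugins are actually loaded on a host. A minimal sketch using information_schema.PLUGINS:

```sql
-- Confirm that the semi-sync plugins are loaded and active on this host.
SELECT PLUGIN_NAME, PLUGIN_STATUS
FROM information_schema.PLUGINS
WHERE PLUGIN_NAME LIKE 'rpl_semi_sync%';
```

In a topology where any node may be promoted, both plugins are often installed everywhere. That is presumably how the demoted master in this case ended up with the master and slave roles enabled at once, although the article does not say so explicitly.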

In semi-synchronous replication, each time a client commits a transaction on the master, the master writes the transaction to the binlog, waits for an ACK from a slave, then commits at the storage engine layer, and finally returns success to the client.

If the slave does not return an ACK, the master waits for the configured timeout and then automatically falls back to asynchronous replication mode.

If the master does not receive the required number of ACKs from slaves (rpl_semi_sync_master_wait_for_slave_count, default 1) within the timeout (rpl_semi_sync_master_timeout, default 10000 ms), it automatically falls back to asynchronous replication mode. After the fallback, Rpl_semi_sync_master_status=OFF is shown on the master side and Rpl_semi_sync_slave_status=OFF on the slave side.
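Both settings are dynamic global variables, so they can be changed and observed without a restart. The sketch below uses the default values quoted above; the choice of Rpl_semi_sync_master_no_tx as a fallback indicator is a suggestion, not something the article prescribes.

```sql
-- Set the ACK timeout and the number of ACKs required (defaults shown).
SET GLOBAL rpl_semi_sync_master_timeout = 10000;           -- milliseconds
SET GLOBAL rpl_semi_sync_master_wait_for_slave_count = 1;  -- ACKs required

-- Rpl_semi_sync_master_no_tx counts commits that were not acknowledged by a
-- slave; a growing value suggests the master has been running asynchronously.
SHOW GLOBAL STATUS LIKE 'Rpl_semi_sync_master_no_tx';
SHOW GLOBAL STATUS LIKE 'Rpl_semi_sync_master_status';
```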

The waiting behavior described above applies when rpl_semi_sync_master_wait_no_slave=ON (the default). If rpl_semi_sync_master_wait_no_slave is OFF and the number of slaves connected to the master is less than rpl_semi_sync_master_wait_for_slave_count, the master does not wait for the timeout and falls back to asynchronous replication mode immediately.

recommendations

If rpl_semi_sync_master_timeout is set too high, replication blocks when the original master rejoins the cluster after a master-slave switchover. It is recommended to set rpl_semi_sync_master_wait_no_slave=OFF.
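Applied at runtime, the recommendation might look like the sketch below. It assumes the master-side plugin is installed on the host. The final, commented-out statement is an additional option that is not part of the original recommendation and is labeled as an assumption.

```sql
-- Run on every host that can act as a master.
SET GLOBAL rpl_semi_sync_master_wait_no_slave = OFF;

-- Verify the setting and the resulting semi-sync state.
SHOW GLOBAL VARIABLES LIKE 'rpl_semi_sync_master_wait_no_slave';
SHOW GLOBAL STATUS LIKE 'Rpl_semi_sync_master_status';

-- Assumption, not from the article: on a demoted master, the master-side
-- role could also be switched off explicitly before it rejoins as a slave.
-- SET GLOBAL rpl_semi_sync_master_enabled = OFF;
```

In MySQL 5.7, SET GLOBAL does not survive a restart, so the same setting would also need to go into the [mysqld] section of my.cnf to persist.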
