I ran into a somewhat strange problem today. In one environment there were issues with master-to-slave replication.
The general topology is as follows:
The three nodes sit in three different regions. Because of network problems between Node 1 and Node 3, Node 2 is used as a relay, so some delay is unavoidable, but it must not grow too large: statistical analysis queries are ultimately served from Node 3. The data volume in this environment is not large, but the data changes frequently. A developer colleague reported in the morning that synchronization on this node felt badly delayed and asked me to help find out what was wrong.
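For a chain like this (Node 1 -> Node 2 -> Node 3) to work, the intermediate node has to write the changes it replays from Node 1 into its own binary log so that Node 3 can read them. As a minimal sketch, assuming a classic binlog-position setup, the relevant options on Node 2 would look roughly like the following (the values are illustrative, except the server id, which matches the Master_Server_Id seen later in Node 3's slave status):
# my.cnf on the relay node (Node 2), illustrative values
[mysqld]
server-id         = 23307        # must be unique across all three nodes
log-bin           = mysql-bin    # Node 2 produces its own binlog
log-slave-updates = 1            # write events replayed from Node 1 into that binlog
relay-log         = relay-bin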
Replication from Node 1 to Node 2 showed no latency; the problem was the latency from Node 2 to Node 3.
Check the slave status on Node 3:
> show slave status\G
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host:xxxx
Master_User: repl
Master_Port: 3307
Connect_Retry: 10
Master_Log_File: mysql-bin.000009
Read_Master_Log_Pos: 16186388
Relay_Log_File: relay-bin.000004
Relay_Log_Pos: 13599457
Relay_Master_Log_File: mysql-bin.000009
Slave_IO_Running: Yes
Slave_SQL_Running: No
Replicate_Do_DB:
Replicate_Ignore_DB:
...
Last_Errno: 1032
Last_Error: Could not execute Delete_rows event on table test_mbi.test_dist_online; Can't find record in 'test_dist_o
Skip_Counter: 0
Exec_Master_Log_Pos: 13599294
Relay_Log_Space: 16304336
Until_Condition: None
...
Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 1032
Last_SQL_Error: Could not execute Delete_rows event on table test_mbi.test_dist_online; Can't find record in 'test_dist_o
Replicate_Ignore_Server_Ids:
Master_Server_Id: 23307
Master_UUID: 189a00c4-16a3-11e6-a678-06c76b65c01e
Master_Info_File: mysql.slave_master_info
SQL_Delay: 0
SQL_Remaining_Delay: NULL
Slave_SQL_Running_State:
Master_Retry_Count: 86400
1 row in set (0.00 sec)
The log apply error is 1032, i.e. the row to be deleted could not be found on the slave. This kind of problem generally looks minor: just skip the event. But skipping turned out not to be even an expedient measure, because after skipping one event the same error came right back. Repeatedly skipping is scratching an itch through the boot, and in fact it means the data is inconsistent.
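Since the real worry here is silent inconsistency rather than the single failed event, a rough way to confirm that the table has drifted between nodes (my own suggestion, not part of the original troubleshooting) is MySQL's built-in CHECKSUM TABLE, run on Node 2 and on Node 3 and compared; the comparison only makes sense if the slave is stopped at a known position, so treat it as a coarse check:
> checksum table test_mbi.test_dist_online;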
Because the need was urgent, time was tight, and the replication delay was already large, a quick assessment showed the slave still had to be rebuilt.
Of course, this procedure is very routine. Let me list it briefly anyway:
Because this is a multi-instance setup, the following command was used for the export:
/opt/mysql/bin/mysqldump -S /data2/bmbidb/mysql.sock --single-transaction --master-data=2 -B test_ad test_mbi test_sys_mgr |gzip > test.sql.gz
Then, after working through the various network hops, I finally copied the dump from Node 2 over to the slave environment on Node 3.
Next, stop the slave on Node 3 and start importing the data:
gunzip < test.sql.gz | /opt/mysql/bin/mysql --socket=/home/bmbidb/mysql.sock --port=3307
Next comes the change master step. The MASTER_LOG_FILE and MASTER_LOG_POS values can be taken straight from the dump:
gunzip < test.sql.gz | head -50
You will find a section like the following:
-- CHANGE MASTER TO MASTER_LOG_FILE='mysql-bin.000008', MASTER_LOG_POS=241903809;
That is exactly what we need, and it can be used directly:
CHANGE MASTER TO MASTER_HOST='xxxx',MASTER_USER='repl',MASTER_PASSWORD='xxxx',MASTER_PORT=3307,MASTER_LOG_FILE='mysql-bin.000008', MASTER_LOG_POS=241903809,MASTER_CONNECT_RETRY=10;
start slave
With that, the slave setup was complete.
Then, later in the afternoon, I ran into a similar problem again, which left me quite torn: I cannot rebuild the slave every time this happens.
I ruled out many potential causes, including sync_binlog, table structure differences, database privileges on the nodes, the tables' storage engines, and so on, but still seemed to be missing the key point.
Parsing the relay log with mysqlbinlog also came up empty:
/opt/mysql/bin/mysqlbinlog -vv relaylog.05 --base64-output=decode-rows > relay05.tmp
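With the relay log decoded into relay05.tmp, the decode-rows output spells out every row image, so one way to narrow things down (the grep below is my own sketch, not a command from the original post) is to pull out the Delete_rows events for the problem table:
grep -n -B 2 -A 20 "Delete_rows" relay05.tmp
The decoded blocks that follow each match (### DELETE FROM `test_mbi`.`test_dist_online` ... ### WHERE ### @1=...) show the exact column values the event tried to remove, which can then be looked up on the slave.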
So it was still a tricky problem.
With a colleague's help, a temporary workaround was put in place to tide things over: for this type of DML, if the row does not exist, the event can simply be ignored, i.e. set slave_exec_mode to IDEMPOTENT (the default value is STRICT).
> set global slave_exec_mode='IDEMPOTENT';
Query OK, 0 rows affected (0.00 sec)
> stop slave;set global sql_slave_skip_counter=1;start slave;
Query OK, 0 rows affected (0.00 sec)
Query OK, 0 rows affected (0.00 sec)
Query OK, 0 rows affected (0.00 sec)
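To confirm the change took effect and that the SQL thread is applying events again, the usual checks are standard MySQL commands (nothing specific to this environment):
> show global variables like 'slave_exec_mode';
> show slave status\G
Seconds_Behind_Master should drop back toward 0 as the SQL thread works through the backlog.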
With that change in place, this class of problem was brought to a temporary halt, but the root cause still needs to be found. In this case I compared part of the data and found no other conflicts, yet the workaround still needs a proper explanation. I'll cover that in the next post, where there should be an answer.