In addition to Weibo, there is also WeChat
Please pay attention
WeChat public account
Shulou
2025-04-04 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Database >
Share
Shulou(Shulou.com)05/31 Report--
Why there is a large number of delays in the main library occasionally in the case of MySQL double master single writing? I believe many inexperienced people are at a loss about this. Therefore, this article summarizes the causes and solutions of the problem. Through this article, I hope you can solve this problem.
We are double master and single write. Here, the library that is written is the master library, and the library that is not written is the slave library. Our falcon occasionally alarms as follows (very low frequency):
This is very strange, I am supposed to write alone from the library did not do anything (except for the application of Event), the main library where the delay, and the delay is so large. In my image, a friend asked this question, but I didn't study it carefully at that time.
II. Rules for delay calculation
We still need to look at the pseudo code for master-slave calculation delay:
/ * The pseudo code to compute Seconds_Behind_Master: if (SQL thread is running) / / if the SQL thread has started {if (SQL thread processed all the available relay log) / / if the SQL thread has applied all the Event {if (IO thread is running) written by the IO thread / / if the IO thread has started print 0 / set the delay to 0 else print NULL / / otherwise null} else compute Seconds_Behind_Master;// if the SQL thread has not applied all the Event written by the IO thread, then the calculation delay is required. } else print NULL;// is set to a null value if the SQL thread is not even started * /
The formula for calculating the delay is:
Long time_diff= ((long) (time (0)-mi- > rli- > last_master_timestamp)-mi- > clock_diff_with_master); that is: server current time-timestamp in Event header-time difference between master and slave server
Necessary conditions for delay:
If the SQL thread has not applied all the Event written by the IO thread, there is a difference between Read_Master_Log_Pos and Exec_Master_Log_Pos. The judging criteria are
(mi- > get_master_log_pos () = = mi- > rli- > get_group_master_log_pos ()) & (! strcmp (mi- > get_master_log_name (), mi- > rli- > get_group_master_log_name ()
Put aside the file name, that is, by comparing the location where the IO thread reads to the main library binary log and the main library binary log location applied by the SQL thread, as long as there is a difference between them, they will enter the delayed calculation process.
Server current time-timestamp in Event header-there must be a difference in the formula for the time difference between master and slave servers.
All right, let's look for the cause with these two necessary conditions for delay.
Third, the cause of the delay
1. Master library: first, the master library is written to the Event of the slave library, and the slave library will be written to binlog (log_slave_updates is enabled), and the DUMP thread of the slave library will be sent to the master database, but the I / O thread of the master library will filter the Event through the SERVER_ID process, and will not write to the relay log of the master library. At the same time, it will update the location (Read_Master_Log_Pos) read by the IO thread of the master library, and update the ignored location (rli- > ign_master_log_name_end [0]). The code is as follows:
If (! (s_id =: server_id & &! mi- > rli- > replicate_same_server_id) | | (event_type! = binary_log::FORMAT_DESCRIPTION_EVENT & & event_type! = binary_log::ROTATE_EVENT & & event_type! = binary_log::STOP_EVENT) {mi- > set_master_log_pos (mi- > get_master_log_pos () + inc_pos) / / add the Read_Master_Log_Pos locus to the current location memcpy (rli- > ign_master_log_name_end, mi- > get_master_log_name (), FN_REFLEN); / / copy DBUG_ASSERT (rli- > ign_master_log_name_end [0]); / / assert that there is rli- > ign_master_log_pos_end= mi- > get_master_log_pos (); / / ignore to the locus}
Main library: SQL thread determines whether there is an Event to be skipped by rli- > ign_master_log_name_end [0]. If so, build a Rotate_log_event to skip the Event. The code is as follows:
If (rli- > ign_master_log_name_end [0]) / / if the skipped Event exists {/ * We generate and return a Rotate, to make our positions advance * / DBUG_PRINT ("info", ("seeing an ignored end segment")) Ev= new Rotate_log_event (rli- > ign_master_log_name_end, 0, rli- > ign_master_log_pos_end, exec_relay_log_event Rotate_log_event::DUP_NAME); / / build a Rotate Event at rli- > ign_master_log_name_end [0] = 0 / / rli- > ign_master_log_pos_end, execute this Event to mysql_mutex_unlock (log_lock); exec_relay_log_event / / to update the Exec_Master_Log_Pos locus if (unlikely (! ev)) {errmsg= "Slave SQL thread failed to create a Rotate event" (out of memory?), SHOW SLAVE STATUS may be inaccurate " Goto err;} ev- > server_id= 0; / / don't be ignored by slave SQL thread DBUG_RETURN (ev);}
Okay, here we know how Event skips in the main library, but note that there may be a certain time difference between IO threads and SQL threads when dealing with Read_Master_Log_Pos and Exec_Master_Log_Pos, then the condition that there is a certain difference between Read_Master_Log_Pos and Exec_Master_Log_Pos may be met, and then enter the delay calculation.
The SQL thread of the main library does not usually read Event because all Event is filtered out by the IO thread. therefore
The timestamp in the header of Event is not updated (MTS). However, if you switch from the library binlog, the slave library will at least pass the ROTATE_EVENT to the master library, and the master library will get the actual Event, so the timestamp in the header of the Event will be updated. If you happen to encounter the difference between the Read_Master_Log_Pos and Exec_Master_Log_Pos of the IO thread of the main library
Then falcon to check the delay will get the illusion of a large delay, and the calculation formula of the delay will be as follows:
The current time of the master library-the time of the last binlog switch from the slave library-the difference between master and slave time
The difference between MTS and single thread
The third point above only applies to MTS. Different from a single SQL thread, the last_master_timestamp is set to 0. The code is as follows:
If (! rli- > is_parallel_exec ()) rli- > last_master_timestamp= 0
By implication, the formula for calculating delay for a single SQL thread is:
The current time of the master library-0: 00 on January 1, 1970-the difference between master and slave time
So it looks like the calculated delay will be greater.
Finally, it is important to note that in fact, there is no problem with the delay of this situation, it is completely an occasional computational problem, and it is an illusion, and the greater the pressure on the main library, the more likely it will be, because the IO thread and SQL thread are more likely to deal with the time difference between Read_Master_Log_Pos and Exec_Master_Log_Pos.
4. Delayed debug under MTS
In fact, knowing the principle is very easy to debug, because we can put the breakpoint on the show_slave_status_send_data function of the main library, then we can see that the operation is as follows:
From library flush binary logs
The main library performs some insert operations
Main library show slave status
At this time, we can skip the condition (there is a certain difference between Read_Master_Log_Pos and Exec_Master_Log_Pos) and calculate it directly through the formula, and get the following results:
(gdb) p (long) (time (0)-mi- > rli- > last_master_timestamp)-mi- > clock_diff_with_master$6 = 37
The delay is 37 seconds, so our theory has been tested.
The following debug result is a single SQL thread, and you can see that the latency is ridiculously large.
(gdb) p (long) (time (0)-mi- > rli- > last_master_timestamp)-mi- > clock_diff_with_master$7 = 1592672402 V, other questions
Additional questions:
If double master and double write
S1S2
T1T2
T3
If you follow the above theory, then the updated position of T3 may be reset by the location of T2 transaction. Because the SQL thread of the main library may have the possibility of Exec_Master_Log_Pos retrogression through the built Rotate_log_event, this is obviously not possible. But the logic for building Rotate_log_event in the code is wrapped under the following logic.
If (! cur_log- > error) / * EOF * / / the current relay log has finished reading {/ * On a hot log, EOF means that there are no more updates to process and we must block until thread adds some and signals us to continue O thread adds some and signals us to continue * / if (hot_log) / / if it is the current relay log
We can see that the Rotate_log_event will not be built until the current relay log read is complete. Therefore, this problem does not exist.
Although the problem above does not build Rotate_log_event, but if rli- > ign_master_log_name_end [0] is retained all the time, then when the relay log application is completed, it will still build Rotate_log_event and cause Exec_Master_Log_Pos retrogression. In fact, this problem will not occur, because it will be reset every time the I / O thread Event is written to relay log, as follows:
Rli- > ign_master_log_name_end [0] = 0; / / after reading the above, do you understand why there is a lot of delay in the main library in the case of MySQL dual master single write? If you want to learn more skills or want to know more about it, you are welcome to follow the industry information channel, thank you for reading!
Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.
Views: 0
*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.
Continue with the installation of the previous hadoop.First, install zookooper1. Decompress zookoope
"Every 5-10 years, there's a rare product, a really special, very unusual product that's the most un
© 2024 shulou.com SLNews company. All rights reserved.