In addition to Weibo, there is also WeChat
Please pay attention
WeChat public account
Shulou
2025-04-06 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Database >
Share
Shulou(Shulou.com)05/31 Report--
This article mainly shows you the "MySQL standby library Seconds_Behind_Master calculation example analysis", the content is easy to understand, well-organized, hope to help you solve your doubts, the following let the editor lead you to study and learn the "MySQL standby library Seconds_Behind_Master calculation example analysis" this article.
Background
In the mysql master / slave environment, the master / slave synchronization process is as follows: the master database update generates binlog, and the slave database io thread pulls the master library binlog to generate relay log. The slave SQL thread executes relay log to keep synchronized with the main library.
Theoretically, when the master database is updated, there is a delay in the standby database, and the delay time is the standby database execution time + network transmission time, that is, t4-t2.
So how does mysql calculate the repository latency?
First, let's take a look at some information in show slave status. The io thread pulls the location of the main library binlog:
Master_Log_File: mysql-bin.000001 Read_Master_Log_Pos: 107
Where the sql thread executes the relay log:
Relay_Log_File: slave-relay.000003 Relay_Log_Pos: 253
The location of the relay log executed by the sql thread relative to the main library binlog:
Relay_Master_Log_File: mysql-bin.000001 Exec_Master_Log_Pos: 107Source implementation
The source code implementation of Seconds_Behind_Master calculation is as follows:
If ((mi- > get_master_log_pos () = = mi- > rli- > get_group_master_log_pos ()) & (! strcmp (mi- > get_master_log_name (), mi- > rli- > get_group_master_log_name () {if (mi- > slave_running = = MYSQL_SLAVE_RUN_CONNECT) protocol- > store (0LL); else protocol- > store_null () } else {long time_diff= ((long) (time (0)-mi- > rli- > last_master_timestamp)-mi- > clock_diff_with_master); protocol- > store ((longlong) (mi- > rli- > last_master_timestamp? Max (0L, time_diff): 0);}
It can be roughly seen that it is calculated through time and site, which is analyzed in detail below.
The condition in if means that if the location where the io thread pulls the main library binlog is equal to the position of the relay log executed by the SQL thread relative to the main library binlog, then the delay is considered to be 0. In general, io threads are faster than sql threads. However, if the network condition is so poor that the SQL thread needs to wait for the io thread, then the two sites may be equal, causing the delay to be mistaken for 0.
Take a look at else:
Clock_diff_with_master
When the io thread starts, it sends the sql statement "SELECT UNIX_TIMESTAMP ()" to the master database to get the current time of the master database. However, subtracting this time from the current time of the standby library or the difference between the master and standby time is clock_diff_with_master. Here, if a user modifies the main database system time or the timestamp variable, then it is not accurate to calculate the standby database delay time.
Last_master_timestamp
Represents the time when the main library executed the binlog event. The calculation method of this time is different in parallel replication and non-parallel replication.
Non-parallel replication:
The slave sql thread reads the event,event in the relay log and updates the last_master_timestamp before it is executed, where the time is updated in event.
Rli- > last_master_timestamp= ev- > when.tv_sec + (time_t) ev- > exec_time
Ev- > when.tv_sec indicates the start time of the event. Exec_time refers to the execution time of the event in the main database. Only Query_log_event and Load_log_event count exec_time.
Another situation is that when the sql thread waits for the Io thread to get the binlog, it will set the last_master_timestamp to 0, according to the above algorithm, Seconds_Behind_Master is 0, and there is no delay in the task slave library.
Parallel replication:
Parallel replication has a distribution queue gaq,sql thread that reads binlog transactions to gaq and then distributes them to worker threads for execution. When replicating in parallel, binlog events are executed concurrently, and there is a checkpoint point in gaq called lwm. All binlog before lwm has been executed, while some binlog execution after lwm has not been executed.
Suppose the number of worker threads is 2, there are 8 transactions in gap: 1, 2, 3, 4, 5, 6, 7. The transaction executed by worker 1 is 1 46, and the transaction executed by woker 2 is 2 3, so the lwm is 4.
When you update the gap checkpiont in parallel, the lwm point is advanced and the time when the last_master_timestamp is the event at the end of the transaction where the lwm is located is updated. Therefore, parallel replication updates the last_master_timestamp after the transaction execution is completed, and the update is on a transaction-based basis. Updating gap checkpiont is also affected by the slave_checkpoint_period parameter.
This results in a gap in statistical latency between parallel replication and non-parallel replication, which may be the time that slave_checkpoint_period + transactions are executed in the standby database. This is why there is sometimes a small delay in parallel replication, but there is no delay when it is changed to non-parallel replication.
In addition, when the sql thread waits for the io thread and the gaq queue is empty, the last_master_timestamp is set to 0. Similarly, there is no delay at this time, and the seconds_Behind_Master is calculated to be 0.
Locus information maintenance
The point where the io thread pulls the binlog
Master_Log_File updates (process_io_rotate) Read_Master_Log_Pos when it reads the main library ROTATE_EVENT: each time the iothread fetches an event, it reads the pos information from the event and updates mi- > set_master_log_pos (mi- > get_master_log_pos () + inc_pos).
Where the sql thread executes the relay log
Update when Relay_Log_File sql thread processes ROTATE_EVENT (Rotate_log_event::do_update_pos) Relay_Log_Pos: when non-parallel replication, each statement performs completion update (stmt_done) when parallel replication, update when transaction is completed (Rotate_log_event::do_update_pos/ Xid_log_event::do_apply_event/stmt_done)
The location of the relay log executed by the sql thread relative to the main library binlog
Update when Relay_Master_Log_File sql thread processes ROTATE_EVENT (Rotate_log_event::do_update_pos) Exec_Master_Log_Pos and Relay_Log_Pos update at the same time non-parallel replication, each statement performs completion update (stmt_done) parallel replication, update on transaction completion (Rotate_log_event::do_update_pos/ Xid_log_event::do_apply_event/stmt_done)
When it comes to site updates, it is necessary to talk about two events: HEARTBEAT_LOG_EVENT and ROTATE_EVENT.
HEARTBEAT_LOG_EVENT
HEARTBEAT_LOG_EVENT the general purpose of our understanding is to send this event every master_heartbeat_period when the primary database is not updated to maintain the connection between the primary database and the standby database. Another function of HEARTBEAT_LOG_EVENT is that in gtid mode, some gtid backup libraries in the main library have been executed. At the same time, although these events no longer need to be executed by the standby library, the sites for reading and applying binglog still need to be advanced. Therefore, here we convert this kind of event into HEARTBEAT_LOG_EVENT, and HEARTBEAT_LOG_EVENT helps us to push the site.
ROTATE_EVENT
The ROTATE_EVENT generated by switching the main binlog, and the slave io thread will also switch the relay log when it is received. This rotate is also recorded that the relay log,sql thread executes ROTATE_EVENT to update only the locus information. The standby Io thread accepts the HEARTBEAT_LOG_EVENT of the main database and is generally not handled by the user. As mentioned earlier, in gtid mode, when the site of the HEARTBEAT_LOG_EVENT is larger than the currently recorded site, a ROTATE_EVENT is built to allow the sql thread to advance the site information.
If (mi- > is_auto_position () & & mi- > get_master_log_pos ()
< hb。log_pos && mi->Get_master_log_name ()! = NULL) {mi- > set_master_log_pos (hb. Log_pos); write_ignored_events_info_to_relay_log (mi- > info_thd, mi); / / build ROTATE_EVENT.}
In addition, when the replicate_same_server_id is 0 and the binlog received by the slave database is the same as that of the main database severid, the slave database will ignore the binlog, but the site still needs to be pushed. For efficiency, this binlog does not need to be credited to relay log. Instead, it is replaced with ROTATE_EVENT to push the site.
Delay phenomenon
The initial master / slave is synchronized and there are no updates. Suppose it takes 30s for both the master and slave libraries to execute a DDL, and 30s for a large update transaction (such as insert..select * from).
Network latency is not considered.
When non-parallel replication
When the DDL:t2 execution time is finished, the T2 time standby database executes the show slave status,Seconds_Behind_ Master value is 0. At the same time, T2 to T3 Seconds_Behind_Master increased to 30 in turn, and then fell by 0.
Execute big transactions: T2 when the main library is finished, T2 time standby database execution show slave status,Seconds_Behind_ Master value is 30. At the same time, T2 to T3 Seconds_Behind_Master increased to 60 in turn, and then fell by 0.
The reason for the above difference is that exec_time is counted only by Query_log_event and Load_log_event, and ordinary updates are not caused by statistics.
When parallel replication
When the DDL:t2 execution time is finished, the show slave status,Seconds_Behind_Master value of T2 to T3 standby database execution is always 0.
Execute big transactions: T2 when the main database is finished, T2 to T3 standby database execution show slave status,Seconds_Behind_Master value is always 0
This is because the master and standby are fully synchronized before executing the statement, and the gaq queue is empty, so the last_master_timestamp is set to 0. During the execution of the DDL, the gap checkpoint does not advance and the last_master_timestamp does not zero until the DDL or big transaction is completed.
So the Seconds_Behind_Master value from T2 to T3 is always 0. For a moment in T3, last_master_timestamp will be reset, but because slave_checkpoint_period will push the checkpoint,gaq queue to become empty, it will reset last_master_timestamp to 0.
So there may be instant delays in T3 (30s for DDL and 60s for major events).
This seems unreasonable. If the gaq queue is empty, last_master_timestamp will be set to 0, a rule that can actually be removed.
Related bug
BUG#72376, PREVIOUS_GTIDS_LOG_EVENT events are logged at the beginning of each binlog, representing the gtid collection of all previous files. The event record of relay-log itself is the time of the main library, but the PREVIOUS_GTIDS_LOG_EVENT event that begins with relay log is generated on the slave side, and the time is also subject to slave. Therefore, it is not possible to use this time to calculate last_master_timestamp. The fix is when relay log writes that the PREVIOUS_GTIDS_LOG_EVENT event is marked as generated by relay log, and when counting last_master_timestamp, it is found that the event generated by relay ignores the statistics.
If (is_relay_log) prev_gtids_ev. Set_relay_log_event ();. If (! (ev- > is_artificial_event () | |...)) Rli- > last_master_timestamp= ev- > when. Tv_sec + (time_t) ev- > exec_time; are all the contents of the article "sample Analysis of Seconds_Behind_Master calculation in MySQL". Thank you for reading! I believe we all have a certain understanding, hope to share the content to help you, if you want to learn more knowledge, welcome to follow the industry information channel!
Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.
Views: 0
*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.
Continue with the installation of the previous hadoop.First, install zookooper1. Decompress zookoope
"Every 5-10 years, there's a rare product, a really special, very unusual product that's the most un
© 2024 shulou.com SLNews company. All rights reserved.