
What caused the abnormal binlog surge?


This article looks at why a binlog file can surge far beyond its configured size. The case is easy to follow and well organized; I hope it helps clear up any doubts you may have about this behaviour.

This problem was run into by a friend of mine; his observations were roughly as follows (MySQL 5.6):

The actual size of one binlog file was about 8 GB, while the configured size limit was 1 GB.

The file contained one very large transaction, yet the last transaction in the file was a small one.

The XID_EVENT ('COMMIT') time of the large transaction and the XID_EVENT time of the last small transaction differed by nearly 15 minutes.

Here is the evidence he provided:

He decoded the binlog and pulled out the COMMIT lines with:

mysqlbinlog -vv mysqlbinlog_file > mysqlbinlog_res.log
grep -n -B1 "COMMIT" mysqlbinlog_res.log > file.log
tar zcf file.log.tar.gz file.log

From the result we can see that the large transaction's COMMIT sits at line 392578997 of the decoded file, while the line right after the previous transaction's completion is 42614752; the large transaction alone therefore accounts for roughly 320 million lines of decoded output. Its commit time is 12:54:12, with XID = 4103492840 and an end position of 2915555141. The last transaction in the file commits at 13:08:43, with XID = 4104544654 and an end position of 2915740337.
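Before decoding the whole file with mysqlbinlog, the oversized file itself can also be confirmed from the SQL layer. A minimal sketch (the output values are simply whatever your instance reports):

-- the configured rotation threshold (1 GB in this case)
SHOW VARIABLES LIKE 'max_binlog_size';

-- binlog files and their sizes; the problem file shows up at roughly 8 GB,
-- far beyond the configured threshold
SHOW BINARY LOGS;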

Q:

Why is the last transaction in the binlog a small transaction rather than the large one? Why wasn't the binlog switched right after the large transaction?

Why is there a nearly 15-minute gap between the commit times of the large transaction and the last small transaction?

1. The commit flow chart

The commit flow diagram referenced below is based on MySQL 5.7.22; the numbered steps mentioned in the following sections refer to it.

With that diagram in mind, let's continue the analysis.

2. Why a large transaction is contained in a single binlog

Look at step 10 in the figure: the binlog-switch flag is set only after all the Events of the transactions in the flush queue have been written to the binlog (written, not yet fsynced). In other words, no matter how large a transaction is, the switch flag is not set until its Events have been written out in full, so a large transaction always ends up entirely inside one binlog file.

3. Why the last transaction is a small one rather than the large one

In fact, step 10 only sets the switch flag; the actual switch happens only after the commit queue containing the transaction has finished committing (see step 28 for details).

During this window, there are two reasons why the large transaction is usually not the last transaction in the binlog:

In the flush queue, the large transaction may sit anywhere in the middle of the queue, with small transactions queued behind it and flushed into the same binlog after it.

In the sync queue, fsyncing the large transaction takes a long time. If it takes, say, 30 seconds, then during those 30 seconds new transactions can enter the next flush queue and write their Events into the binlog (written, not yet fsynced).

So on a busy production instance, the last transaction in a binlog is usually not the large one. The sketch below checks this against the positions reported in the case.
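The positions reported in the case can be used to check this directly: listing the events written after the large transaction's commit should show further small transactions in the same file rather than an immediate rotation. A minimal sketch, assuming a placeholder file name:

-- 2915555141 is the end position of the large transaction's Xid reported above,
-- i.e. the offset at which the next event starts (the file name is a placeholder)
SHOW BINLOG EVENTS IN 'mysql-bin.000123' FROM 2915555141 LIMIT 20;
-- expected output: Gtid/Query/Table_map/Delete_rows/Xid events of later small
-- transactions, with no Rotate event yet, because the switch only happens after
-- the whole commit queue has finished (step 28 in the diagram)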

4. Why the XID_EVENT (commit) times of the last small transaction and the large transaction differ by 15 minutes

First of all, there are two possible cases here (both forms are sketched right after this list):

With autocommit, the XID_EVENT carries the time the statement was initiated, so this gap is quite likely to appear; this is the case we will use later to demonstrate the behaviour.

With an explicitly opened transaction (begin ... commit), the XID_EVENT carries the time the COMMIT command was issued, but the same gap can still appear if the fsync takes long enough. This case is hard to test, because it needs a very large data set and generating one by hand is time-consuming.
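A short sketch contrasting the two forms (the table name comes from the test in section 5; the comments restate the timestamp behaviour described above):

-- Case 1: autocommit. The whole statement is one transaction, and its Events,
-- including the final XID_EVENT, carry the time the statement was initiated.
DELETE FROM testmts.testnnn;

-- Case 2: explicit transaction. Here the XID_EVENT carries the time the COMMIT
-- command itself was issued.
BEGIN;
DELETE FROM testmts.testnnn;
COMMIT;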

I explain the reasons behind the difference between these two cases in lectures 12 and 14 of my series "An In-depth Understanding of MySQL Master-Slave Principles (32 Lectures)".

Here, let us assume that the sync stage of the large transaction's commit took about 15 minutes. The timeline then looks like this:

T1: large transaction flush

T2: large transaction sync starts; a small transaction flushes

T3: another small transaction flushes

T4: another small transaction flushes

T5: large transaction sync ends

If T5 - T2 is about 15 minutes, then the small transactions that arrive during this window still land in the same binlog (the switch in step 29 has not happened yet), so a gap of this size between the XID_EVENT (commit) times of the small transactions and the large transaction is entirely possible.

In fact, in 5.7 both cases are likely to end up with the same last_committed value, because the step that advances last_committed (step 22) is blocked by the large transaction's fsync and cannot run during this window.

5. Test in 5.7.22

For the whole test, the key is that a large transaction must be in the middle of committing at exactly this point. My parameters were set as follows (a sketch of applying them comes right after the list):

max_binlog_size = 1048576: a small binlog size, to make rotation easy to trigger in the test.

binlog_group_commit_sync_delay = 1000000: set to 1 second to stretch out the whole commit process for the test; in reality, the fsync of a truly large transaction may take even longer.

binlog_transaction_dependency_tracking = COMMIT_ORDER: the default setting, kept here so that the "same last_committed" conclusion above can be demonstrated without interference from writeset.
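A minimal sketch of applying these settings dynamically (they can equally be placed in my.cnf; the values are exactly the ones listed above):

-- 1 MB rotation threshold, so that a modest transaction overflows it
SET GLOBAL max_binlog_size = 1048576;
-- delay the sync stage by 1,000,000 microseconds (1 second) to stretch out group commit
SET GLOBAL binlog_group_commit_sync_delay = 1000000;
-- default dependency tracking, so writeset does not interfere with last_committed
SET GLOBAL binlog_transaction_dependency_tracking = 'COMMIT_ORDER';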

In my debug environment I also set a breakpoint at MYSQL_BIN_LOG::ordered_commit to make the test easier to control; otherwise it is very hard to pin down the exact moment a transaction commits.

Finally, I did not use an explicitly opened transaction (begin ... commit), because in that case the XID_EVENT time is the time the COMMIT command was issued, and the big gap between the XID_EVENT times of the large and small transactions seen in the case is hard to reproduce without a huge data set. It would work with a large enough transaction (the case involved hundreds of millions of rows, so its sync stage was very slow), but my test environment does not have that much data. So I used autocommit to make the test easier; with autocommit, every Event carries the time the statement was initiated.

First I built a fairly large table with about 700,000 rows and then deleted the whole table; the binlog for this transaction is clearly larger than 1 MB. The operation flow was as follows (a hypothetical end-to-end sketch follows the flow):

T1: delete from testnnn; (700,000 rows of data)

T2: the large commit reaches the breakpoint in the commit code path

T3: delete from tm10; (1 row of data)

T4: delete from tmpk; (1 row of data)

T5: all transactions finish committing
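Here is a hypothetical end-to-end sketch of the test. The schema and table names testmts.testnnn, testmts.tm10 and testmts.tmpk come from the binlog dump below; the way the 700,000 rows are generated is my own assumption:

-- populate a single-column table with roughly 700,000 rows (assumed method:
-- repeat the self-insert about 19-20 times, doubling the row count each time)
CREATE TABLE testmts.testnnn (id INT) ENGINE = InnoDB;
INSERT INTO testmts.testnnn VALUES (10);
INSERT INTO testmts.testnnn SELECT * FROM testmts.testnnn;

-- T1, session A: the large transaction; with row-based logging its binlog
-- footprint is far larger than the 1 MB max_binlog_size
DELETE FROM testmts.testnnn;

-- T3 and T4, session B: small transactions issued while the large commit is
-- still held up in the sync stage; each of these tables holds a single row
DELETE FROM testmts.tm10;
DELETE FROM testmts.tmpk;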

As long as T4 - T1 is long enough, the situation from the case can be reproduced. In my run, binlog.000017 ended up at about 3.5 MB, well above the 1 MB max_binlog_size.

Below is the tail of binlog.000017 (trimmed; the hundreds of thousands of row images from the large DELETE are elided). We can see that the last two transactions are small ones and that the large transaction is not the last transaction in the file:

### DELETE FROM `testmts`.`testnnn`
### WHERE
###   @1=10 /* INT meta=0 nullable=1 is_null=0 */
(... several hundred thousand similar row images of the large DELETE omitted ...)
# at 3626617
#190804 22:56:10 server id 413340  end_log_pos 3626648 CRC32 0xfc5b79e7  Xid = 143
COMMIT/*!*/;
# at 3626648
#190804 23:02:26 server id 413340  end_log_pos 3626713 CRC32 0xa2399157  GTID  last_committed=0  sequence_number=2  rbr_only=yes
# at 3626713
#190804 23:02:26 server id 413340  end_log_pos 3626788 CRC32 0x555fb49d  Query  thread_id=4  exec_time=0  error_code=0
SET TIMESTAMP=1564930946/*!*/;
# at 3626788
#190804 23:02:26 server id 413340  end_log_pos 3626838 CRC32 0xec0a4316  Table_map: `testmts`.`tm10` mapped to number 149
# at 3626838
#190804 23:02:26 server id 413340  end_log_pos 3626878 CRC32 0x61c79d68  Delete_rows: table id 149 flags: STMT_END_F
### DELETE FROM `testmts`.`tm10`
### WHERE
###   @1=10 /* INT meta=0 nullable=1 is_null=0 */
# at 3626878
#190804 23:02:26 server id 413340  end_log_pos 3626909 CRC32 0x2a9cd136  Xid = 154
COMMIT/*!*/;
# at 3626909
#190804 23:02:26 server id 413340  end_log_pos 3626974 CRC32 0x06b081ec  GTID  last_committed=0  sequence_number=3  rbr_only=yes
# at 3626974
#190804 23:02:26 server id 413340  end_log_pos 3627049 CRC32 0x0e214995  Query  thread_id=5  exec_time=1  error_code=0
SET TIMESTAMP=1564930946/*!*/;
# at 3627049
#190804 23:02:26 server id 413340  end_log_pos 3627104 CRC32 0x8ee0af93  Table_map: `testmts`.`tmpk` mapped to number 150
# at 3627104
#190804 23:02:26 server id 413340  end_log_pos 3627154 CRC32 0x4804be49  Delete_rows: table id 150 flags: STMT_END_F
### DELETE FROM `testmts`.`tmpk`
### WHERE
###   (four column values omitted)
# at 3627154
#190804 23:02:26 server id 413340  end_log_pos 3627185 CRC32 0x64f2ea15  Xid = 153
COMMIT/*!*/;

Looking closely, there is a gap of about six minutes between 22:56:10 and 23:02:26. Now let's look at their last_committed values:

[root@mysqltest2 log]# cat -n log.log | grep last
     11  #190804 22:56:10 server id 413340  end_log_pos 299 CRC32 0x47602f13  GTID  last_committed=0  sequence_number=1  rbr_only=yes
2167349  #190804 23:02:26 server id 413340  end_log_pos 3626713 CRC32 0xa2399157  GTID  last_committed=0  sequence_number=2  rbr_only=yes
2167368  #190804 23:02:26 server id 413340  end_log_pos 3626974 CRC32 0x06b081ec  GTID  last_committed=0  sequence_number=3  rbr_only=yes

Exactly as described above, all three transactions share the same last_committed (0). At this point, all of our conclusions have been verified.

That is the whole story behind the abnormal binlog surge. Thanks for reading!
