Network Security Internet Technology Development Database Servers Mobile Phone Android Software Apple Software Computer Software News IT Information

In addition to Weibo, there is also WeChat

Please pay attention

WeChat public account

Shulou

The slow commit of mysql big things causes the problem of full database jam.

2025-04-06 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Database >

Share

Shulou(Shulou.com)06/01 Report--

Please indicate the source of the original reprint

This article uses the engine INNODB version of MYSQL5.7.13

Sync_binlog = 1

Innodb_flush_log_at_trx_commit = 1

That is, double one setting.

In this paper, we use infobin, a homemade tool for trace,gdb on mysql, and

Select * from sys.session where command'SLEEP'\ G is complete.

It's just a write problem involving binlog that has nothing to do with the innodb layer.

1. The origin of the problem

In the online environment, a colleague deleted 1000W of data at one time, and at commit, the entire database

There are about tens of seconds of all hang living conditions, this article is for in-depth analysis of this problem.

2. Knowledge that may be required

When does http://blog.itpub.net/7728585/viewspace-2133589/ MYSQL synchronize the binlog file and where are its temporary files?

Http://blog.itpub.net/7728585/viewspace-2133188/ parsing MYSQL BINLOG binary format (1)-- preparation

Http://blog.itpub.net/7728585/viewspace-2133189/ parsing MYSQL BINLOG binary format (2)-- FORMAT_DESCRIPTION_EVENT

Http://blog.itpub.net/7728585/viewspace-2133321/ parsing MYSQL BINLOG binary format (3)-- QUERY_EVENT

Http://blog.itpub.net/7728585/viewspace-2133429/ parsing MYSQL BINLOG binary format (4)-- TABLE_MAP_EVENT

Http://blog.itpub.net/7728585/viewspace-2133463/ parsing MYSQL BINLOG binary format (5)-- WRITE_ROW_EVENT

Http://blog.itpub.net/7728585/viewspace-2133469/ parsing MYSQL BINLOG binary format (6)-- UPDATE_ROW_EVENT/DELETE_ROW_EVENT

Http://blog.itpub.net/7728585/viewspace-2133502/ parsing MYSQL BINLOG binary format (7)-- Xid_log_event/XID_EVENT

Http://blog.itpub.net/7728585/viewspace-2133506/ parses MYSQL BINLOG binary format (8)-- GTID_LOG_EVENT/ANONYMOUS_GTID_LOG_EVENT and others

Http://blog.itpub.net/7728585/viewspace-2133534/ parses MYSQL BINLOG binary format (9)-- infobin parses binlog help documents

Http://blog.itpub.net/7728585/viewspace-2133537/ parses MYSQL BINLOG binary format (10)-- question and answer

3. Briefly explain what was done in commit and why it was locked.

In this article, there is the following figure

In fact, in this picture, I have already written that I need a lock, but I didn't say it in detail. This lock affects all the statements that need to be written in binlog in all the current databases. We already know that during the commit of big things (exceeding), we need to temporarily

All the binlog event in the file is copied to binlog, and the event of a thing must be contiguous, that is, the event of one thing cannot contain the event of other things. So during this period, you need a mutex lock, why?

Without this lock, the same file is modified concurrently by multiple threads / processes, that is, the binlog file here, which is a way to protect critical resources. Without this lock, there will be confusion in the files in binlog.

Note that the lock belongs to the mysql layer, not the innoDB layer.

4. Simulation method

Delete the table test and do not submit it, and observe the temporary file size.

Reference to the principle of generating temporary files:

When does http://blog.itpub.net/7728585/viewspace-2133589/ MYSQL synchronize the binlog file and where are its temporary files?

Insert a piece of data into the testsort and testsort2 tables immediately during the deletion of the delete test table commit

At this time, you can observe that these two inserts will be blocked and the delete test will be completed.

Just finished.

Mysql > select count (*) from test

+-+

| | count (*) |

+-+

| | 2621440 |

+-+

1 row in set (3.14 sec)

And as follows:

You can see that it took more than 10 seconds for a simple insert and 14 seconds for commit. The reason is that the temporary file MLq9INFu needs to be synchronized with the binlog process.

Other sessions that need to be written to binlog are blocked.

During the deletion, you will see that the temporary files are growing as follows:

[root@testmy ~] # lsof | grep / tmp/ML

Mysqld 2095 mysql 84u REG 8pm 3 328560640 14680133 / tmp/MLq9INFu (deleted)

[root@testmy ~] # lsof | grep / tmp/ML

Mysqld 2095 mysql 84u REG 8pm 3 346337280 14680133 / tmp/MLq9INFu (deleted)

[root@testmy ~] # lsof | grep / tmp/ML

Mysqld 2095 mysql 84u REG 8pm 3 367812608 14680133 / tmp/MLq9INFu (deleted)

When you get to commit, the file becomes 0, but it still exists because the session thread is still there:

[root@testmy ~] # lsof | grep / tmp/ML

Mysqld 2095 mysql 84u REG 8pm 3 0 14680133 / tmp/MLq9INFu (deleted)

5. Use infobin to verify the binlog file.

Use. / infobin test.000211 > log.log for analysis

(you can use mysqlbinlog but not intuitive

Tools can be referred to

Http://blog.itpub.net/7728585/viewspace-2133534/ parses MYSQL BINLOG binary format (9)-- infobin parses binlog help documents

Obtain)

-binlog event of delete from test;

> Gtid Event:Pos:368030739 (0X15efb413) N_pos:368030804 (0X15efb454) Time:1487197597 Event_size:65 (bytes)

Gtid:4a6f2a67-5d87-11e6-a6bd-0c29a879a3:1158184

-- > Query Event:Pos:368030804 (0X15efb454) N_Pos:368030876 (0X15efb49c) Time:1487197355 Event_size:72 (bytes)

Exe_time:0 Use_db:test Statment (35b-trun): BEGIN / *! Trx beginnings / Gno:1158184

-> Map Event:Pos368030876 (0X15efb49c) N_pos:368031047 (0X15efb547) Time:1487197355 Event_size:171 (bytes)

TABLE_ID:129 DB_NAME:test TABLE_NAME:test Gno:1158184

-> Delete Event:Pos:368031047 (0X15efb547) N_pos:368039196 (0X15efd51c) Time:1487197355 Event_size:8149 (bytes)

Dml on table: test.test table_id:129 Gno:1158184

.... (middle omitted)

-> Delete Event:Pos:736052116 (0X2bdf4394) N_pos:736054135 (0X2bdf4b77) Time:1487197355 Event_size:2019 (bytes)

Dml on table: test.test table_id:129 Gno:1158184

> Xid Event:Pos:736054135 (0X2bdf4b77) N_Pos:736054166 (0X2bdf4b96) Time:1487197597 Event_size:31 (bytes)

COMMIT; / *! Trx end*/ Gno:1158184

-binlog of insert into testsort values (1);

> Gtid Event:Pos:736054166 (0X2bdf4b96) N_pos:736054231 (0X2bdf4bd7) Time:1487197599 Event_size:65 (bytes)

Gtid:4a6f2a67-5d87-11e6-a6bd-0c29a879a3:1158185

-- > Query Event:Pos:736054231 (0X2bdf4bd7) N_Pos:736054303 (0X2bdf4c1f) Time:1487197599 Event_size:72 (bytes)

Exe_time:0 Use_db:test Statment (35b-trun): BEGIN / *! Trx beginnings / Gno:1158185

-> Map Event:Pos736054303 (0X2bdf4c1f) N_pos:736054358 (0X2bdf4c56) Time:1487197599 Event_size:55 (bytes)

TABLE_ID:187 DB_NAME:test TABLE_NAME:testsort Gno:1158185

-> Insert Event:Pos:736054358 (0X2bdf4c56) N_pos:736054410 (0X2bdf4c8a) Time:1487197599 Event_size:52 (bytes)

Dml on table: test.testsort table_id:187 Gno:1158185

> Xid Event:Pos:736054410 (0X2bdf4c8a) N_Pos:736054441 (0X2bdf4ca9) Time:1487197599 Event_size:31 (bytes)

COMMIT; / *! Trx end*/ Gno:1158185

-binlog of insert into testsort2 values (1);

> Gtid Event:Pos:736054441 (0X2bdf4ca9) N_pos:736054506 (0X2bdf4cea) Time:1487197600 Event_size:65 (bytes)

Gtid:4a6f2a67-5d87-11e6-a6bd-0c29a879a3:1158186

-- > Query Event:Pos:736054506 (0X2bdf4cea) N_Pos:736054578 (0X2bdf4d32) Time:1487197600 Event_size:72 (bytes)

Exe_time:0 Use_db:test Statment (35b-trun): BEGIN / *! Trx beginnings / Gno:1158186

-> Map Event:Pos736054578 (0X2bdf4d32) N_pos:736054634 (0X2bdf4d6a) Time:1487197600 Event_size:56 (bytes)

TABLE_ID:188 DB_NAME:test TABLE_NAME:testsort2 Gno:1158186

-> Insert Event:Pos:736054634 (0X2bdf4d6a) N_pos:736054686 (0X2bdf4d9e) Time:1487197600 Event_size:52 (bytes)

Dml on table: test.testsort2 table_id:188 Gno:1158186

> Xid Event:Pos:736054686 (0X2bdf4d9e) N_Pos:736054717 (0X2bdf4dbd) Time:1487197600 Event_size:31 (bytes)

COMMIT; / *! Trx end*/ Gno:1158186

You can see that the event of insert into testsort values (1) and insert into testsort2 values (1); did not enter until after delete.

Binlog file

6. Analyze the waiting point.

So what are you waiting for? This is the sys.session output from another test.

Mysql > select * from sys.session where command'SLEEP'\ G

* * 1. Row *

Thd_id: 28

Conn_id: 3

User: root@localhost

Db: test

Command: Query

State: starting

Time: 13

Current_statement: commit

Statement_latency: 12.55 s

Progress: NULL

Lock_latency: 0 ps

Rows_examined: 0

Rows_sent: 0

Rows_affected: 0

Tmp_tables: 0

Tmp_disk_tables: 0

Full_scan: NO

Last_statement: NULL

Last_statement_latency: NULL

Current_memory: 179.02 KiB

Last_wait: wait/io/file/sql/binlog

Last_wait_latency: Still Waiting

Source: binlog.cc:8469

Trx_latency: 3.22 m

Trx_state: ACTIVE

Trx_autocommit: NO

Pid: 5821

Program_name: mysql

* 2. Row * *

Thd_id: 3031

Conn_id: 3006

User: root@localhost

Db: test

Command: Query

State: query end

Time: 12

Current_statement: insert into testsort3 (id1,id2,... OR ((RAND () * 100000)), 'gaopeng')

Statement_latency: 11.82 s

Progress: NULL

Lock_latency: 29.18 ms

Rows_examined: 0

Rows_sent: 0

Rows_affected: 0

Tmp_tables: 0

Tmp_disk_tables: 0

Full_scan: NO

Last_statement: NULL

Last_statement_latency: NULL

Current_memory: 556.53 KiB

Last_wait: wait/synch/cond/sql/MYSQL_BIN_LOG::COND_done

Last_wait_latency: Still Waiting

Source: binlog.cc:1894

Trx_latency: 12.97 s

Trx_state: ACTIVE

Trx_autocommit: YES

Pid: 15312

Program_name: mysql

It is not difficult to see that current_statement: commit waiting is

Last_wait: wait/io/file/sql/binlog

Last_wait_latency: Still Waiting

Source: binlog.cc:8469

And the blocked insert waiting is

Last_wait: wait/synch/cond/sql/MYSQL_BIN_LOG::COND_done

Last_wait_latency: Still Waiting

Source: binlog.cc:1894

This is the content of the binlog.cc:8469 that commit is waiting for.

/ * *

On * pure non-transactional* workloads there is a small window

In time where a concurrent rotate might be able to close

The file before the sync is actually done. In that case

Ignore the bad file descriptor errors.

Transactional workloads (InnoDB) are not affected since the

The rotation will not happen until all transactions have

Committed to the storage engine, thence decreased the XID

Counters. (key points)

TODO: fix this properly even for non-transactional storage

Engines.

, /

If (DBUG_EVALUATE_IF ("simulate_error_during_sync_binlog_file", 1

Mysql_file_sync (log_file.file

MYF (MY_WME | MY_IGNORE_BADFD)

You can see the reason from the comments on the source code. Let's take a look at him using int my_sync (File fd, myf my_flags).

Now that you have added the file descriptor fd, you should be writing binlog file

Here is the content of source: binlog.cc:1894 that insert is waiting for

Mysql_cond_wait (& m_cond_done, & m_lock_done)

This place just can't get the lock. It's a function called by Stage_manager::enroll_for.

The Stage_manager source code comments are as follows:

/ * *

Class for maintaining the commit stages for binary log group commit.

, /

The description of the enroll_for method is:

/ * *

Enroll a set of sessions for a stage.

This will queue the session thread for writing and flushing.

If the thread being queued is assigned as stage leader, it will

Return immediately.

If wait_if_follower is true the thread is not the stage leader

The thread will be wait for the queue to be processed by the

Leader before it returns. (key points)

In DBUG-ON version the follower marks is preempt status as ready.

@ param stage Stage identifier for the queue to append to.

@ param first Queue to append.

@ param stage_mutex

Pointer to the currently held stage mutex, or NULL if

We're not in a stage.

@ retval true Thread is stage leader.

@ retval false Thread was not stage leader and processing has been done.

, /

So it is true that binary log does have a write policy after commit, depending on whether it is written for leader or not.

If it is not leader, it will not be written until all the leader has been written, and multiple sessions waiting to be written to binlog will be counted.

In a queue, this seems to be parallel.

I use GDB to hit the breakpoint

Stage_manager::enroll_for

MYSQL_BIN_LOG::sync_binlog_file

The whole process is found to be as follows:

A session COMMIT-- > get the lock-- > binlog write-- > commit complete

B session COMMIT-- > wait for lock-- > get lock-- > perform binlog write-- > commit complete

7. How to avoid

There is no good way. If you use innodb double one, you can only modify a small amount of data at a time and increase the number of modifications.

As I said earlier, the amount of binlog generated is about 2 to 3 for you to modify the amount of data. If update is 4 to 3, then you should evaluate your Imax O.

Performance, if sequential writes are written to X per second, the amount of data that needs to be modified is about Y

that

S (seconds) = (Y* (2can3)) / (XUni2)

Control S to a range that you think is acceptable. In fact, you can use copy to test the performance of your sequence.

For example, we write a stored procedure like this:

DELIMITER / /

CREATE PROCEDURE d_move (in beid int,in cunt int,in rws int,in maxid int)

Begin

Declare num int

Declare nowid int

Set num = 1

Set nowid = beid

While num = nowid and id

Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.

Views: 0

*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.

Share To

Database

Wechat

© 2024 shulou.com SLNews company. All rights reserved.

12
Report