Network Security Internet Technology Development Database Servers Mobile Phone Android Software Apple Software Computer Software News IT Information

In addition to Weibo, there is also WeChat

Please pay attention

WeChat public account

Shulou

Analyze a strange blocking phenomenon of FTWRL in MySQL

2025-01-18 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Database >

Share

Shulou(Shulou.com)05/31 Report--

I hope you can read it carefully and be able to achieve something!

One or two different phenomena

First of all, it is OK to set up a table with several pieces of data. Here is the baguait1 table.

Case 1SESSION1SESSION2SESSION3 step 1:select sleep (1000) from baguait1 for update

Step 2:flush table with read lock; blockage

Step 3:kill session2

Step 4:select * from baguait1 limit 1; successful

Step 2 the wait status of the "flush table with read lock;" operation is "Waiting for global read lock", as follows:

Mysql > select Id,State,Info from information_schema.processlist where command'sleep' +- -+ | Id | State | Info | +-+-- -+ | 1 | Waiting on empty queue | NULL | | 18 | Waiting for global read lock | flush table With read lock | | 3 | User sleep | select sleep (1000) from baguait1 for update | | 6 | executing | select Id State Info from information_schema.processlist where command'sleep' | +-+ -+ case 2

It's kind of strange here. In fact, I met and tested it a long time ago, but I didn't study it carefully. I just took a closer look at it this time.

SESSION1SESSION2SESSION3 step 1:select sleep (1000) from baguait1

Step 2:flush table with read lock; blockage

Step 3:kill session2

Step 4:select * from baguait1 limit 1; jam

Step 2 the waiting status of the "flush table with read lock;" operation is "Waiting for table flush", which is as follows:

Mysql > select Id,State,Info from information_schema.processlist where command'sleep' +-- + | Id | State | | Info | +-+ -+ | 1 | Waiting on empty queue | NULL | | 26 | User sleep | select sleep (1000) from baguait1 | | 23 | Waiting for table flush | flush table with read lock | | 6 | executing | select Id | State,Info from information_schema.processlist where command'sleep' | +-+-+

Mysql > select Id,State,Info from information_schema.processlist where command'sleep' +-- + | Id | State | | Info | +-+ -+ | 1 | Waiting on empty queue | NULL | | 26 | User sleep | select sleep (1000) from baguait1 | | 27 | executing | select Id | State Info from information_schema.processlist where command'sleep' | | 6 | Waiting for table flush | select * from testmts.baguait1 limit 1 | +-+-- -- +

If we carefully compare the two cases, the only real difference is whether the select statement in step 1 adds for update. In case 2, we find that even if we KILL the "flush table with read lock;" session, it will block all subsequent operations on this table (including select), and the wait will actually last until the sleep operation in step 1 is completed.

For online databases, this will happen if you execute "flush table with read lock;" during a long select table, which will cause all operations on this table to wait, even if you find out and kill the FTWRL session, which will last until the select operation is completed, unless you KILL the select operation for a long time.

Why did this happen? Let's analyze it slowly next.

Second, the effective point of sleep function.

In this case, I use the sleep function instead of the select big table operation as a test, which is true here. Why is it true? let's take a look at the effective points of the sleep function are as follows:

| evaluate_join_recordT@3: | | > evaluate_join_recordT@3: | enter: join: 0x7ffee0007350 join_tab index: 0 table: tii cond: 0x0T@3: | counts: evaluate_join_record join- > examined_rows++: 1T@3: | > end_sendT@3: | > Query_result_send::send_dataT@3: | | | > send_result_set_rowT@3: | > THD::enter_condT@3: | THD::enter_stage: 'User sleep' / mysqldata/percona-server-locks-detail-5.7.22/sql/item_func.cc:6057T@3: | > PROFILING::status_changeT@3: | | | variables.lock_wait_timeout) / / waiting time is affected by lock_wait_timeout {/ * NOTE: my_error () has been already called by reopen_tables () within close_cached_tables (). * / result= 1;} if (thd- > global_read_lock.make_global_read_lock_block_commit (thd)) / / MDL COMMIT lock {/ * Don't leave things in a half-locked state * / thd- > global_read_lock.unlock_global_read_lock (thd); return 1;}

More specifically, the operation of closing the table and releasing the table cache are included in the function close_cached_tables, so I won't go into detail. But we need to understand that the table cache actually consists of two parts:

Table cache define: a static table definition structure memory is established when each table is opened for the first time, and when multiple sessions access the same table at the same time, the corresponding instance is copied from here for the session's own use. The size is defined by the parameter table_definition_cache, and the number currently used is viewed by the status value Open_table_definitions. The corresponding function get_table_share.

Table cache instance: as mentioned above, the table definition structure that the session actually uses is an instance. The size is defined by the parameter table_open_cache, and the number currently used is viewed by the status value Open_tables. The corresponding function open_table_from_share.

Here I call it table caching. OK, here are the general steps of FTWRl that I have summarized:

Step 1: add the MDL LOCK type to GLOBAL and the level to S. If it occurs, the wait status is' Waiting for global read lock'. Note that select statements are not locked at the GLOBAL level, but DML/DDL/FOR UPDATE statements are locked at the GLOBAL level, and this wait occurs when the IX lock is incompatible with the S lock. Here is the compatibility matrix:

| | Type of active | Request | scoped lock | type | IS (*) IX S X |-+-+ IS | + + | IX | + +-| S | +-- | X | +-- |

Step 2: promote the global table cache version. In the source code is a global variable refresh_version++.

Step 3: release unused table caches. You can refer to the close_cached_tables function by yourself.

Step 4: determine whether there is an occupied table cache, and if so, wait for the occupier to release. The wait status is' Waiting for table flush'. This step determines whether the version of the table cache matches the global table cache version. If not, wait as follows:

For (uint idx=0; idx

< table_def_cache.records ; idx++) { share= (TABLE_SHARE*) my_hash_element(&table_def_cache, idx); //寻找整个 table cache shared hash结构 if (share->

Has_old_version () / / if the version is inconsistent with the current refresh_version version {found= TRUE; break; / / jump out of the first layer to find out if there is an old version}}... if (found) / / if you find the old version, you need to wait for {/ * The method below temporarily unlocks LOCK_open and frees share's memory. * / if (share- > wait_for_old_version (thd, & abstime, MDL_wait_for_subgraph::DEADLOCK_WEIGHT_DDL) {mysql_mutex_unlock (& LOCK_open); result= TRUE; goto err_with_reopen;}}

The end of the wait is the release of the occupant of the occupied table cache. This release operation exists in the function close_thread_table, as follows:

If (table- > s-> has_old_version () | | table- > needs_reopen () | | table_def_shutdown_in_progress) {tc- > remove_table (table); / / close table cache instance mysql_mutex_lock (& LOCK_open); intern_close_table (table); / / remove table cache define mysql_mutex_unlock (& LOCK_open);}

The function MDL_wait::set_status will eventually be called to wake up the FTWRL, which means that for the table cache being occupied, the freer is not the FTWRL session but the occupant himself. In any case, the entire table cache will eventually be emptied, and if you check the Open_table_definitions and Open_tables after FTWRL, you will find that the count has been re-counted. Here is the code for the wake-up function, which is also obvious:

Bool MDL_wait::set_status (enum_wait_status status_arg) open_table {bool was_occupied= TRUE; mysql_mutex_lock (& m_LOCK_wait_status); if (massively waitworthy status = = EMPTY) {was_occupied= FALSE; masked waitstatus = status_arg; mysql_cond_signal (& m_COND_wait_status); / / Wake up} mysql_mutex_unlock (& m_LOCK_wait_status); / / unlock return was_occupied;}

Step 5: add the MDL LOCK type COMMIT level to S. If it occurs, the wait status is' Waiting for commit lock'. This kind of wait is likely to occur if there is a commit of a large transaction.

IV. Case 1 Analysis

Step 1 We use the select for update statement, which adds a GLOBAL-level IX lock until the end of the statement (note that an object-level MDL_SHARED_WRITE (SW) lock is actually added until the end of the transaction, which has nothing to do with FTWRL.

Step 2 We use the FTWRL statement to acquire the GLOBAL-level S lock according to the above analysis, which is not compatible, so there is a wait 'Waiting for global read lock'.

In step 3, we KILL lost the FTWRL session, in which case the session exited, and FTWRL had no effect as if it had never been executed, because it was blocked at the first step.

Step 4 our select operation will not be affected in any way

Fifth, case 2 analysis

Step 1 We use the select statement, which does not have any locks at the GLOBAL level (note that object-level MDL_SHARED_READ (SR) locks are actually added until the end of the transaction, which has nothing to do with FTWRL)

Step 2 We use the FTWRL statement, and based on the above analysis, we find that the FTWRL statement can acquire the S lock at the GLOBAL level, because the simple select statement does not have any locks at the GLOBAL level. At the same time, the global table cache version is pushed and the unused table cache is released, but in step 4 you will find that baguait1's table cache is being occupied, so there is a wait with a wait status of 'Waiting for table flush'.

Step 3 We lost the FTWRL session in KILL, in which case although the GLOBAL-level S lock will be released, the global table cache version has been advanced, and the unused table cache has been released.

Step 4: execute a select query operation on the baguait1 table again. When opening the table, it will determine whether the table cached version matches the global table cached version. If it does not match, enter the wait and wait for 'Waiting for table flush',:

If (share- > has_old_version ()) {/ * We already have an MDL lock. But we have encountered an old version of table in the table definition cache which is possible when someone changes the table version directly in the cache without acquiring a metadata lock (e.g. This can happen during "rolling" FLUSH TABLE (S)). Release our reference to share, wait until old version of share goes away and then try to get new version of table share. * / release_table_share (share); Wait_result= tdc_wait_for_old_version (thd, table_list- > db, table_list- > table_name, ot_ctx- > get_timeout (), deadlock_weight)

The entire wait operation, like FTWRL, waits for the occupant to release the table cache before waking up to continue.

Therefore, all the subsequent select/DML/DDL in this table will be blocked and expensive, even if the KILL is useless without the FTWRL session.

A simple summary of FTWRL blocking and being blocked (1) what is blocked?

DDL\ DML\ FOR UPDATE blocks FTWRL for a long time because FTWRL needs to acquire GLOBAL's S lock, and these statements hold IX (MDL_INTENTION_EXCLUSIVE) locks on GLOBAL, which are not compatible according to the compatibility matrix. Wait for: Waiting for global read lock. This is the case in case 1 of this article.

Select blocks FTWRL for a long time because FTWRL frees up all free table caches, and if an occupier occupies some table cache, it waits for the occupier to release those table caches themselves. Wait for: Waiting for table flush. This is the case in case 2 of this article, which blocks any subsequent statements about this table, not even a KILL FTWRL session, unless KILL drops a long select operation. In fact, this kind of congestion can also exist in flush table.

Long commit (such as large transaction commits) can also block FTWRL because FTWRL needs to acquire COMMIT's S lock, while commit statements hold IX (MDL_INTENTION_EXCLUSIVE) locks on commit, which are not compatible according to the compatibility matrix.

(2) what is blocked?

FTWRL will block the DDL\ DML\ FOR UPDATE operation, the blocking point is the GLOBAL-level S lock, and the wait is: Waiting for global read lock.

FTWRL will block the commit operation, the blocking point is COMMIT's S lock, and the wait is Waiting for commit lock.

FTWRL does not block select operations because select is not locked at the GLOBAL level.

Finally, I would like to remind you that many backup tools have to perform FTWRL operations, and be sure to pay attention to its congestion scenarios and special scenarios.

Remarks stack frames and breakpoints:

(1) breakpoints used

MDL_context::acquire_lock gets DML LOCK

Open_table_from_share gets table cache instance

Alloc_table_share assign table define (share)

Get_table_share gets table define (share)

Close_cached_tables flush table shuts down all table cache instance and table define

Reload_acl_and_cache flush with read lock MDL LOCK lock to GLOBAL TYPE:S, and call close_cached_tables to get COMMIT level TYPE S at the same time

MDL_wait::set_status Wake up operation

Close_thread_table occupant judges release

This delete operation is required for the my_hash_delete hash delete operation and the release of the table cache from both table cache instance and table define.

(2) FTWRL blocking stack frames due to select blocking stack frames:

(gdb) bt#0 0x00007ffff7bd3a5e in pthread_cond_timedwait@@GLIBC_2.3.2 () from / lib64/libpthread.so.0#1 0x000000000192027b in native_cond_timedwait (cond=0x7ffedc007c78, mutex=0x7ffedc007c30, abstime=0x7fffec5bbb90) at / mysqldata/percona-server-locks-detail-5.7.22/include/thr_cond.h:129#2 0x00000000019205ea in safe_cond_timedwait (cond=0x7ffedc007c78, mp=0x7ffedc007c08, abstime=0x7fffec5bbb90, file=0x204cdd0 "/ mysqldata/percona-server-locks-detail-5.7.22/sql/mdl.cc" Line=1899) at / mysqldata/percona-server-locks-detail-5.7.22/mysys/thr_cond.c:88#3 0x00000000014b9f21 in my_cond_timedwait (cond=0x7ffedc007c78, mp=0x7ffedc007c08, abstime=0x7fffec5bbb90, file=0x204cdd0 "/ mysqldata/percona-server-locks-detail-5.7.22/sql/mdl.cc", line=1899) at / mysqldata/percona-server-locks-detail-5.7.22/include/thr_cond.h:180#4 0x00000000014ba484 in inline_mysql_cond_timedwait (that=0x7ffedc007c78, mutex=0x7ffedc007c08, abstime=0x7fffec5bbb90 Src_file=0x204cdd0 "/ mysqldata/percona-server-locks-detail-5.7.22/sql/mdl.cc", src_line=1899) at / mysqldata/percona-server-locks-detail-5.7.22/include/mysql/psi/mysql_thread.h:1229#5 0x00000000014bb702 in MDL_wait::timed_wait (this=0x7ffedc007c08, owner=0x7ffedc007b70, abs_timeout=0x7fffec5bbb90, set_status_on_timeout=true Wait_state_name=0x2d897b0) at / mysqldata/percona-server-locks-detail-5.7.22/sql/mdl.cc:1899#6 0x00000000016cdb30 in TABLE_SHARE::wait_for_old_version (this=0x7ffee0a4fc30, thd=0x7ffedc007b70, abstime=0x7fffec5bbb90, deadlock_weight=100) at / mysqldata/percona-server-locks-detail-5.7.22/sql/table.cc:4717#7 0x000000000153829b in close_cached_tables (thd=0x7ffedc007b70, tables=0x0, wait_for_refresh=true) Timeout=31536000) at / mysqldata/percona-server-locks-detail-5.7.22/sql/sql_base.cc:1291#8 0x00000000016123ec in reload_acl_and_cache (thd=0x7ffedc007b70, options=16388, tables=0x0, write_to_binlog=0x7fffec5bc9dc) at / mysqldata/percona-server-locks-detail-5.7.22/sql/sql_reload.cc:224#9 0x00000000015cee9c in mysql_execute_command (thd=0x7ffedc007b70 First_level=true) at / mysqldata/percona-server-locks-detail-5.7.22/sql/sql_parse.cc:4433#10 0x00000000015d2fde in mysql_parse (thd=0x7ffedc007b70, parser_state=0x7fffec5bd600) at / mysqldata/percona-server-locks-detail-5.7.22/sql/sql_parse.cc:5901#11 0x00000000015c6b72 in dispatch_command (thd=0x7ffedc007b70, com_data=0x7fffec5bdd70, command=COM_QUERY) at / mysqldata/percona-server-locks-detail-5.7.22/sql/sql_parse.cc:1490

(3) other select operations wait for the stack frame after the kill point FTWRL session:

# 0 MDL_wait::timed_wait (this=0x7ffee8008298, owner=0x7ffee8008200, abs_timeout=0x7fffec58a600, set_status_on_timeout=true, wait_state_name=0x2d897b0) at / mysqldata/percona-server-locks-detail-5.7.22/sql/mdl.cc:1888#1 0x00000000016cdb30 in TABLE_SHARE::wait_for_old_version (this=0x7ffee0011620, thd=0x7ffee8008200, abstime=0x7fffec58a600, deadlock_weight=0) at / mysqldata/percona-server-locks-detail-5.7.22/sql/table.cc:4717#2 0x000000000153b6ba in tdc_wait_for_old_version (thd=0x7ffee8008200 Db=0x7ffee80014a0 "testmts", table_name=0x7ffee80014a8 "tii", wait_timeout=31536000, deadlock_weight=0) at / mysqldata/percona-server-locks-detail-5.7.22/sql/sql_base.cc:2957#3 0x000000000153ca97 in open_table (thd=0x7ffee8008200, table_list=0x7ffee8001708, ot_ctx=0x7fffec58aab0) at / mysqldata/percona-server-locks-detail-5.7.22/sql/sql_base.cc:3548#4 0x000000000153f904 in open_and_process_table (thd=0x7ffee8008200, lex=0x7ffee800a830, tables=0x7ffee8001708, counter=0x7ffee800a8f0, flags=0, prelocking_strategy=0x7fffec58abe0, has_prelocking_list=false Ot_ctx=0x7fffec58aab0) at / mysqldata/percona-server-locks-detail-5.7.22/sql/sql_base.cc:5213#5 0x0000000001540a58 in open_tables (thd=0x7ffee8008200, start=0x7fffec58aba0, counter=0x7ffee800a8f0, flags=0, prelocking_strategy=0x7fffec58abe0) at / mysqldata/percona-server-locks-detail-5.7.22/sql/sql_base.cc:5831#6 0x0000000001541e93 in open_tables_for_query (thd=0x7ffee8008200, tables=0x7ffee8001708) Flags=0) at / mysqldata/percona-server-locks-detail-5.7.22/sql/sql_base.cc:6606#7 0x00000000015d1dca in execute_sqlcom_select (thd=0x7ffee8008200, all_tables=0x7ffee8001708) at / mysqldata/percona-server-locks-detail-5.7.22/sql/sql_parse.cc:5416#8 0x00000000015ca380 in mysql_execute_command (thd=0x7ffee8008200, first_level=true) at / mysqldata/percona-server-locks-detail-5.7.22/sql/sql_parse.cc:2939#9 0x00000000015d2fde in mysql_parse (thd=0x7ffee8008200 Parser_state=0x7fffec58c600) at / mysqldata/percona-server-locks-detail-5.7.22/sql/sql_parse.cc:5901#10 0x00000000015c6b72 in dispatch_command (thd=0x7ffee8008200, com_data=0x7fffec58cd70, command=COM_QUERY) at / mysqldata/percona-server-locks-detail-5.7.22/sql/sql_parse.cc:1490

(4) the occupant releases the wake-up FTWRL stack frame:

Breakpoint 3, MDL_wait::set_status (this=0x7ffedc000c78, status_arg=MDL_wait::GRANTED) at / mysqldata/percona-server-locks-detail-5.7.22/sql/mdl.cc:18321832 bool was_occupied= TRUE (gdb) bt#0 MDL_wait::set_status (this=0x7ffedc000c78 Status_arg=MDL_wait::GRANTED) at / mysqldata/percona-server-locks-detail-5.7.22/sql/mdl.cc:1832#1 0x00000000016c2483 in free_table_share (share=0x7ffee0011620) at / mysqldata/percona-server-locks-detail-5.7.22/sql/table.cc:607#2 0x0000000001536a22 in table_def_free_entry (share=0x7ffee0011620) at / mysqldata/percona-server-locks-detail-5.7.22/sql/sql_base.cc:524#3 0x00000000018fd7aa in my_hash_delete (hash=0x2e4cfe0 Record=0x7ffee0011620 "\ 002") at / mysqldata/percona-server-locks-detail-5.7.22/mysys/hash.c:625#4 0x0000000001537673 in release_table_share (share=0x7ffee0011620) at / mysqldata/percona-server-locks-detail-5.7.22/sql/sql_base.cc:949#5 0x00000000016cad10 in closefrm (table=0x7ffee000f280 Free_share=true) at / mysqldata/percona-server-locks-detail-5.7.22/sql/table.cc:3597#6 0x0000000001537d0e in intern_close_table (table=0x7ffee000f280) at / mysqldata/percona-server-locks-detail-5.7.22/sql/sql_base.cc:1109#7 0x0000000001539054 in close_thread_table (thd=0x7ffee0000c00 Table_ptr=0x7ffee0000c68) at / mysqldata/percona-server-locks-detail-5.7.22/sql/sql_base.cc:1780#8 0x00000000015385fe in close_open_tables (thd=0x7ffee0000c00) at / mysqldata/percona-server-locks-detail-5.7.22/sql/sql_base.cc:1443#9 0x0000000001538d4a in close_thread_tables (thd=0x7ffee0000c00) at / mysqldata/percona-server-locks-detail-5.7.22/sql/sql_base.cc:1722#10 0x00000000015d19bc in mysql_execute_command (thd=0x7ffee0000c00 First_level=true) at / mysqldata/percona-server-locks-detail-5.7.22/sql/sql_parse.cc:5307#11 0x00000000015d2fde in mysql_parse (thd=0x7ffee0000c00, parser_state=0x7fffec5ee600) at / mysqldata/percona-server-locks-detail-5.7.22/sql/sql_parse.cc:5901#12 0x00000000015c6b72 in dispatch_command (thd=0x7ffee0000c00, com_data=0x7fffec5eed70) Command=COM_QUERY) at / mysqldata/percona-server-locks-detail-5.7.22/sql/sql_parse.cc:1490 "analyzing a strange blocking phenomenon of FTWRL in MySQL" ends here. Thank you for your reading. If you want to know more about the industry, you can follow the website, the editor will output more high-quality practical articles for you!

Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.

Views: 0

*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.

Share To

Wechat

© 2024 shulou.com SLNews company. All rights reserved.

12
Report