Updated 2025-01-19, from SLTechnology News&Howtos (Shulou.com)
This article walks through the InnoDB I/O path in the MySQL source code: where file flushes are issued, how they are serialized, and which of them affect commit latency.
InnoDB funnels every file flush through the os_file_flush macro, which expands to os_file_flush_func.
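To illustrate why a single macro is a convenient choke point, here is a minimal sketch. All demo_* names are hypothetical stand-ins, not InnoDB's real definitions; the only point is that every caller goes through one name, so changing the expansion changes every call site at once.

```cpp
#include <cassert>

// Hypothetical stand-in for InnoDB's file handle type (assumption for this sketch).
using demo_file_t = int;

// Counts how many flush requests passed through the single choke point.
static int demo_flush_calls = 0;

// Stand-in for os_file_flush_func(): the one place that would issue fsync().
// The real function does the system call; this sketch only counts invocations.
inline bool demo_file_flush_func(demo_file_t file) {
    assert(file >= 0);  // a valid handle is expected
    ++demo_flush_calls;
    return true;
}

// Callers only ever use the macro, mirroring how os_file_flush expands to
// os_file_flush_func in the article.
#define demo_file_flush(file) demo_file_flush_func(file)
```

Because callers never name the function directly, the expansion could be pointed at an instrumented wrapper without touching any call site.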
Let's first look at the places that call os_file_flush_func directly:
1. In buf_dblwr_init_or_load_pages, during crash recovery from the doublewrite buffer, when reset_space_ids is true
2. In fil_create_new_single_table_tablespace, when creating a table's data file
3. In fil_user_tablespace_restore_page, when copying a page back from the doublewrite buffer
4. In fil_tablespace_iterate, which performs a sync if iterating over the pages fails
5. In os_file_set_size, when initializing the size of a newly created file
As the list shows, direct calls to os_file_flush_func are rare. Most of the system goes through its higher-level wrapper, fil_flush, which is called from:
1. buf_dblwr_flush_buffered_writes, when flushing the doublewrite buffer to disk
2. buf_dblwr_write_single_page, when flushing a single page to disk
3. buf_flush_write_block_low, when flushing a block to disk; this usually happens after the doublewrite buffer has been flushed by buf_dblwr_flush_buffered_writes
4. buf_LRU_remove_pages, which syncs when the pages of a tablespace are removed
5. fil_rename_tablespace, when renaming a tablespace
6. fil_extend_space_to_desired_size, when extending a tablespace
7. fil_flush_file_spaces, when flushing the pages of a batch of tablespaces; it is called from ha_innodb for forced checkpoints, database shutdown, and so on
8. log_io_complete, when a log I/O completes (checkpoint writes and others); whether the flush happens depends on the sync setting and is independent of commit
9. log_write_up_to, when transactions commit. This is the key call site: it writes the InnoDB log file and syncs it
10. create_log_files_rename, when renaming a log file
11. innobase_start_or_create_for_mysql, when creating a new file
Inside fil_flush, the flush bookkeeping for every file is protected by fil_system->mutex. So whether it is a data file or a log file, flushes at the file level are serialized. The mechanism works as follows:
1. Check n_pending_flushes on the file node; if it is greater than 0, wait and retry until it drops to 0
2. Once it is 0, enter the flush phase: increment n_pending_flushes by 1 before the flush actually starts. This update is protected by the fil_system->mutex mentioned above
3. After os_file_flush returns, decrement n_pending_flushes by 1, again under fil_system->mutex
Here is the retry code in MySQL:
    retry:
        if (node->n_pending_flushes > 0) {
            /* We want to avoid calling os_file_flush() on
            the file twice at the same time, because we do
            not know what bugs OS's may contain in file
            i/o */
            ib_int64_t sig_count =
                os_event_reset(node->sync_event);
            mutex_exit(&fil_system->mutex);
            os_event_wait_low(node->sync_event, sig_count);
            mutex_enter(&fil_system->mutex);
            if (node->flush_counter >= old_mod_counter) {
                goto skip_flush;
            }
            goto retry;
        }
Here is the flush code in MySQL:
        ut_a(node->open);
        file = node->handle;
        node->n_pending_flushes++;
        mutex_exit(&fil_system->mutex);
        os_file_flush(file);
        mutex_enter(&fil_system->mutex);
        os_event_set(node->sync_event);
        node->n_pending_flushes--;
        node->flush_size = node->size;
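The retry and flush fragments can be modeled as one small, self-contained sketch. This is a simplified model, not InnoDB's code: DemoNode, demo_write and demo_flush are hypothetical names, a std::mutex plays the role of fil_system->mutex, and a std::condition_variable plays the role of node->sync_event. The real fsync is replaced by a counter so the behavior can be observed.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstdint>
#include <mutex>

// Hypothetical, simplified model of a file node (not the real fil_node_t).
struct DemoNode {
    std::mutex mutex;                       // stands in for fil_system->mutex
    std::condition_variable sync_event;     // stands in for node->sync_event
    int n_pending_flushes = 0;              // flushes currently in progress
    std::int64_t modification_counter = 0;  // bumped by every write
    std::int64_t flush_counter = 0;         // writes known to be flushed
    int os_flush_calls = 0;                 // how often the "fsync" really ran
};

// Record one write to the file.
inline void demo_write(DemoNode& node) {
    std::lock_guard<std::mutex> guard(node.mutex);
    ++node.modification_counter;
}

// Flush the node unless another flush already covered our writes.
inline void demo_flush(DemoNode& node) {
    std::unique_lock<std::mutex> lock(node.mutex);
    if (node.modification_counter <= node.flush_counter) {
        return;  // nothing new to flush
    }
    const std::int64_t old_mod_counter = node.modification_counter;

    // "retry": never run two flushes on the same file at the same time.
    while (node.n_pending_flushes > 0) {
        node.sync_event.wait(lock);  // releases the mutex while waiting
        if (node.flush_counter >= old_mod_counter) {
            return;  // "skip_flush": someone else flushed our writes
        }
    }

    ++node.n_pending_flushes;
    assert(node.n_pending_flushes == 1);
    lock.unlock();
    // The real code calls os_file_flush(file) here, outside the mutex.
    lock.lock();
    ++node.os_flush_calls;
    --node.n_pending_flushes;
    if (node.flush_counter < old_mod_counter) {
        node.flush_counter = old_mod_counter;
    }
    node.sync_event.notify_all();  // wake any waiters
}
```

Calling demo_flush twice after a single demo_write runs the simulated fsync only once, because the second call finds flush_counter already caught up. Note that the mutex is released while the flush itself runs, so what is serialized is the flush of one node, not the whole function.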
The same logic applies to log files. The tablespace that log files belong to is the log space. Log files are additionally protected by log_sys->mutex inside the log_write_up_to function.
Log file flushes converge in the body of log_write_up_to, which calls fil_flush and finally reaches os_file_flush_func.
When MySQL commits with innodb_flush_log_at_trx_commit=1, the synchronous log_write_up_to is called twice: once to flush the InnoDB log file and once as part of flushing the binlog.
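As background for the innodb_flush_log_at_trx_commit=1 case, the documented meaning of the three settings can be sketched as a small decision function. DemoCommitAction and demo_commit_action are hypothetical names for illustration; the real decision is made inside InnoDB on the commit path.

```cpp
#include <cassert>
#include <stdexcept>

// What a commit does with the redo log (illustrative type, not InnoDB's).
struct DemoCommitAction {
    bool write_log;  // write the log buffer out to the log file at commit
    bool flush_log;  // additionally sync the log file to disk at commit
};

// Maps innodb_flush_log_at_trx_commit to the work done at commit time.
inline DemoCommitAction demo_commit_action(int flush_log_at_trx_commit) {
    switch (flush_log_at_trx_commit) {
    case 0:
        // Nothing at commit; the log is written and flushed about once per second.
        return {false, false};
    case 1:
        // Write and flush at every commit (the case analyzed in this article).
        return {true, true};
    case 2:
        // Write at commit, flush about once per second.
        return {true, false};
    default:
        throw std::invalid_argument("setting must be 0, 1 or 2");
    }
}
```

Only the value 1 puts a synchronous flush on the commit path, which is why the rest of the analysis concentrates on it.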
If the binlog is turned off, the sync_binlog branch in the ordered_commit function is not taken. The execution paths of the InnoDB log file flush with the binlog off and on are shown below.
The pstack of the InnoDB flush without the binlog is as follows:
fsync,os_file_fsync,os_file_flush_func,fil_flush,log_write_up_to,trx_flush_log_if_needed_low,trx_flush_log_if_needed,trx_commit_complete_for_mysql,innobase_commit,ha_commit_low,TC_LOG_DUMMY::commit,ha_commit_trans,trans_commit_stmt,mysql_execute_command,mysql_parse,dispatch_command,do_command,do_handle_one_connection,handle_one_connection,start_thread,clone
The pstack of the flush with the binlog is as follows:
fsync,os_file_fsync,os_file_flush_func,fil_flush,log_write_up_to,innobase_flush_logs,flush_handlerton,plugin_foreach_with_mask,ha_flush_logs,MYSQL_BIN_LOG::process_flush_stage_queue,MYSQL_BIN_LOG::ordered_commit,MYSQL_BIN_LOG::commit,ha_commit_trans,trans_commit_stmt,mysql_execute_command,mysql_parse,dispatch_command,do_command,do_handle_one_connection,handle_one_connection,start_thread,clone
Judging from these call paths, log_write_up_to() and log_io_complete() deserve the most attention: their flushes are serialized at the file level, and this serialization accounts for most of the response time (RT) of a commit.
Next, let's look at the callers of log_write_up_to:
1. buf_flush_write_block_low, called during a forced flush to guarantee that the log reaches disk before the data. When dirty pages are flushed, page_cleaner_do_flush_batch() initiates the call to this function.
2. innobase_flush_logs, called at commit, mainly from the binlog flush branch
3. trx_flush_log_if_needed_low, called at commit, mainly to flush the InnoDB log file
4. log_buffer_flush_to_disk, which flushes the log buffer to disk
5. log_buffer_sync_in_background, where a background thread syncs the log buffer to disk; the srv_master_thread thread calls it through srv_sync_log_buffer_in_background(), once per second
6. log_flush_margin, mainly called to free up space in the log buffer
7. log_checkpoint, called when a checkpoint is taken; the srv_master thread calls it through srv_master_do_idle_tasks(), once per second
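The first caller in the list reflects the write-ahead rule: the log must be durable up to a page's newest change before the page itself is written. A minimal sketch of that rule follows, under simplifying assumptions: DemoLog, demo_log_write_up_to and demo_flush_page are hypothetical names, there is no locking, and a counter replaces the real fsync. It also shows why a call does not always end in a flush: if the requested LSN is already durable, nothing needs to be done.

```cpp
#include <cassert>
#include <cstdint>

using demo_lsn_t = std::uint64_t;

// Hypothetical, single-threaded model of the redo log state.
struct DemoLog {
    demo_lsn_t lsn = 0;                  // end of the log generated so far
    demo_lsn_t flushed_to_disk_lsn = 0;  // log is durable up to this point
    int fsync_calls = 0;                 // how often a flush was really needed
};

// Make the log durable up to at least `lsn`.
// Returns true if a flush was performed, false if it was already covered.
inline bool demo_log_write_up_to(DemoLog& log, demo_lsn_t lsn) {
    assert(lsn <= log.lsn);  // cannot flush log that was never generated
    if (log.flushed_to_disk_lsn >= lsn) {
        return false;  // an earlier flush already covers this request
    }
    ++log.fsync_calls;
    // Simplification: one flush makes everything generated so far durable.
    log.flushed_to_disk_lsn = log.lsn;
    return true;
}

// Write-ahead rule: flush the log up to the page's newest modification
// before the page itself may be written to its data file.
inline void demo_flush_page(DemoLog& log, demo_lsn_t page_newest_modification) {
    demo_log_write_up_to(log, page_newest_modification);
    assert(log.flushed_to_disk_lsn >= page_newest_modification);
    // ... the data page write would follow here ...
}
```

With the log at LSN 100, flushing a page modified at LSN 40 triggers one simulated fsync, and a later page modified at LSN 90 triggers none, since the first flush already covered it.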
The callers of log_io_complete():
1. fil_aio_wait, which calls it after the AIO wait when the completed I/O is a log I/O
From this analysis, the main impact on RT comes from log_sys->mutex contention caused by flushing dirty pages. In addition, log_buffer_sync_in_background and log_checkpoint are both called by the background srv_master thread once per second.
However, these two functions do not necessarily execute fil_flush, so they are not the main cause. Attaching gdb shows roughly the following stacks contending on fil_flush:
Breakpoint 13, fil_flush (space_id=4294967280, from=FLUSH_FROM_LOG_WRITE_UP_TO) at /u01/mysql-5.6/storage/innobase/fil/fil0fil.cc:6478
6478 {
(gdb) bt
#0 fil_flush (space_id=4294967280, from=FLUSH_FROM_LOG_WRITE_UP_TO) at /u01/mysql-5.6/storage/innobase/fil/fil0fil.cc:6478
#1 0x0000000000c890e5 in log_write_up_to (lsn=, wait=, flush_to_disk=1, caller=) at /u01/mysql-5.6/storage/innobase/log/log0log.cc:1674
#2 0x0000000000d79d26 in buf_flush_write_block_low (sync=false, flush_type=BUF_FLUSH_LIST, bpage=0x7fd706332330) at /u01/mysql-5.6/storage/innobase/buf/buf0flu.cc:902
#3 buf_flush_page (buf_pool=, bpage=0x7fd706332330, flush_type=BUF_FLUSH_LIST, sync=) at /u01/mysql-5.6/storage/innobase/buf/buf0flu.cc:1061
#4 0x0000000000d7a43e in buf_flush_try_neighbors (space=0, offset=offset@entry=5, flush_type=flush_type@entry=BUF_FLUSH_LIST, n_flushed=n_flushed@entry=1, n_to_flush=n_to_flush@entry=250) at /u01/mysql-5.6/storage/innobase/buf/buf0flu.cc:1271
#5 0x0000000000d7b1b1 in buf_flush_page_and_try_neighbors (flush_type=BUF_FLUSH_LIST, count=, n_to_flush=250, bpage=) at /u01/mysql-5.6/storage/innobase/buf/buf0flu.cc:1355
#6 buf_do_flush_list_batch (buf_pool=buf_pool@entry=0x1f817d8, min_n=min_n@entry=250, lsn_limit=lsn_limit@entry=18446744073709551615) at /u01/mysql-5.6/storage/innobase/buf/buf0flu.cc:1623
#7 0x0000000000d7b308 in buf_flush_batch (flush_type=BUF_FLUSH_LIST, lsn_limit=18446744073709551615, min_n=, buf_pool=0x1f817d8) at /u01/mysql-5.6/storage/innobase/buf/buf0flu.cc:1693
#8 buf_flush_list (min_n=, n_processed=n_processed@entry=0x7fc6dd7f9bd8, lsn_limit=18446744073709551615) at /u01/mysql-5.6/storage/innobase/buf/buf0flu.cc:1939
#9 0x0000000000d7c79b in page_cleaner_do_flush_batch (lsn_limit=18446744073709551615) at /u01/mysql-5.6/storage/innobase/buf/buf0flu.cc:2216
#10 buf_flush_page_cleaner_thread (arg=) at /u01/mysql-5.6/storage/innobase/buf/buf0flu.cc:2588
#11 0x00007fd950ef9dc5 in start_thread () from /lib64/libpthread.so.0
#12 0x00007fd94ef4a28d in clone () from /lib64/libc.so.6

Breakpoint 5, fil_flush (space_id=4294967280, from=FLUSH_FROM_LOG_IO_COMPLETE) at /u01/mysql-5.6/storage/innobase/fil/fil0fil.cc:6478
6478 {
(gdb) bt
#0 fil_flush (space_id=4294967280, from=FLUSH_FROM_LOG_IO_COMPLETE) at /u01/mysql-5.6/storage/innobase/fil/fil0fil.cc:6478
#1 0x0000000000c86c78 in log_io_complete (group=) at /u01/mysql-5.6/storage/innobase/log/log0log.cc:1239
#2 0x0000000000db5a4b in fil_aio_wait (segment=segment@entry=1) at /u01/mysql-5.6/storage/innobase/fil/fil0fil.cc:6463
#3 0x0000000000d09ba0 in io_handler_thread (arg=) at /u01/mysql-5.6/storage/innobase/srv/srv0start.cc:498
#4 0x00007fb818efddc5 in start_thread () from /lib64/libpthread.so.0
#5 0x00007fb816f4e28d in clone () from /lib64/libc.so.6
1: *node = {space = 0x30aa218, name = 0x30aa948 "/mnt/mysql-redo/my3308/data/ib_logfile0", open = 1, handle = 10, sync_event = 0x30aa980, is_raw_disk = 0, size = 262144, n_pending = 0, n_pending_flushes = 0, being_extended = 0, modification_counter = 49, flush_counter = 41, flush_size = 262144, chain = {prev = 0x0, next = 0x30aaa78}, LRU = {prev = 0x0, next = 0x0}, magic_n = 89389}
To sum up, there are the two fsync calls made by commit and one log_write_up_to() made by the page cleaner. In addition, when an asynchronous I/O finishes in fil_aio_wait and it is a log I/O, log_io_complete() performs one more.
These four fsync calls all affect user-visible RT. The two on the commit path are unavoidable; for the other two, the most one can do is tune their frequency. Could the way fsync is done also be changed? That is left for the reader to think about.
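As one possible direction for that question, and only as an illustration that the article itself does not propose, POSIX offers fdatasync() next to fsync(); fdatasync() may skip metadata updates that are not needed to read the data back. The sketch below assumes a POSIX system; DemoSyncMode, demo_sync and demo_append_durably are hypothetical helper names.

```cpp
#include <cassert>
#include <cerrno>
#include <cstring>
#include <fcntl.h>
#include <unistd.h>

// Which POSIX call to use for durability (illustrative enum).
enum class DemoSyncMode { kFsync, kFdatasync };

// Sync one descriptor, retrying if a signal interrupts the call.
inline bool demo_sync(int fd, DemoSyncMode mode) {
    int rc;
    do {
        rc = (mode == DemoSyncMode::kFdatasync) ? ::fdatasync(fd) : ::fsync(fd);
    } while (rc == -1 && errno == EINTR);
    return rc == 0;
}

// Append `data` to `path` and make it durable with the chosen call.
// Returns true only if the write, the sync and the close all succeeded.
inline bool demo_append_durably(const char* path, const char* data,
                                DemoSyncMode mode) {
    assert(path != nullptr && data != nullptr);
    const int fd = ::open(path, O_WRONLY | O_CREAT | O_APPEND, 0600);
    if (fd < 0) {
        return false;
    }
    const std::size_t len = std::strlen(data);
    const ssize_t written = ::write(fd, data, len);
    bool ok = (written == static_cast<ssize_t>(len));
    ok = ok && demo_sync(fd, mode);
    ok = (::close(fd) == 0) && ok;  // always close, even after a failure
    return ok;
}
```

Whether either call is actually cheaper depends on the file system and the storage device, so any change of this kind would need to be measured rather than assumed.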
This compares unfavorably with Oracle's implementation, which does not force the flush. As I recall, Oracle either lets the caller pass a flag or policy, or follows the redo log's own policy; its implementation appears to have taken this problem into account.