InnoDB Redo Log: design principles and source code


2025-07-11 Update. From: SLTechnology News&Howtos

Shulou (Shulou.com), 05/31 report


This article focuses on how the InnoDB Redo Log works. The Redo Log is the key to how InnoDB achieves data consistency and durable storage. The main points are summarized below, starting from the design principles and walking through part of the source code.

Redo Log Buffer & Redo Log File

As described in the previous two articles, InnoDB uses the Redo Log to ensure data consistency and persistence, and it follows the WAL (write-ahead logging) mechanism: the log is written before the data. Specifically, when InnoDB performs a write operation, it first records the operation in the log buffer, then flushes the contents of the log buffer to the log file on disk, and only afterwards writes the modified data to the .ibd data file. This last step is driven by checkpoint. The data flow is shown in the following figure.

One more detail is worth noting: although a transaction is only considered complete once it has also been written successfully to the binlog file, from the perspective of the storage layer alone, the data files plus the Redo Log File at any given moment form a complete snapshot of the database at that moment. Physical backup tools such as Xtrabackup take full backups by copying the data files and the Redo Log File.

Redo Log metadata and its initialization process

InnoDB uses the object log_sys to manage the Redo Log Buffer. Its type is the structure log_t, which is defined in the source code as follows (some members are omitted).

log_t structure definition (/innobase/include/log0log.h)

log_sys mainly includes the following metadata:

One very important concept in the log_sys metadata is the LSN.

The LSN (Log Sequence Number) is the sequence number of the Redo Log, and it increases monotonically. Each time a Redo Log record is written, the LSN increases by the number of bytes written. The LSN therefore works like a timestamp that records the order in which Redo Log records are generated, and each LSN corresponds to exactly one position in the Redo Log. How to convert between an LSN and a Redo Log position is described later.
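To make the "LSN grows by the bytes written" idea concrete, here is a minimal sketch in C. The type toy_log_t and the function toy_log_append() are hypothetical names used only for illustration, not InnoDB code, and the sketch deliberately ignores any block header or trailer bytes that a real implementation may also count.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t lsn_t;

/* Hypothetical toy log: it only tracks the next LSN to hand out. */
typedef struct {
    lsn_t lsn;   /* LSN at which the next record will start */
} toy_log_t;

/* Append a record of len bytes and return the LSN at which it starts. */
static lsn_t toy_log_append(toy_log_t *log, uint32_t len)
{
    lsn_t start = log->lsn;

    assert(len > 0);
    log->lsn += len;   /* the LSN advances by the number of bytes written */
    return start;
}
```

Because every append moves the counter forward by the record length, two records never share a start LSN, and comparing two LSNs tells you which record was generated first.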

When InnoDB starts, the log_sys object is initialized by the log_init() function, which mainly assigns initial values to the metadata. Part of the implementation is as follows:

log_init() function implementation (/innobase/log/log0log.cc)

Another important step in log_init() is initializing the first log block, log_block. A log_block is the smallest unit of Redo Log management, with a size of 512 bytes. There is no separate structure for managing a log_block; its metadata lives in its first 12 bytes, called the log_block_header. A log_block also ends with a trailer, log_block_tail, which is 4 bytes long (LOG_BLOCK_TRL_SIZE in log0log.h), so each log_block has 512 - 12 - 4 = 496 bytes of actual storage space. The structure of log_sys and log_block is shown in the following figure.

As noted in the figure above, the log_block_header contains the following information:

The log_block_init() function does the following:

log_block_init() function implementation (/innobase/include/log0log.ic)

First, the number of the log_block, log_block_hdr_no, is calculated from the lsn value of the log_block. As mentioned earlier, log blocks are not organized by any other data structure, so this number is what identifies the position of the log_block in the log buffer. The conversion from lsn to log_block_hdr_no is implemented by the function log_block_convert_lsn_to_no(), whose logic is as follows:

    UNIV_INLINE
    ulint
    log_block_convert_lsn_to_no(
        lsn_t lsn)  /*!< in: lsn of a byte within the block */
    {
        return(((ulint) (lsn / OS_FILE_LOG_BLOCK_SIZE) & 0x3FFFFFFFUL) + 1);
    }

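The logic is simple. Logs are stored in blocks of OS_FILE_LOG_BLOCK_SIZE (512 bytes) and lsn increases monotonically, so an lsn is a multiple of the block size plus the offset within the current block. Dividing by the block size gives the index of the block; the result is masked to 30 bits and incremented by 1, so block numbers start at 1.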
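The arithmetic can be tried out with a small standalone sketch. The names prefixed with toy_ are hypothetical, and TOY_LOG_BLOCK_SIZE stands in for OS_FILE_LOG_BLOCK_SIZE; only the block-number formula is taken from the function shown above.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_LOG_BLOCK_SIZE 512u   /* stands in for OS_FILE_LOG_BLOCK_SIZE */

typedef uint64_t lsn_t;

/* Block number as in the function above: 30-bit quotient, then +1. */
static uint32_t toy_lsn_to_block_no(lsn_t lsn)
{
    return (uint32_t)((lsn / TOY_LOG_BLOCK_SIZE) & 0x3FFFFFFFu) + 1u;
}

/* Byte offset of lsn inside its 512-byte block. */
static uint32_t toy_lsn_to_block_offset(lsn_t lsn)
{
    return (uint32_t)(lsn % TOY_LOG_BLOCK_SIZE);
}

/* Rebuild an LSN from (block number, offset). This inverts the two
   functions above only while lsn / 512 still fits in 30 bits. */
static lsn_t toy_block_to_lsn(uint32_t block_no, uint32_t offset)
{
    assert(block_no >= 1 && offset < TOY_LOG_BLOCK_SIZE);
    return (lsn_t)(block_no - 1) * TOY_LOG_BLOCK_SIZE + offset;
}
```

For example, an lsn of 8204 lies 12 bytes into the block that starts at 8192 = 16 * 512, so it maps to block number 17 and offset 12.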

After the log_block number is calculated, the log_block_set_hdr_no() function records it in the first four bytes of the log_block. Then the log_block_set_data_len() function sets the amount of space already used in the log_block to its default, LOG_BLOCK_HDR_SIZE, that is, 12 bytes.

When logs are actually recorded, a single log write may be larger than 512 bytes and span several consecutive log blocks, and conversely a single log_block may contain the contents of several logs. For this reason, the offset of the first log that starts in the log_block is recorded in log_block_first_rec_group. During initialization this value is also set to LOG_BLOCK_HDR_SIZE.
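The header initialization described above can be sketched as follows. This is an illustrative toy, not the InnoDB implementation: only the position of the block number (the first four bytes) and the 12-byte header size come from the text, while the offsets of the data-length and first-record-group fields, their 2-byte width, the big-endian encoding, and the zeroing of the block are assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TOY_BLOCK_SIZE     512u
#define TOY_HDR_SIZE       12u   /* LOG_BLOCK_HDR_SIZE in the text */
#define TOY_OFF_HDR_NO     0u    /* 4 bytes, per the text */
#define TOY_OFF_DATA_LEN   4u    /* 2 bytes, assumed offset */
#define TOY_OFF_FIRST_REC  6u    /* 2 bytes, assumed offset */

static void put_be16(uint8_t *p, uint16_t v)
{
    p[0] = (uint8_t)(v >> 8);
    p[1] = (uint8_t)(v & 0xFF);
}

static void put_be32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v >> 24);
    p[1] = (uint8_t)(v >> 16);
    p[2] = (uint8_t)(v >> 8);
    p[3] = (uint8_t)v;
}

static uint16_t get_be16(const uint8_t *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

static uint32_t get_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16)
         | ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Initialize the header of a fresh 512-byte block that starts at lsn. */
static void toy_block_init(uint8_t *block, uint64_t lsn)
{
    uint32_t no = (uint32_t)((lsn / TOY_BLOCK_SIZE) & 0x3FFFFFFFu) + 1u;

    assert(block != NULL);
    memset(block, 0, TOY_BLOCK_SIZE);                    /* toy convenience */
    put_be32(block + TOY_OFF_HDR_NO, no);                /* block number */
    put_be16(block + TOY_OFF_DATA_LEN, TOY_HDR_SIZE);    /* only the header is used */
    put_be16(block + TOY_OFF_FIRST_REC, TOY_HDR_SIZE);   /* first log starts after the header */
}
```

Keeping the metadata inside the block itself means that any 512-byte block read back from the log can be interpreted on its own, without consulting a separate index.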

The entire log_init() process described above runs during the initialization of the InnoDB storage engine. When learning a complex system, it often helps to start from its initialization process. It is the same as learning a file system: starting from xxxfs_init(), you can quickly understand how the super_block is initialized and how inodes are organized. Learning InnoDB works the same way.

InnoDB initialization is implemented by the function innobase_start_or_create_for_mysql(), which is responsible for initializing all the metadata InnoDB needs, such as the Buffer Pool structures, the LRU lists, the various file data structures, the log data structures and the lock information, and for setting InnoDB's parameter variables. Besides the log_init() process discussed here, it also includes fil_init(), buf_pool_init(), fsp_init(), buf_flush_page_cleaner_init(), and so on. For example, buf_pool_init() initializes the buf_pool_t structure of each Buffer Pool Instance, including the buffer pool chunks, the LRU lists and various pointer variables.

CheckPoint & LSN

As mentioned earlier, writing Buffer Pool data to disk is carried out by checkpoint. Checkpoint is responsible for synchronizing in-memory data to the data files on disk to ensure data consistency. How far checkpoint has progressed determines how long crash recovery takes: during recovery, only the Redo Log between the last checkpoint and the most recent log record needs to be replayed to bring the database back to its state before the crash. The LSN records the order in which Redo Log is generated. Writing logs into the log buffer and flushing the log buffer to the log file both advance in LSN order, and checkpoint also flushes to disk in LSN order. By executing show engine innodb status in MySQL, you can see:

Because the system has few writes, the lsn generated so far is the same as the lsn that has been flushed to the Redo Log File. The flushing of data pages in the Buffer Pool and the checkpoint, however, lag behind.
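The relationship between these LSN values can be expressed as a small sketch. The structure toy_lsn_status_t and its field names are hypothetical and do not mirror the real log_t members; the ordering checked below (checkpoint not ahead of the flushed log, flushed log not ahead of the generated log) is the assumption this toy model is built on.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t lsn_t;

/* Hypothetical snapshot of the LSN values discussed above. */
typedef struct {
    lsn_t lsn;               /* latest LSN generated in the log buffer */
    lsn_t flushed_to_disk;   /* redo is in the log file up to this LSN */
    lsn_t last_checkpoint;   /* LSN of the last completed checkpoint */
} toy_lsn_status_t;

/* Assumed ordering: checkpoint <= flushed log <= generated log. */
static int toy_status_is_ordered(const toy_lsn_status_t *s)
{
    return s->last_checkpoint <= s->flushed_to_disk
        && s->flushed_to_disk <= s->lsn;
}

/* Bytes of redo that crash recovery would have to replay. */
static lsn_t toy_recovery_bytes(const toy_lsn_status_t *s)
{
    assert(toy_status_is_ordered(s));
    return s->flushed_to_disk - s->last_checkpoint;
}

/* Bytes of redo still sitting only in the log buffer. */
static lsn_t toy_unflushed_bytes(const toy_lsn_status_t *s)
{
    assert(toy_status_is_ordered(s));
    return s->lsn - s->flushed_to_disk;
}
```

In the situation described above, the unflushed amount would be zero while the recovery range is still positive, which is exactly what "checkpoint lags behind" means.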

Group Commit

InnoDB allows the Redo Log generated by multiple transactions to be flushed together in order to reduce disk I/O. First, consider the difference between the Redo Log and the Binlog. The Redo Log is specific to the InnoDB storage layer and records physical changes to page data, while the Binlog is a logical log recorded at the MySQL server layer regardless of storage engine. The Binlog is written only once, when a transaction commits, whereas the Redo Log is flushed at several different times. First, when a transaction commits, a flush is performed if the parameter innodb_flush_log_at_trx_commit is set to 1. Second, when space in the Redo Log Buffer runs low, the system forces a flush. In addition, a background thread performs a flush once per second.

With multiple concurrent transactions, the log that one transaction has generated so far (and already written to the log buffer) may be carried into the log file by the commit of another transaction, or flushed by the background thread. This is how the Group Commit effect is achieved. It also shows that the log generated by a transaction is not written to the log file all at once at commit time, but is written to the log file continuously while the transaction is running.
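The Group Commit effect can be illustrated with a single-threaded toy model. The type toy_redo_t and its functions are hypothetical names for illustration only; the real implementation involves mutexes, partial writes and fsync calls that are left out here.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t lsn_t;

/* Hypothetical toy redo log with a counter for simulated disk flushes. */
typedef struct {
    lsn_t    lsn;           /* end of the data in the log buffer */
    lsn_t    flushed_lsn;   /* the log file is durable up to this LSN */
    unsigned flush_count;   /* number of simulated disk flushes */
} toy_redo_t;

/* Append len bytes to the log buffer; return the end LSN of the record. */
static lsn_t toy_append(toy_redo_t *r, uint32_t len)
{
    assert(len > 0);
    r->lsn += len;
    return r->lsn;
}

/* Make the log durable up to at least target. One flush writes everything
   buffered so far, including log generated by other transactions. */
static void toy_flush_up_to(toy_redo_t *r, lsn_t target)
{
    assert(target <= r->lsn);
    if (target <= r->flushed_lsn) {
        return;   /* already covered by an earlier flush */
    }
    r->flushed_lsn = r->lsn;
    r->flush_count++;
}
```

If three transactions append their log and then commit one after another, the first commit flushes the whole buffer, and the other two find their LSN already covered, so three commits cost a single disk flush in this model.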
