Redo log for Crash Recovey (crash recovery) parsing of MySQL 07/15 Update SLTechnology News&Howtos

Redo log for Crash Recovey (crash recovery) parsing of MySQL

2025-07-15 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Database >

Shulou(Shulou.com)06/01 Report--

The following content mainly brings you redo logs for MySQL Crash Recovey (crash recovery) analysis, the knowledge here is slightly different from books, are summed up by professional and technical personnel in the process of contact with users, have a certain experience sharing value, hope to bring help to the majority of readers.

In the physical file storage system of MySQL's InnoDB storage engine, in addition to the actual data files (ibd, ibdata), there are two very important log systems, namely redo log and undo log. Similar to Oracle, redo log records physical changes to the actual data file (where and how the data in the data file has changed). InnoDB also uses WAL (log priority), which means that the redo log has been dropped before the modification of the actual data file, so as to ensure the persistence of the transaction. Undo logs are used to ensure the atomicity and MVCC of transactions, and changes to data files produced by all undo operations are also recorded in the redo log.

In native MySQL, redo logs are not used for physical master-slave replication, and their main application scenarios are Crash Recovey (crash recovery) for MySQL. The crash recovery of MySQL InnoDB will be covered in a later article.

This paper mainly introduces the basic structure of redo Log based on MySQL 8.0.

1. Redo log file

Since MySQL 5.6, the log group feature has been abandoned (redo logs can be written in multiple copies), and there is an online view that the InnoDB development team may think that it might be better to use the outer storage hardware to ensure the integrity of the log group. At the same time, the archive log Archive of InnoDB has also been abandoned since 5.7. the archive log is used to archive all redo logs. In the redo log system, multiple log files of fixed size are written in a loop to store redo logs, and if full, the redo log will be looped to the beginning to start writing.

However, MySQL 8.0 introduces a mechanism called cloning, which seems to be used to remotely clone a copy of the current database from a code point of view, and a new Archive archiving mechanism is introduced in this mechanism. If you are interested, you can read the code in the storage\ innobase\ arch and storage\ innobase\ clone directories of the new version of MySQL8.0 source code.

1.1. Redo Log related parameters

Innodb_log_group_home_dir

This parameter is used to specify the path where the redo log is stored, and the log file is named after the ib_ log [number].

Innodb_log_files_in_group

Although MySQL has abandoned the concept of log groups, the parameter names are retained to be compatible with previous configurations. This parameter means how many log files there are (at least 2).

Innodb_log_file_size

Represents the size of each file.

Therefore, the total size of redo Log is innodb_log_files_in_group * innodb_log_file_size.

1.2. Redo Log cycle write

Redo Log writes the file sequentially, and when all files are full, it goes back to the corresponding starting position of the first file to overwrite (but when redo checkpoint is done, the header checkpoint tag of the first log file is also updated, so strictly speaking, it is not written sequentially). Within InnoDB, Redo Log is logically treated as a file corresponding to a space id (InnoDB organizes physical storage through the concept of space, including different tables Data dictionary, redo,undo, etc.).

The figure above is written in a loop with a specified innodb_log_files_in_group of 3.

2. Brief introduction of Redo Log storage format

Although Redo Log has multiple files, the composition of each file is the same, except that some data will only exist in the header of the first Log file (ib_logfile0), such as Buffer Pool flush checkpoint information will only be written in the header of the first log file.

2.1. Overview of log file storage structure

(ib_logfile0 storage overview chart)

(Overview of storage of redo log files other than ib_logfile0)

InnoDB also uses the mechanism of WAL to ensure the persistence of transactions. In a sense, redo logs are written sequentially and write very fast. After data modifications caused by database transactions enter the InnoDB storage layer, the records of these changes are stored in redo log, and then the changes of the data are written to Page Pool in memory. The background thread of InnoDB will drop dirty pages according to certain rules (such as timing or a certain proportion of dirty pages). After falling, it will record how many change logs in the current redo Log have been actually stored in the actual data space file. The total number of redo log writes is called the LSN (Log Secquence Numer) log sequence number, and the number of redo log changes actually written to the actual data file is called checkpoint LSN, indicating how many changes have actually been written to the corresponding data file. Once the database crashes and InnoDB starts to recover data, read checkpoint first, and then read the Redo log from the LSN indicated by checkpoint for data recovery, thus reducing the time of Crash Recovery.

Comparing the previous two overview diagrams, you can see that the checkpoint information is only stored in the first log file header. At the same time, we see that there are two checkpoint block fields in the log header. InnoDB uses two checkpoint to write in turn to ensure the security of the checkpoint (not two checkpoint at a time, but in turn). Also because Redo log is idempotent, it is the same to apply once and twice (in practical application Redo, if the lsn of the current log is greater than the lsn of the current page, it means that this log has not been applied to the current page). Therefore, even if a checkpoint block write fails, the recovery of the crash from the last recorded checkpoint point will recover the database transaction correctly.

2.2. Log File Header

(Log File Header storage structure)

Log Header Format:

This field was used to identify which log group the current Log file belongs to, but now the new meaning is to identify the current Log as the format version of the file. For example, a value of 0 means that the redo log was generated by a MySQL prior to 5.7.9.

PAD:

It doesn't mean anything, it's just used to do some alignment at the moment.

Start LSN:

This field is used in Clone and Archive scenarios and has nothing to do with general persistence and crash recovery, which is not discussed here.

Creator:

What is stored is the string of the name of the creator of the log file.

Left Bytes:

It doesn't mean anything right now, it's just used to fill the placeholder so that the block can reach 512 bytes.

2.3. Log Checkpoint

(Checkpoint block storage overview)

Checkpoint number:

Checkpoint number can be understood as the number of times the disk is written in the checkpoint domain, each time the disk is written by 1. At the same time, the value of module 2 can be used to realize two checkpoint domains to write in turn.

Checkpoint LSN:

This field indicates that log records smaller than this Checkpoint LSN have been written to the actual data file, and the Crash Recovery system starts data recovery from the first MTR record after the Checkpoint LSN.

Checkpoint offset:

Checkpoint LSN corresponds to the file offset in Log files, which is used to calibrate the conversion between LSN and Offset.

Buf size:

Only this field is written within the MySQL system and is read and then processed accordingly. It identifies the current Log buffer size of the system.

Left bytes:

It doesn't mean anything right now, it's just used to fill the placeholder so that the block can reach 512 bytes. But here the last four bytes are used to hold the Checksum of the checkpoint field.

2.4. Storage format of Log Block

(storage format of Log Block)

Log Block Number:

The number of the Log Block, incrementing from 1, reaching the maximum value (0x3FFFFFFF+1), and then continuing from 1.

Data length:

The number of bytes written to the current block, including the size of the header 12 bytes

Firsrt Record offset

The initial offset of the first mtr record in this Block

Log Block Checkpoint number

The checkpoint no in which the block is located

Log Records:

Multiple mtr records can be stored in a block, and the same mtr record can span multiple block.

2.5. Mapping of redo log logical format to physical format

The figure above shows the mapping from logical structure to physical structure with specified innodb_log_files_in_group as 2.

In the image above, the upper layer is the logical structure of Redolog, which can be regarded as the log buffer in memory, and the lower layer is the actual physical file storage of redo Log. Due to the layout, we take innodb_log_files_in_group as 2 as an example. Each log file contains only 2 log block (the number of Log block depends on the set innodb_log_file_size).

Every time a mtr record is generated, it will be append to log buffer, and when the log buffer disk is down, it will get a fixed size of data and write it to the data field of block. Of course, if the remaining data in the buffer is not enough to fill the data field of a block, it will also be written to a new block, the insufficient data will automatically be uneven, and the data length field in the block header will indicate the amount of valid data.

2.6. Introduction to MTR

MTR is the abbreviation of Mini-transaction, which literally means small things. Compared with logical things, we call them physical things. Belongs to the underlying module of the Innodb storage engine. Mainly used for locks and log information. The upper module within InnoDB converts transaction operations into several MTR physical transactions. As for how the upper-level transaction operations are converted into MTR this operation will be described separately later. This article only introduces the format of MTR records.

Each MTR operation produces a MTR Record, and we'll talk about the format of the MTR record in the next section.

2.7. MTR record format

In colloquial terms, a MTR record indicates what value has been made to an offset within which page of which data file (space id) and which (page).

(common format for a MTR record)

Type:

Type of MTR record

Space ID:

Which data file was modified by the MTR record

Page Number

Which page was modified by the MTR record

Record Payload:

Depending on the Type, Payload content varies in format and size. A storage structure in which Type is MLOG_COMP_REC_INSERT is given later.

(MTR Type-MLOG_COMP_REC_INSERT)

Finally, some MTR Record Type are listed. The reader should be able to see the meaning of this Type from the name.

/ one byte is written * /

MLOG_1BYTE = 1

/ * 2 bytes. /

MLOG_2BYTES = 2

4 bytes. * /

MLOG_4BYTES = 4

8 bytes. * /

MLOG_8BYTES = 8

/ * Record insert /

MLOG_REC_INSERT = 9

/ Mark clustered index record deleted * /

MLOG_REC_CLUST_DELETE_MARK = 10

/ Mark secondary index record deleted * /

MLOG_REC_SEC_DELETE_MARK = 11

/ * update of a record, preserves record field sizes /

MLOG_REC_UPDATE_IN_PLACE = 13

/! < Delete a record from a page /

MLOG_REC_DELETE = 14

/ Delete record list end on index page * /

MLOG_LIST_END_DELETE = 15

/ Delete record list start on index page * /

MLOG_LIST_START_DELETE = 16

/ * Copy record list end to a new created index page /

MLOG_LIST_END_COPY_CREATED = 17

/ Reorganize an index page in ROW_FORMAT=REDUNDANT * /

MLOG_PAGE_REORGANIZE = 18

/ Create an index page * /

MLOG_PAGE_CREATE = 19

/ * mark a compact index record as the predefined minimum record /

MLOG_COMP_REC_MIN_MARK = 36

/ create a compact index page * /

MLOG_COMP_PAGE_CREATE = 37

/ * compact record insert /

MLOG_COMP_REC_INSERT = 38

/ * mark compact clustered index record deleted /

MLOG_COMP_REC_CLUST_DELETE_MARK = 39

For the above redo log for MySQL Crash Recovey (crash recovery) analysis, if you need to know more, you can continue to pay attention to the innovation of our industry, if you need to get professional answers, you can contact the pre-sales and after-sales on the official website. I hope this article can bring you some knowledge updates.

Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.

*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.