What is InnoDB's Checkpoint technology?

2025-04-08 Update From: SLTechnology News&Howtos shulou

What is InnoDB's Checkpoint technology? The question comes up often in day-to-day study and work. The following overview is offered as a reference.

In a word, Checkpoint technology is the operation of flushing dirty pages in the buffer pool back to disk at particular points in time.

The problem

We all know that the buffer pool exists to bridge the speed gap between the CPU and the disk, so that reads and writes do not each require a disk I/O operation. With the buffer pool, all page operations are performed in the buffer pool first.

For example, when a DML statement updates or deletes data, the records in the buffer pool page are changed. Because the page in the buffer pool is now newer than the copy on disk, it is called a dirty page.

Sooner or later, the modified page in memory must be flushed back to disk, which raises a few issues:

If every page change were flushed to disk immediately, the overhead would be very large.

If the hot data is concentrated in a few pages, database performance would become very poor.

If the database crashes while a new version of a page is being flushed from the buffer pool to disk, the data cannot be recovered.

Write Ahead Log (WAL)

The WAL policy solves the problem of data loss caused by a crash while page data is being flushed to disk. It is a family of techniques used to provide atomicity and durability (two of the ACID properties) in relational database systems.

The core of the WAL strategy is the redo log: whenever a transaction commits, its redo log records are written to disk first, and the modified buffer pool pages are flushed only later, so that after a power failure the system can recover the committed changes on restart and continue to operate.

How the WAL policy works

InnoDB maintains a redo log to ensure that data is not lost. When a data page in the buffer pool is modified, the change is also recorded in the redo log, and the redo log record is guaranteed to reach disk before the corresponding data page does. This is the WAL policy.

When a failure wipes out the data in memory, InnoDB replays the redo log on restart to restore the buffer pool data pages to their pre-crash state.
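The rule above can be illustrated with a small toy model. This is only a sketch under simplifying assumptions: the class `TinyWal` and its fields are hypothetical names, every update is treated as an immediately committed transaction, and pages hold a single value. It is not how InnoDB is implemented.

```python
class TinyWal:
    """Toy write-ahead log: redo records reach 'disk' before data pages do."""

    def __init__(self):
        self.disk_pages = {}    # durable data pages: page_id -> value
        self.disk_redo = []     # durable redo log: (lsn, page_id, value)
        self.buffer_pool = {}   # volatile pages held in memory
        self.next_lsn = 1

    def update(self, page_id, value):
        # 1. Persist the redo record first (the WAL rule).
        self.disk_redo.append((self.next_lsn, page_id, value))
        self.next_lsn += 1
        # 2. Only then change the in-memory page; it is now dirty.
        self.buffer_pool[page_id] = value

    def crash(self):
        # Memory is lost; the disk pages and the redo log survive.
        self.buffer_pool = {}

    def recover(self):
        # Rebuild the pre-crash pages by replaying redo records in LSN order.
        self.buffer_pool = dict(self.disk_pages)
        for _lsn, page_id, value in self.disk_redo:
            self.buffer_pool[page_id] = value


wal = TinyWal()
wal.update("p1", "a")
wal.update("p1", "b")
wal.update("p2", "c")
wal.crash()
wal.recover()
print(wal.buffer_pool)  # {'p1': 'b', 'p2': 'c'}
```

Note that no data page was ever written to `disk_pages` in this run, yet nothing was lost: the redo log alone was enough to rebuild the pages.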

Checkpoint

In theory, with the WAL strategy, we can rest easy. But the problem lies in the redo log:

The redo log cannot be infinitely large, so it cannot hold our changes forever while they wait to be flushed to disk together.

During crash recovery, if the redo log is too large, the cost of recovery is also very high.

Checkpoint technology therefore addresses the performance of flushing dirty pages: it decides when, and under what circumstances, dirty pages are flushed.

The purpose of Checkpoint

1. Shorten the recovery time of the database

When the database recovers from a crash, it does not need to redo all the log information, because the data pages modified before the Checkpoint have already been flushed back to disk. Only the redo log written after the Checkpoint is needed for recovery.

2. Flush dirty pages to disk when the buffer pool is insufficient.

When the buffer pool runs out of space, the least recently used page is evicted according to the LRU algorithm. If that page is dirty, a Checkpoint must be forced to flush the dirty page, that is, the new version of the page, back to disk.

3. Flush dirty pages when the redo log is not available

The redo log can become unavailable because it is designed to be reused in a circular fashion, so its space is not infinite.

When the redo log is full, the system cannot accept updates, so all update statements are blocked.

At this point, a Checkpoint must be forced. The checkpoint position needs to be pushed forward to make room for write pos, and all dirty pages covered by the redo records in the range it passes over need to be flushed to disk.
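The interaction between the write position and the checkpoint, and its effect on recovery time, can be sketched as follows. This is a minimal model, not InnoDB code: the class `CircularRedoLog`, its method names, and the use of record counts instead of byte offsets are all assumptions made for illustration.

```python
class CircularRedoLog:
    """Toy fixed-size redo log with a write position and a checkpoint."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.write_pos = 0    # total number of records ever written
        self.checkpoint = 0   # records before this point may be overwritten

    def free_slots(self):
        return self.capacity - (self.write_pos - self.checkpoint)

    def records_to_replay(self):
        # Crash recovery only has to replay records after the checkpoint.
        return self.write_pos - self.checkpoint

    def append(self):
        """Write one record; return False if the log is full (update blocks)."""
        if self.free_slots() == 0:
            return False
        self.write_pos += 1
        return True

    def advance_checkpoint(self, n):
        """Model flushing the dirty pages covered by the next n records."""
        n = min(n, self.write_pos - self.checkpoint)
        self.checkpoint += n
        return n


log = CircularRedoLog(capacity=4)
results = [log.append() for _ in range(5)]
print(results)                  # [True, True, True, True, False]
print(log.records_to_replay())  # 4
log.advance_checkpoint(2)       # flush pages, push the checkpoint forward
print(log.free_slots())         # 2
print(log.records_to_replay())  # 2
print(log.append())             # True
```

The fifth append fails because the write position has caught up with the checkpoint. Advancing the checkpoint both frees log space and shortens the part of the log that recovery would have to replay.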

Types of Checkpoint

The timing, the trigger conditions, and the selection of dirty pages for a Checkpoint are all complicated:

How many dirty pages does a Checkpoint flush to disk each time?

Where does a Checkpoint take its dirty pages from each time?

When is a Checkpoint triggered?

To address these questions, the InnoDB storage engine internally provides two kinds of Checkpoint:

Sharp Checkpoint

Flushes all dirty pages back to disk when the database is shut down. This is the default behavior, corresponding to the parameter innodb_fast_shutdown=1.

Fuzzy Checkpoint

This mode is used internally by the InnoDB storage engine while it is running. It flushes only part of the dirty pages instead of flushing all of them back to disk.

When a Fuzzy Checkpoint occurs

Master Thread Checkpoint

Roughly every second or every ten seconds, the Master Thread flushes a certain percentage of pages from the buffer pool's list of dirty pages back to disk.

This process is asynchronous: the InnoDB storage engine can perform other operations in the meantime, and user query threads are not blocked.
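One round of this periodic flushing might look like the sketch below. The function name `flush_batch`, the percentage value, and the choice to flush the oldest pages first are assumptions for illustration; the article does not say how the percentage is chosen or in which order pages are picked.

```python
import math


def flush_batch(dirty_list, percent):
    """Flush the first `percent` of dirty_list; return (flushed, remaining).

    dirty_list is assumed to be ordered oldest modification first.
    """
    count = math.ceil(len(dirty_list) * percent / 100)
    return dirty_list[:count], dirty_list[count:]


dirty = ["p1", "p2", "p3", "p4", "p5", "p6", "p7", "p8", "p9", "p10"]
flushed, dirty = flush_batch(dirty, percent=20)
print(flushed)     # ['p1', 'p2']
print(len(dirty))  # 8
```

In a real server this would run on a background thread on a timer, which is what keeps user queries from waiting on it.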

FLUSH_LRU_LIST Checkpoint

InnoDB needs to keep a certain number of free pages available in the LRU list. If there are not enough, pages are removed from the tail of the list, and if any of the removed pages are dirty, this Checkpoint is performed to flush them.

Since MySQL 5.6, this Checkpoint is carried out in a separate Page Cleaner thread, and the user can control the number of pages available in the LRU list through the parameter innodb_lru_scan_depth, which defaults to 1024.
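The eviction path can be sketched with a small LRU structure. This is a toy model under stated assumptions: `LruBufferPool` is a hypothetical class, eviction happens one page at a time on demand, and the Page Cleaner thread and innodb_lru_scan_depth are not modeled.

```python
from collections import OrderedDict


class LruBufferPool:
    """Toy buffer pool: evicts the least recently used page when full."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.pages = OrderedDict()  # page_id -> (value, dirty); oldest first
        self.disk = {}              # stands in for the data files

    def write(self, page_id, value):
        self._make_room(page_id)
        self.pages[page_id] = (value, True)  # modified in memory: dirty
        self.pages.move_to_end(page_id)      # now the most recently used

    def _make_room(self, page_id):
        if page_id in self.pages or len(self.pages) < self.capacity:
            return
        victim_id, (value, dirty) = self.pages.popitem(last=False)
        if dirty:
            # A dirty victim must be flushed before its frame is reused.
            self.disk[victim_id] = value


pool = LruBufferPool(capacity=2)
pool.write("p1", "a")
pool.write("p2", "b")
pool.write("p3", "c")    # pool is full, so p1 (least recently used) is evicted
print(pool.disk)         # {'p1': 'a'}
print(list(pool.pages))  # ['p2', 'p3']
```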

Async/Sync Flush Checkpoint

When the redo log file is not available, some pages must be forcibly flushed back to disk. The pages to flush are selected from the list of dirty pages.

Since version 5.6, user queries are no longer blocked by this flush.
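The choice between asynchronous and synchronous flushing is usually described in terms of how much of the redo log is still needed, that is, the distance between the newest log record and the last checkpoint. The sketch below shows that decision. The function `flush_mode` and the 75% and 90% thresholds are illustrative assumptions, not values given in this article or guaranteed for any particular MySQL version.

```python
def flush_mode(redo_lsn, checkpoint_lsn, log_capacity,
               async_ratio=0.75, sync_ratio=0.90):
    """Pick a flush mode from the amount of redo log not yet checkpointed.

    async_ratio and sync_ratio are assumed thresholds for illustration.
    """
    checkpoint_age = redo_lsn - checkpoint_lsn
    if checkpoint_age >= sync_ratio * log_capacity:
        return "sync"    # log nearly exhausted: flush before going on
    if checkpoint_age >= async_ratio * log_capacity:
        return "async"   # log filling up: flush in the background
    return "none"        # plenty of reusable log space


print(flush_mode(500, 0, 1000))  # none
print(flush_mode(800, 0, 1000))  # async
print(flush_mode(950, 0, 1000))  # sync
```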

Dirty Page Too Much Checkpoint

This occurs when there are too many dirty pages, which causes the InnoDB storage engine to force a Checkpoint.

In general, the purpose is to ensure that there are enough pages available in the buffer pool.

It is controlled by the parameter innodb_max_dirty_pages_pct. For example, a value of 75 means that a Checkpoint is forced when dirty pages account for 75% of the buffer pool.
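The threshold check itself is simple arithmetic. A minimal sketch, assuming a hypothetical helper `needs_dirty_page_checkpoint` whose `max_dirty_pages_pct` argument stands in for innodb_max_dirty_pages_pct:

```python
def needs_dirty_page_checkpoint(dirty_pages, total_pages, max_dirty_pages_pct=75):
    """Return True when the dirty-page share reaches the configured limit."""
    if total_pages <= 0:
        raise ValueError("total_pages must be positive")
    dirty_pct = 100 * dirty_pages / total_pages
    return dirty_pct >= max_dirty_pages_pct


print(needs_dirty_page_checkpoint(600, 1000))  # False (60% dirty)
print(needs_dirty_page_checkpoint(750, 1000))  # True  (75% dirty)
```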

Summary

Because of the speed gap between the CPU and the disk, the buffer pool and its data pages were introduced to speed up database DML operations.

Because buffer pool data pages can be inconsistent with the data on disk, the WAL strategy (whose core is the redo log) was introduced.

Because of the performance problem of flushing dirty pages in the buffer pool, Checkpoint technology was introduced.

To improve execution efficiency, InnoDB does not write every DML operation through to disk. Instead, the durability of transactions is guaranteed by the Write Ahead Log policy, which writes the redo log first.

The buffer pool pages dirtied by transactions are flushed asynchronously, while Checkpoint technology guarantees the availability of free memory pages and of the redo log.
