This article focuses on the underlying principles of InnoDB. Interested readers may wish to take a look: the material is simple, fast, and practical, so let's work through the basics of InnoDB together.
Storage engine
Many articles start directly with which storage engines are available, without introducing the storage engine itself. So what exactly is a storage engine? Have you ever thought about how MySQL stores the data we throw in?
In fact, a storage engine is simpler than it sounds. I think of it as a storage solution: the component that implements inserting data, updating data, indexing, and so on.
What are the existing storage engines for us to choose from?
InnoDB, MyISAM, Memory, CSV, Archive, Blackhole, Merge, Federated, Example
There are many kinds of storage engines, but at present, the only commonly used storage engines are InnoDB and MyISAM, which I will focus on.
InnoDB is currently the most widely used MySQL storage engine, and it has been MySQL's default storage engine since version 5.5. Do you know why InnoDB is so widely used? Let's put that question aside for now and take a look at the underlying principles of the InnoDB storage engine.
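If you want to see which engines your own server offers, MySQL exposes this directly. A minimal check (output varies by MySQL version and build):

```sql
-- List every storage engine this MySQL build supports,
-- including which one is the default.
SHOW ENGINES;

-- Check the current default storage engine (InnoDB on 5.5+).
SELECT @@default_storage_engine;
```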
The memory architecture of InnoDB is mainly divided into three parts: the buffer pool (Buffer Pool), the redo log buffer (Redo Log Buffer), and an extra memory pool.
Buffer pool
InnoDB stores data on disk in order to persist it. However, in the face of a large number of requests, the gap between CPU processing speed and disk I/O speed is far too large, so to improve overall efficiency, InnoDB introduces the buffer pool.
When a query arrives, if the data is not in the buffer pool, InnoDB reads it from disk and places the matching pages into the buffer pool. Likewise, when a request modifies data, MySQL does not write to disk directly; it modifies the data in the pages already in the buffer pool and flushes them back to disk later. This is the role of the buffer pool: faster reads, faster writes, and fewer I/O round trips to the disk.
To put it bluntly, the buffer pool pulls data from disk into memory, and since memory is finite, it will eventually run out of free space. The buffer pool therefore uses an LRU algorithm to evict pages when no free pages remain. A plain LRU, however, brings a problem known as buffer pool pollution.
When you run a batch scan or even a full table scan, you may evict all the hot pages from the buffer pool, which can cause a cliff-like drop in MySQL performance. So InnoDB made some optimizations to LRU to avoid this problem, as sketched below.
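Concretely, InnoDB uses a midpoint-insertion variant of LRU: pages read from disk first enter an "old" sublist toward the tail of the LRU list and are promoted to the hot "young" sublist only if they are accessed again after a minimum residence time, so a one-off full table scan cannot evict the hot pages. The relevant knobs can be inspected and tuned; the SET values below are purely illustrative, not recommendations:

```sql
-- Inspect the midpoint-LRU tuning knobs.
-- innodb_old_blocks_pct: share of the LRU list reserved for the
--   "old" sublist (default 37).
-- innodb_old_blocks_time: ms a page must stay in the old sublist
--   before a second access can promote it (default 1000).
SHOW VARIABLES LIKE 'innodb_old_blocks%';

-- Example: make scans even less likely to pollute the hot sublist.
SET GLOBAL innodb_old_blocks_pct = 20;
SET GLOBAL innodb_old_blocks_time = 2000;
```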
MySQL uses write-ahead logging: before actually writing data pages, it first records the change in a log called the Redo Log, and it periodically uses CheckPoint technology to flush the corresponding dirty pages to disk. More on this later.
In addition to data pages, the buffer pool also holds index pages, Undo pages, the insert buffer, the adaptive hash index, InnoDB lock information, and the data dictionary. Let's pick a few of the more important ones for a brief look.
Insert buffer
The insert buffer comes into play on inserts and updates. Consider the worst case: the data that needs to be updated is not in the buffer pool. At that point there are two options:
Write the data directly to disk.
Wait until the changes reach a certain threshold (for example, 50 records) before batch-writing them to disk.
Obviously, the second option is better, since it reduces interaction with disk I/O; see the sketch below.
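In recent MySQL versions this mechanism is generalized into the change buffer, which also covers deletes and purges on non-unique secondary index pages. Its behavior is configurable; a quick look, with an illustrative (default) setting:

```sql
-- What the change buffer may buffer:
-- all / none / inserts / deletes / changes / purges.
SHOW VARIABLES LIKE 'innodb_change_buffering';

-- Cap the change buffer at 25% of the buffer pool (the default).
SET GLOBAL innodb_change_buffer_max_size = 25;
```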
Doublewrite (write twice)
Since we have just talked about the insert buffer, I have to mention doublewrite, because I think these two InnoDB features complement each other.
The insert buffer improves MySQL's performance, while doublewrite improves data reliability. We know that when data has only been changed in the buffer pool and the machine goes down, the Redo Log can recover the lost write. But what if a crash happens while data is being flushed from the buffer pool back to disk?
This situation is called a partial page write, and replaying the redo log alone cannot fix it, because the page on disk is itself corrupt.
When flushing dirty pages, InnoDB does not write them straight to their final location. It first copies them to the Doublewrite Buffer in memory, then writes that buffer to a shared tablespace on disk (you can think of it as a scratch area on disk), 1 MB at a time. Only after that copy completes does it write the pages in the Doublewrite Buffer to their real locations in the data files.
With this doublewrite mechanism, even if a crash happens while dirty pages are being flushed, the recovering instance can find an intact copy of the page in the shared tablespace and overwrite the corrupted data page with it.
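Doublewrite is on by default, and the server exposes counters that show it working; a quick check:

```sql
-- Confirm the doublewrite buffer is enabled.
SHOW VARIABLES LIKE 'innodb_doublewrite';

-- Pages written through the doublewrite buffer vs. the number of
-- doublewrite operations; the ratio reflects write batching.
SHOW GLOBAL STATUS LIKE 'Innodb_dblwr%';
```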
Adaptive hash index
The adaptive hash index is analogous to the way the JVM dynamically compiles hot code into machine code: InnoDB monitors queries against all indexes and builds a hash index over the hot pages, improving access speed.
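The adaptive hash index is enabled by default and can be toggled; its activity shows up in the engine status output:

```sql
-- Check (or disable) the adaptive hash index.
SHOW VARIABLES LIKE 'innodb_adaptive_hash_index';

-- The "INSERT BUFFER AND ADAPTIVE HASH INDEX" section reports
-- hash searches/s vs. non-hash searches/s.
SHOW ENGINE INNODB STATUS\G
```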
You may have seen the keyword "page" many times by now, so let's talk about what a page actually is.
Page
A page is the smallest unit of data management in InnoDB. When we query data, InnoDB loads it from disk into the buffer pool page by page; by the same token, updates are also applied page by page when our changes are flushed back to disk. The default size of each page is 16 KB, each page contains several rows of data, and the structure of a page is shown in the figure below.
(Figure: InnoDB page structure, omitted.)
Don't worry too much about what each section is for; we just need to know the benefits of this design. Through the previous-page and next-page pointers in its FileHeader, each page links to its neighbors, so the pages form a doubly linked list. This matters because, in actual physical storage, the data is not laid out contiguously; you can think of it as similar to how G1 Regions are distributed in memory.
Within a page, the rows form a singly linked list. The row data we store ends up in User Records. User Records takes up no storage space at first, but as we store more and more data it grows while the Free Space area shrinks, and once Free Space is exhausted, a new data page is allocated.
The data in User Records is sorted by primary key id, and when we search by primary key, we walk along this singly linked list.
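The page size is fixed when the instance's data files are initialized, and on MySQL 5.6+ the buffer pool's current contents can be examined page by page; a quick look (the INNODB_BUFFER_PAGE query can be expensive on large pools):

```sql
-- Default 16384 bytes (16 KB).
SHOW VARIABLES LIKE 'innodb_page_size';

-- What kinds of pages are currently in the buffer pool
-- (index pages, undo pages, and so on).
SELECT page_type, COUNT(*)
FROM information_schema.INNODB_BUFFER_PAGE
GROUP BY page_type;
```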
Redo log buffer
As mentioned above, InnoDB updates page data in the buffer pool before updating the data on disk, and it relies on a write-ahead logging (Write-Ahead Log) strategy to make this safe: within a transaction, redo records are written to the Redo Log Buffer before the corresponding pages in the buffer pool are updated.
The contents of the Redo Log Buffer are written to the redo log file at a certain frequency. The changed pages are marked as dirty pages, and InnoDB flushes dirty pages to disk according to the CheckPoint mechanism.
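Both the buffer size and the flush frequency are configurable; the durability/performance trade-off lives in innodb_flush_log_at_trx_commit:

```sql
-- Size of the in-memory redo log buffer (commonly 16 MB by default).
SHOW VARIABLES LIKE 'innodb_log_buffer_size';

-- When the buffer reaches the redo log file on disk:
--   1 (default): write + fsync on every commit (full durability)
--   2: write on commit, fsync roughly once per second
--   0: write + fsync roughly once per second (fastest, least safe)
SHOW VARIABLES LIKE 'innodb_flush_log_at_trx_commit';
```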
Logs
We mentioned the Redo Log above; this section is devoted to logging in general. Logs can be divided along the following two dimensions:
MySQL level
InnoDB level
MySQL log
MySQL-level logs can be divided into the error log, the binary log, the general query log, and the slow query log.
The error log is easy to understand: it records serious errors that occur while the service is running. When our database fails to start, we can check it for the specific reason.
The binary log has another name you should be familiar with: Binlog. It records all changes made to the database.
The general query log records all statements received from clients.
The slow query log records all SQL statements whose response time exceeds a threshold. We can set this threshold ourselves via the long_query_time parameter, whose default value is 10 seconds. The slow query log is disabled by default and must be enabled manually, for example as shown below.
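A minimal sketch of enabling it at runtime (the 1-second threshold is illustrative; tune it to your workload):

```sql
-- Check current state and threshold.
SHOW VARIABLES LIKE 'slow_query_log';
SHOW VARIABLES LIKE 'long_query_time';

-- Enable the log and lower the threshold to 1 second.
SET GLOBAL slow_query_log = 'ON';
SET GLOBAL long_query_time = 1;
```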
InnoDB log
There are only two kinds of InnoDB logs: the Redo Log and the Undo Log.
The Redo Log records the changes made by transaction operations, i.e. the modified values, and it records them regardless of whether the transaction commits. For example, when updating data, the updated record is first written to the Redo Log, then the page in the buffer pool is updated, and finally, according to the configured flush policy, the in-memory data is flushed back to disk.
The Undo Log records the version of a row as it was before the transaction started, and it can be used to roll back after a transaction fails.
The Redo Log records changes to specific data pages and is only meaningful within the current server, whereas the Binlog lives at the MySQL server layer and can be understood as working across storage engines. This underpins an important role of the Binlog, master-slave replication; its other role is data recovery.
As mentioned above, all changes to the database are recorded in the Binlog, and there are three formats for logging them: Statement, Row, and Mixed.
Statement records every SQL statement that modifies data. Since only the SQL itself is recorded, not every row it affects, log volume stays small and performance is good. However, because the statement alone cannot guarantee correct execution on a Slave node, some additional context information must also be recorded.
Row keeps only the modified rows themselves. Compared with Statement, which records just the SQL executed, Row generates a large volume of logs, but it needs no context information; it only cares about what the data was changed to.
Mixed is a mixture of Statement and Row.
Which format to use must be decided case by case. For example, when an UPDATE statement modifies a lot of data, Statement saves much more space, but Row is correspondingly more reliable. The format can be inspected and changed as shown below.
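A quick way to see and switch the format (changing it requires the appropriate privilege, and SET GLOBAL affects only new sessions; illustrative only):

```sql
-- Which binlog format is the server using?
SHOW VARIABLES LIKE 'binlog_format';

-- Switch to row-based logging for new sessions.
SET GLOBAL binlog_format = 'ROW';

-- List the binary log files currently on disk.
SHOW BINARY LOGS;
```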
The difference between InnoDB and MyISAM
Since MyISAM is not commonly used, I'm not going to delve into its underlying principles and implementation. Let's simply compare the two storage engines here, point by point.
Transactions: InnoDB supports transactions, rollback, transaction safety, and crash recovery; MyISAM supports none of these, but its query speed is faster than InnoDB's.
Primary keys: InnoDB stipulates that if no primary key is set, a 6-byte primary key is generated automatically, while MyISAM allows a table to have no index or primary key at all; its indexes store the addresses of rows.
Foreign keys: InnoDB supports foreign keys; MyISAM does not.
Locks: InnoDB supports both row locks and table locks, while MyISAM supports only table locks.
Full-text indexing: InnoDB did not support full-text indexing before MySQL 5.6 (plug-ins could be used to fill the gap), while MyISAM supports it natively.
Row count: to get the total number of rows, InnoDB has to scan the table, whereas MyISAM stores the current total row count and can read it directly.
So, to sum up: MyISAM is only suitable for scenarios where queries far outnumber updates. If queries account for the vast majority of your workload (a reporting system, for example), MyISAM is a reasonable choice; otherwise, InnoDB is recommended. Note that the engine is chosen per table, as the example below shows.
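Since the engine is a per-table choice, a mixed workload can use both. A minimal sketch with hypothetical table names:

```sql
-- Hypothetical read-heavy reporting table on MyISAM.
CREATE TABLE report_snapshot (
  id     INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
  metric VARCHAR(64),
  value  BIGINT
) ENGINE=MyISAM;

-- Hypothetical transactional table on InnoDB.
CREATE TABLE account (
  id      INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
  balance DECIMAL(12,2)
) ENGINE=InnoDB;

-- Verify which engine a table ended up with.
SHOW TABLE STATUS LIKE 'account';
```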
At this point, I believe you have a deeper understanding of the underlying principles of InnoDB. You might as well try some of this out in practice!