Updated 2025-03-31, from SLTechnology News&Howtos (shulou), Database
Shulou(Shulou.com)06/01 Report--
This article looks at InnoDB's insert buffer (Insert Buffer): the problem it solves, where it is stored, and how it works.
The InnoDB engine has several key features that give it better performance and reliability:
Insert buffer (Insert Buffer)
Doublewrite (Double Write)
Adaptive hash index (Adaptive Hash Index)
Asynchronous IO (Async IO)
Flush neighbor pages (Flush Neighbor Page)
Today's topic is Insert Buffer. InnoDB stores its data in B+ trees, and its indexes are either clustered or nonclustered (secondary).
Inserting data inevitably changes the indexes. Clustered index inserts are usually incremental and ordered, so they need little discussion. Nonclustered index values, on the other hand, can be anything. Because they are scattered, each insert can change a different part of the tree, which degrades insert performance.
To solve this problem with nonclustered index inserts, the InnoDB engine introduced Insert Buffer.
Storage of Insert Buffer
If you look at the figure above, you might think that Insert Buffer is part of the InnoDB buffer pool.
Key point: the InnoDB buffer pool does hold Insert Buffer information, but the Insert Buffer, like data pages, also exists physically on disk, as a B+ tree in the shared tablespace.
The role of Insert Buffer
Let's start with a few points:
A table can have only one clustered (primary key) index, because the table's physical storage is that B+ tree. (Remember that the leaf nodes of the clustered index store the row data, and there is only one copy of the data.)
The leaf nodes of a nonclustered index store the primary key of the clustered index.
Insertion of clustered index
In the InnoDB storage engine, the primary key is the unique row identifier (the clustered index we often talk about). We usually insert data in increasing primary key order, so clustered index inserts are sequential and do not require random reads from disk.
Take this table, for example:
CREATE TABLE test (
  id INT AUTO_INCREMENT,
  name VARCHAR(30),
  PRIMARY KEY (id)
);
Here I created a primary key id, which has the following features:
The id column is auto-incrementing.
When NULL is inserted into the id column, its value is incremented because of AUTO_INCREMENT.
Row records in the data pages are stored in id order.
In general, because the clustered index is ordered, an insert does not need random reads of other pages, so this kind of sequential insertion is very fast.
But if the id column holds values such as UUIDs, inserts into the clustered index are as random as inserts into a nonclustered index. The B+ tree structure then changes constantly, and performance inevitably suffers.
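The difference between sequential and UUID-like keys can be illustrated with a small Python sketch. It is only a toy model, not InnoDB's page logic: a sorted list stands in for the clustered index, and the hypothetical helper `count_non_append_inserts` counts how many inserts land anywhere other than the end of that list.

```python
import bisect
import uuid


def count_non_append_inserts(keys):
    """Insert keys one by one into a sorted list and count how many
    land somewhere other than the end (a stand-in for a random page hit)."""
    ordered = []
    non_append = 0
    for key in keys:
        pos = bisect.bisect_right(ordered, key)
        if pos != len(ordered):
            non_append += 1
        ordered.insert(pos, key)
    return non_append


# AUTO_INCREMENT-style keys: every insert goes to the end.
sequential_keys = list(range(1, 1001))
# UUID-style keys: insert positions are scattered across the key space.
uuid_keys = [uuid.uuid4().hex for _ in range(1000)]

sequential_misses = count_non_append_inserts(sequential_keys)
uuid_misses = count_non_append_inserts(uuid_keys)
print("non-append inserts, sequential keys:", sequential_misses)
print("non-append inserts, UUID keys:", uuid_misses)
```

With incrementing keys the count is zero, while with random UUIDs almost every insert falls in the middle of the existing keys.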
Insertion of nonclustered index
Tables often have several nonclustered indexes as well. For example, suppose I query by the name field, and name is not unique. The table is as follows:
CREATE TABLE test (
  id INT AUTO_INCREMENT,
  name VARCHAR(30),
  PRIMARY KEY (id),
  KEY (name)
);
This test table has the following characteristics:
It has a clustered index on id.
It has a non-unique nonclustered index on name.
When data is inserted, the data pages are stored in primary key (id) order.
Inserts into the secondary index on name are not sequential.
A nonclustered index is also a B + tree, except that the leaf node stores the primary key of the clustered index and the value of name.
Because there is no guarantee that values in the name column arrive in order, inserts into the nonclustered index tree are generally not sequential either.
Of course, if the indexed column holds time-type data, inserts into its nonclustered index are sequential as well.
The arrival of Insert Buffer
As shown above, the scattered nature of nonclustered index inserts degrades insert performance, so the InnoDB engine designed Insert Buffer to improve it.
Let's see how an insert works with Insert Buffer:
For an insert or update on a nonclustered index, InnoDB does not write directly to the index page every time. It first checks whether the target nonclustered index page is in the buffer pool.
If it is, the record is inserted directly. If not, the record is first placed in an Insert Buffer object.
This makes it look as though the record has already been inserted into the leaf node of the nonclustered index, when it is actually stored in another location.
The Insert Buffer is then merged with the secondary index leaf pages at a certain frequency and under certain conditions. Several inserts can usually be merged in a single operation, which greatly improves insert performance for nonclustered indexes.
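The flow described above can be sketched as a toy Python model. This is an illustration under simplifying assumptions, not InnoDB's implementation: the class `InsertBufferSketch`, its fields, and the rule that each merged page costs exactly one random read are all hypothetical.

```python
from collections import defaultdict


class InsertBufferSketch:
    """Toy model of buffered secondary index inserts (not InnoDB code)."""

    def __init__(self, cached_pages):
        self.cached = set(cached_pages)   # index pages already in the buffer pool
        self.pages = defaultdict(list)    # page_id -> records stored on that page
        self.pending = defaultdict(list)  # page_id -> records waiting in the insert buffer
        self.random_reads = 0             # index pages read from disk during merges

    def insert(self, page_id, record):
        if page_id in self.cached:
            # Page is in the buffer pool: insert directly.
            self.pages[page_id].append(record)
        else:
            # Page is not cached: park the record instead of reading the page.
            self.pending[page_id].append(record)

    def merge(self):
        """Apply all pending records, reading each target page only once."""
        merged = 0
        for page_id, records in self.pending.items():
            self.random_reads += 1
            self.cached.add(page_id)
            self.pages[page_id].extend(records)
            merged += len(records)
        self.pending.clear()
        return merged


ibuf = InsertBufferSketch(cached_pages={1})
for page_id, record in [(1, "a"), (7, "b"), (7, "c"), (9, "d"), (7, "e")]:
    ibuf.insert(page_id, record)

merged_records = ibuf.merge()
print(merged_records, "records merged with", ibuf.random_reads, "page reads")
```

In this run four buffered records are applied with two page reads, because the three records destined for page 7 are merged together.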
Requirements for the use of Insert Buffer:
The index is a nonclustered index
The index is not unique
The InnoDB storage engine uses Insert Buffer to improve insert performance only when both of these conditions are met.
Why must both conditions be met?
First, the index must be nonclustered. Clustered index inserts are already sequential, so they gain nothing from buffering.
Second, the index must not be unique, because the database does not check the uniqueness of the inserted record when writing to the Insert Buffer. Checking would require looking up the index page, which means another random read, and Insert Buffer would lose its purpose.
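The two conditions amount to a simple predicate. The sketch below uses a hypothetical `IndexInfo` record and function name, chosen only for illustration:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IndexInfo:
    name: str
    is_clustered: bool
    is_unique: bool


def can_use_insert_buffer(index: IndexInfo) -> bool:
    """Both conditions from the text: nonclustered and not unique."""
    return not index.is_clustered and not index.is_unique


primary = IndexInfo("PRIMARY", is_clustered=True, is_unique=True)
name_key = IndexInfo("name", is_clustered=False, is_unique=False)
unique_key = IndexInfo("uk_email", is_clustered=False, is_unique=True)

for idx in (primary, name_key, unique_key):
    print(idx.name, can_use_insert_buffer(idx))
```

Only the non-unique secondary index on name qualifies.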
Insert Buffer information view
We can use the command SHOW ENGINE INNODB STATUS to view the information about Insert Buffer:
-------------------------------------
INSERT BUFFER AND ADAPTIVE HASH INDEX
-------------------------------------
Ibuf: size 7545, free list len 3790, seg size 11336,
8075308 inserts, 7540969 merged recs, 2246304 merges
...
The command prints a lot of information. Here we only look at the INSERT BUFFER part:
seg size is the current size of the Insert Buffer: 11336 * 16KB
free list len is the length of the free list
size is the number of pages that have been merged
inserts is the number of records inserted
merged recs is the number of inserted records that have been merged
merges is the number of merge operations, that is, the number of pages actually read
The ratio merges : merged recs is about 1:3, which means Insert Buffer reduces the random IO requests for nonclustered index pages by about 2/3.
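The arithmetic behind these figures can be checked with a short Python sketch. The regular expression is written only for the sample output quoted in this article; the status format differs between MySQL versions, so the pattern and the helper name `parse_ibuf` are assumptions, not a general parser. The 16 KB page size follows the text.

```python
import re

# Sample taken from the status output quoted in the article.
STATUS_SAMPLE = (
    "Ibuf: size 7545, free list len 3790, seg size 11336,\n"
    "8075308 inserts, 7540969 merged recs, 2246304 merges"
)

IBUF_PATTERN = re.compile(
    r"Ibuf: size (?P<size>\d+), free list len (?P<free_list_len>\d+), "
    r"seg size (?P<seg_size>\d+),\s*"
    r"(?P<inserts>\d+) inserts,\s*(?P<merged_recs>\d+) merged recs,\s*"
    r"(?P<merges>\d+) merges"
)


def parse_ibuf(text):
    """Extract the Ibuf counters from status text in the sample's format."""
    match = IBUF_PATTERN.search(text)
    if match is None:
        raise ValueError("no Ibuf line in the expected format")
    return {key: int(value) for key, value in match.groupdict().items()}


stats = parse_ibuf(STATUS_SAMPLE)
recs_per_merge = stats["merged_recs"] / stats["merges"]   # records applied per page read
io_saved = 1 - stats["merges"] / stats["merged_recs"]     # fraction of page reads avoided
seg_size_mib = stats["seg_size"] * 16 / 1024              # pages * 16 KB, in MiB

print(f"records per merge: {recs_per_merge:.2f}")
print(f"page reads avoided: {io_saved:.0%}")
print(f"insert buffer segment: {seg_size_mib} MiB")
```

For the sample, each merge applies a little over three records on average, which is where the "about 2/3" saving comes from.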
The problem with Insert Buffer
Insert Buffer has clear benefits, but it also has a problem:
Under write-intensive workloads, the Insert Buffer can take up too much buffer pool memory (innodb_buffer_pool). By default, it can take up to 1/2 of the buffer pool memory. In MySQL 5.6 and later, the limit is set by innodb_change_buffer_max_size, a percentage of the buffer pool that defaults to 25 and can be raised to at most 50.
Taking up too much of the buffer pool inevitably affects other buffer pool operations.
Optimization of Insert Buffer
Before MySQL 5.5 this feature was called Insert Buffer. It was later optimized into Change Buffer, which can be seen as an upgraded version of Insert Buffer.
Insert Buffer only buffers INSERT operations, while Change Buffer buffers INSERT, DELETE, and UPDATE operations, so it can be thought of as a general write buffer. It is divided into:
Insert Buffer
Delete Buffer
Purge Buffer
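In MySQL, which of these operation types are buffered is controlled by the innodb_change_buffering system variable. The Python sketch below tabulates its documented values; the operation labels ("insert", "delete-mark", "purge") and the helper `is_buffered` are names made up for this illustration.

```python
# innodb_change_buffering value -> operation types it buffers.
# The operation labels are illustrative names, not MySQL identifiers.
CHANGE_BUFFERING_MODES = {
    "none": frozenset(),
    "inserts": frozenset({"insert"}),                     # Insert Buffer
    "deletes": frozenset({"delete-mark"}),                # Delete Buffer
    "changes": frozenset({"insert", "delete-mark"}),
    "purges": frozenset({"purge"}),                       # Purge Buffer
    "all": frozenset({"insert", "delete-mark", "purge"}),
}


def is_buffered(mode, operation):
    """Return True if the given mode buffers the given operation type."""
    return operation in CHANGE_BUFFERING_MODES[mode]


for mode in CHANGE_BUFFERING_MODES:
    print(mode, sorted(CHANGE_BUFFERING_MODES[mode]))
```

Setting the variable to inserts corresponds to the behavior of the original Insert Buffer.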
Summary:
What exactly is Insert Buffer?
The data structure of Insert Buffer is a B+ tree.
Prior to MySQL 4.1, there was one Insert Buffer B+ tree per table.
In current versions there is only one global Insert Buffer B+ tree, which buffers inserts for the secondary indexes of all tables.
The B+ tree is stored in the shared tablespace ibdata1.
Buffered records are written to the real nonclustered index, an operation known as Merge Insert Buffer, in the following cases:
When a secondary index page is read into the buffer pool
When the Insert Buffer Bitmap page detects that the secondary index page has no free space left
When the Master Thread runs, which performs a Merge Insert Buffer operation every second or every 10 seconds
In a word, to sum up:
Insert Buffer is used to improve the insert performance of nonclustered index pages. Its data structure is a B+ tree, similar to that of data pages, and it is physically stored in the shared tablespace ibdata1.