What is the principle of innodb index in mysql? 07/09 Update SLTechnology News&Howtos

What is the principle of innodb index in mysql?

2025-07-09 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Database >

Shulou(Shulou.com)06/01 Report--

This article mainly introduces what is the principle of innodb index in mysql, which has certain reference value and can be used for reference by friends who need it. I hope you will learn a lot after reading this article. Next, let the editor take you to learn about it.

Clustered index (clustered index)

The innodb storage engine table is an index organization table, and the data in the table is stored in the primary key order. Its clustered index is to construct a B + tree according to the primary key order of each table, and its leaf nodes store the row record data of the whole table, and these leaf nodes become data pages. (recommended: MySQL tutorial)

The storage of clustered index is not physically continuous, but logically continuous. Leaf nodes are sorted according to the primary key order and connected by two-way linked lists. In most cases, the query optimizer tends to use the clustered index, because the clustered index can find the data directly at the leaf node, and because it defines the logical order of the data, it can quickly access the query for range values.

This feature of the clustered index determines that the data in the index organization table is also part of the index. Because the data in the table can only be sorted by one B + tree, a table can have only one clustered index.

In Innodb, a clustered index defaults to the primary key index. If there is no primary key, follow the following rules to build a clustered index:

When there is no primary key, a non-empty and unique index column is used as the primary key to become the clustered index of the table; if there is no such index, InnoDB implicitly defines a primary key as the clustered index.

Because the primary key uses a clustered index, if the primary key is self-incrementing id, then the corresponding data will also be stored on disk adjacent to each other, resulting in higher write performance. If it is in the form of a string such as uuid, frequent insertions will cause innodb to move disk blocks frequently, resulting in poor write performance.

B + tree (multi-path balanced search tree)

We know that the innodb engine index uses a B + tree structure, so why not other types of tree structures, such as binary trees?

When the computer stores data, it has the smallest storage unit, which is like the smallest unit of RMB circulation. The smallest unit of a file system is a block, and the size of a block is 4k (this value varies according to the system and can be set). The InnoDB storage engine also has its own minimum storage unit-Page, and the size of a page is 16K (this value is also settable).

The size of a file in a file system is only 1 byte, but it has to take up 4KB space on disk. Similarly, the size of all data files for innodb is always an integral multiple of 16384 (16k).

So in MySQL, a block node that stores the index occupies 16k.MySQL will load 16K at a time using the pre-read ability of the system for each IO operation. In this way, it is very wasteful to put only one index value in this node, because IO can only get one index value at a time, so you can't use a binary tree.

B + tree is a multi-path search tree, a node can put n values, n = 16K / the size of each index value.

For example, the index field size 1Kb, when each node can put 16 index values, in this case, the binary tree can only load one index value at a time, while the B + tree can load 16 index values.

The number of paths of a B+ tree is the number of values that exist in each node. For example, if 16 values are stored in each node, then the tree is 17 paths.

It can also be seen here that the B+ tree node can store multiple values, so the B+ tree index cannot find a specific row with a given key value. The B+ tree can only find the specific page where the data row is stored, then read the page into memory, and then find the specified data in memory.

PS: the difference between B-tree and B + tree is that the non-leaf nodes of B + tree only contain navigation information and do not contain actual values. All leaf nodes and connected nodes are connected by linked lists, which is convenient for interval search and traversal.

Auxiliary index

Also known as a nonclustered index, its leaf node does not contain all the data recorded in the row. In addition to the key value, the index row in each leaf node also contains a bookmark, which is the clustered index key of the corresponding row.

The following figure shows the relationship between the secondary index and the clustered index (the picture is from the network, just see the general meaning):

When looking for data through a secondary index, the innodb storage engine obtains the primary key that only wants the primary key index through the secondary index leaf node, and then finds the complete row record through the primary key index.

For example, if you look for data in a secondary index tree with a height of 3, you need to IO the secondary index tree three times to find the specified primary key. If the height of the clustered index tree is also 3, then you need to search the clustered index tree three times to find the page where a complete row of data is located, so a total of 6 IO visits are needed to get the final data page.

Indexes created, such as federated indexes, unique indexes, and so on, are non-clustered indexes.

Joint index

A federated index refers to indexing multiple columns on a table. The federated index is also a B + tree, except that the number of keys of the federated index is not 1, but greater than or equal to 2.

For example, there is a user table with a field of id,age,name. It is found that the following two sql are most frequently used:

Select * from user where age =? Select * from user where age =? And name =?

At this point, there is no need to create two separate indexes for age and name, just a joint index as follows:

Create index idx_age_name on user (age, name)

Another benefit of federated indexes is that the second key value has been sorted, sometimes avoiding one more sort operation.

Overlay index

Override the index, that is, you can get all the field values needed for the query from the secondary index without querying the records in the clustered index. The advantage of overriding an index is that the secondary index does not contain all the information for the entire row of records, so its size is much smaller than the clustered index, thus reducing a large number of IO operations.

For example, there is a federated index (age,name) above, if the following:

Select age,name from user where age=?

You can use the override index.

Another benefit of overriding an index is for statistical problems, such as:

Select count (*) from user

The innodb storage engine does not choose to query the clustered index for statistics. Because there is a secondary index on the user table, and the secondary index is much smaller than the clustered index, selecting the secondary index can reduce IO operations.

Note that the index is only appropriate, not redundant, because the B+ tree has to be adjusted whenever data is added or deleted. If multiple indexes are built, multiple Btrees will be adjusted, and the more trees and the larger the structure, the more time-consuming and resource-consuming this adjustment will be. If these unnecessary indexes are reduced, disk utilization may be greatly reduced. The data length of an index column can be as small as possible.

The smaller the length of the index data, the more indexes are stored in each block and the more values are fetched at a time by IO.

Matching column prefixes can be used for indexes like 9999% like% 9999, like% 9999 do not use indexes; in and or can use indexes in Where conditions, while not in and operations cannot use indexes

If it's not in or, facing the Btree, the engine has no idea which node to start with.

For matching range values, order by can also use indexes; use more specified column queries to return only the data columns you think of, and use less select *

There is no need to query useless fields and do not use * it may also hit the overwrite index

An index cannot be used in a federated index if it does not start searching according to the leftmost column of the index.

Leftmost matching principle

An index can be used to exactly match the leftmost front column and range match in a federated index. If there is a range query of a column in the joint index, then all the columns on the right cannot be used. Thank you for reading this article carefully. I hope the editor will share the principle of innodb index in mysql what is helpful to everyone. At the same time, I also hope that you will support us, pay attention to the industry information channel, and find a detailed solution waiting for you to learn.

Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.

*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.