How to understand the underlying data structure of MySQL index

2025-07-06 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)06/02 Report--

This article introduces the underlying data structures of MySQL indexes: why MySQL uses a B+ tree rather than a B-tree, how clustered and non-clustered indexes differ between InnoDB and MyISAM, and how hash indexes and the adaptive hash index work.

I. Index type

1. B+ tree

Why a B+ tree instead of a B-tree?

First, look at the structural difference between a B-tree and a B+ tree.

B-tree structure:

B+ tree:

You can see:

In a B-tree, every node carries satellite data (a row of the data table), while in a B+ tree only the leaf nodes carry satellite data. For the same page size, an internal node of a B+ tree can therefore hold more keys, so the tree is shallower and a lookup needs fewer disk I/Os. It also means that the lookup cost of a B+ tree is more stable, because every lookup descends to a leaf. A B-tree lookup can stop at any level, and in the best case (the key is in the root) it takes O(1).
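To make the fan-out argument concrete, here is a rough back-of-the-envelope sketch. The 16 KB page size, 8-byte key, 6-byte child pointer, and 1 KB row size are illustrative assumptions and not figures from this article. The two helper functions are hypothetical and model idealized, completely full trees.

```python
import math

# All sizes below are assumptions chosen for illustration.
PAGE_SIZE = 16 * 1024   # bytes per page
KEY_SIZE = 8            # bytes per key (for example a BIGINT)
POINTER_SIZE = 6        # bytes per child pointer
ROW_SIZE = 1024         # bytes per full row (the satellite data)

# B+ tree: internal nodes hold only keys and child pointers.
BPLUS_FANOUT = PAGE_SIZE // (KEY_SIZE + POINTER_SIZE)
ROWS_PER_LEAF = PAGE_SIZE // ROW_SIZE

# B-tree: every node entry also carries the row itself.
BTREE_ENTRIES = PAGE_SIZE // (KEY_SIZE + POINTER_SIZE + ROW_SIZE)


def bplus_levels(total_rows):
    """Levels of a completely full B+ tree that stores total_rows rows."""
    nodes = math.ceil(total_rows / ROWS_PER_LEAF)  # number of leaf pages
    levels = 1
    while nodes > 1:
        nodes = math.ceil(nodes / BPLUS_FANOUT)    # pages one level up
        levels += 1
    return levels


def btree_levels(total_rows):
    """Levels of a completely full B-tree that stores total_rows rows."""
    levels = 1
    # A full tree with h levels and n entries per node holds (n + 1)^h - 1 rows.
    while (BTREE_ENTRIES + 1) ** levels - 1 < total_rows:
        levels += 1
    return levels


rows = 20_000_000
print("B+ tree levels:", bplus_levels(rows))
print("B-tree levels:", btree_levels(rows))
```

Under these assumptions the B+ tree needs noticeably fewer levels than the B-tree for the same number of rows, and each level that has to be read from disk costs one I/O.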

Each key in a B-tree appears only once, whereas in a B+ tree every key appears in a leaf node, and the keys in internal nodes are copies used only for routing. All the leaf nodes of a B+ tree form an ascending linked list, which suits range queries. A B-tree has no such list, so a range query has to move up and down the tree.
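The range-search advantage can be sketched as follows. This is a minimal two-level model, not MySQL's implementation: the `Leaf` class, `build_leaves`, and `range_scan` are hypothetical names, and a single list of separator keys stands in for the internal nodes.

```python
from bisect import bisect_right


class Leaf:
    """A leaf page: sorted keys, their rows, and a link to the next leaf."""

    def __init__(self, keys, rows):
        self.keys = keys
        self.rows = rows
        self.next = None


def build_leaves(items, per_leaf=3):
    """Build linked leaves from a non-empty, key-sorted list of (key, row)."""
    leaves = []
    for i in range(0, len(items), per_leaf):
        chunk = items[i:i + per_leaf]
        leaves.append(Leaf([k for k, _ in chunk], [r for _, r in chunk]))
    for left, right in zip(leaves, leaves[1:]):
        left.next = right
    # Separator keys play the role of the internal (routing) node.
    separators = [leaf.keys[0] for leaf in leaves[1:]]
    return separators, leaves


def range_scan(separators, leaves, low, high):
    """Return all (key, row) pairs with low <= key <= high."""
    # One descent finds the leaf that may contain `low`.
    leaf = leaves[bisect_right(separators, low)]
    result = []
    # After that, only the leaf-level linked list is followed.
    while leaf is not None:
        for key, row in zip(leaf.keys, leaf.rows):
            if key > high:
                return result
            if key >= low:
                result.append((key, row))
        leaf = leaf.next
    return result


items = [(k, f"row{k}") for k in range(1, 11)]
separators, leaves = build_leaves(items)
print(range_scan(separators, leaves, 5, 8))
```

The scan descends the tree once and then walks sideways through the leaves, which is why interval queries are cheap on a B+ tree.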

2. What is the difference between the B-tree index implementations of MyISAM and InnoDB (clustered and non-clustered indexes)?

First of all, you need to understand clustered and non-clustered indexes.

Clustered index:

In a clustered index, the leaf pages contain all the data of the row, and the internal node pages contain only the index columns. InnoDB clusters data by the primary key. If there is no primary key, it uses a unique index whose columns are all NOT NULL instead; if there is no such index either, InnoDB implicitly defines a hidden primary key and uses it as the clustered index.

Data distribution of clustered indexes:

In a table organized by a clustered index, there are secondary indexes in addition to the primary key index. A leaf node of a secondary index stores not a row pointer but a primary key value, which acts as a "pointer" to the row. To find a row through a secondary index, the storage engine first locates the leaf node of the secondary index to get the primary key value, and then uses that value to look up the row in the clustered index. This second lookup is known as "going back to the table". A covering index avoids going back to the table, and InnoDB's adaptive hash index can reduce this kind of repeated work.

Note: each leaf node in the clustered index contains not only the complete row, but also the transaction ID and the rollback pointer, which are used for transactions and MVCC.
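The two-step lookup through a secondary index can be sketched as follows. This is a simplified model in which plain dictionaries stand in for the B+ trees. The table contents, the `find_by_name` helper, and the assumption that names are unique are all illustrative.

```python
# Clustered index: primary key -> complete row (stand-in for the B+ tree).
clustered_index = {
    1: {"id": 1, "name": "alice", "age": 30},
    2: {"id": 2, "name": "bob", "age": 25},
    3: {"id": 3, "name": "carol", "age": 41},
}

# Secondary index on `name`: its leaves hold the primary key, not a row pointer.
# Names are assumed unique here to keep the sketch short.
secondary_index_name = {"alice": 1, "bob": 2, "carol": 3}


def find_by_name(name, columns):
    """Return (row, clustered_lookups) for a query filtered by name."""
    pk = secondary_index_name.get(name)          # step 1: secondary index
    if pk is None:
        return None, 0
    index_entry = {"name": name, "id": pk}
    if set(columns) <= set(index_entry):
        # Covering index: every requested column is already in the index entry.
        return {c: index_entry[c] for c in columns}, 0
    row = clustered_index[pk]                    # step 2: back to the table
    return {c: row[c] for c in columns}, 1


print(find_by_name("bob", ["age"]))          # needs the clustered index
print(find_by_name("bob", ["id", "name"]))   # answered from the secondary index
```

The second query never touches the clustered index, because the secondary index entry already contains the indexed column and the primary key.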

3. Non-clustered index

With non-clustered indexes (as in MyISAM), the primary key index is structurally no different from a secondary index: the leaf nodes of both store "row pointers" that give the physical address of the row data.
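A minimal sketch of this layout, assuming the rows live in a separate data file and a list position stands in for the physical address. The names `data_file`, `primary_index`, `secondary_index`, and `fetch` are hypothetical.

```python
# Rows are stored in insertion order, separately from every index.
data_file = [
    {"id": 7, "name": "carol"},
    {"id": 2, "name": "alice"},
    {"id": 5, "name": "bob"},
]

# Both indexes have the same shape: key -> position of the row in data_file.
primary_index = {row["id"]: pos for pos, row in enumerate(data_file)}
secondary_index = {row["name"]: pos for pos, row in enumerate(data_file)}


def fetch(index, key):
    """Follow the row pointer stored in the index, if the key exists."""
    pos = index.get(key)
    return None if pos is None else data_file[pos]


print(fetch(primary_index, 5))
print(fetch(secondary_index, "bob"))
```

Both lookups reach the row in a single hop, and neither index determines where the row is physically stored.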

Primary key index and secondary index of clustered index:

Primary key index and secondary index of non-clustered index:

4. Advantages and disadvantages of clustered indexes

Advantages:

Related data is stored together (for example, clustering by user ID keeps all of a user's messages close to each other); otherwise each row read may cost a separate disk I/O.

Data access is faster. The index and the data live in the same B+ tree, so fetching a row through a clustered index is usually faster than the comparable lookup in a non-clustered index.

Queries that use a covering index can directly use the primary key values stored in the leaf nodes.

Disadvantages:

Clustering helps most for I/O-bound workloads. If all the data fits in memory, the order in which it is accessed no longer matters much, and a clustered index brings little advantage.

Insert speed depends on the insertion order. Inserting in random key order causes page splits, which leave holes in the pages; after loading a lot of data in this way, OPTIMIZE TABLE can be used to rebuild the table.

Every insert, update, and delete has to maintain the index, which can be expensive. Updating the clustered index columns is especially costly, because the row has to be moved to a new position.

Secondary indexes may be larger than expected, because their leaf nodes contain the primary key columns of the rows they reference.
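The effect of insertion order on page splits can be sketched with a toy page model. This is an illustration under simplifying assumptions, not InnoDB's actual algorithm: the `insert_all` and `fill_factor` helpers and the page capacity of 10 keys are hypothetical, a full page is split in half, and an append at the right edge simply starts a new page.

```python
import random
from bisect import bisect_right, insort


def insert_all(keys, capacity):
    """Insert keys into ordered pages of at most `capacity` keys each."""
    pages = [[]]
    for key in keys:
        # Route the key using the first key of every page after the first.
        first_keys = [page[0] for page in pages[1:]]
        i = bisect_right(first_keys, key)
        page = pages[i]
        if len(page) == capacity:
            if i == len(pages) - 1 and key > page[-1]:
                # Append at the right edge: open a new page, old page stays full.
                pages.append([key])
                continue
            # Page split: half of the keys move to a new page.
            mid = capacity // 2
            left, right = page[:mid], page[mid:]
            pages[i:i + 1] = [left, right]
            page = left if key < right[0] else right
        insort(page, key)
    return pages


def fill_factor(pages, capacity):
    """Fraction of the allocated page space that actually holds keys."""
    return sum(len(page) for page in pages) / (len(pages) * capacity)


keys = list(range(100))
shuffled = keys[:]
random.Random(0).shuffle(shuffled)

sequential_pages = insert_all(keys, capacity=10)
random_pages = insert_all(shuffled, capacity=10)
print(len(sequential_pages), fill_factor(sequential_pages, 10))
print(len(random_pages), fill_factor(random_pages, 10))
```

In this model, ascending inserts leave every page full, while out-of-order inserts leave half-empty pages behind after each split. Those half-empty pages are the "holes" mentioned in the list of disadvantages.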

5. Hash indexing

A hash index is built on a hash table. It is only useful for lookups that exactly match every column of the index, which means hash indexes suit equality queries.

Specific implementation: for each row, the storage engine computes a hash code over all the indexed columns. The hash index stores these hash codes, and the hash table stores a pointer to each row alongside its hash code.

In MySQL, only the Memory engine explicitly supports hash indexes, and of course the Memory engine also supports B-tree indexes.

Note: the Memory engine supports non-unique hash indexes, and conflicts are resolved by storing multiple record pointers with the same hash value in the form of a linked list.
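The mechanism described in this section can be sketched as follows. This is a minimal illustration and not the Memory engine's code: the `HashIndex` class, the use of `zlib.crc32` as the hash function, the bucket count, and the sample rows are all assumptions. Python lists stand in for the linked lists that chain colliding entries.

```python
import zlib


class HashIndex:
    """Toy hash index over one or more columns, with chained buckets."""

    def __init__(self, columns, bucket_count=8):
        self.columns = list(columns)
        self.buckets = [[] for _ in range(bucket_count)]

    def _hash(self, values):
        # One hash code is computed over all indexed columns together.
        code = zlib.crc32(repr(tuple(values)).encode("utf-8"))
        return code, code % len(self.buckets)

    def add(self, row_pointer, row):
        code, slot = self._hash([row[c] for c in self.columns])
        # Rows with the same hash code are chained in the same bucket.
        self.buckets[slot].append((code, row_pointer))

    def lookup(self, table, **equals):
        if set(equals) != set(self.columns):
            raise ValueError("an equality value is needed for every indexed column")
        code, slot = self._hash([equals[c] for c in self.columns])
        # Re-check the real values: different keys can share a hash code.
        return [
            ptr
            for stored_code, ptr in self.buckets[slot]
            if stored_code == code
            and all(table[ptr][c] == equals[c] for c in self.columns)
        ]


table = [
    {"fname": "Arjen", "lname": "Lentz"},
    {"fname": "Peter", "lname": "Zaitsev"},
    {"fname": "Peter", "lname": "Zaitsev"},   # non-unique key
]
index = HashIndex(["fname", "lname"])
for pointer, row in enumerate(table):
    index.add(pointer, row)

print(index.lookup(table, fname="Peter", lname="Zaitsev"))
```

A lookup that supplies only `fname` is rejected, because the hash code cannot be computed from part of the key. This is why a hash index cannot serve prefix or range queries.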

6. Adaptive hash indexing

When InnoDB notices that some index values are being accessed very frequently, it builds a hash index in memory on top of the B+ tree index, so that the B+ tree index gains some of the advantages of a hash index, such as fast hash lookups.
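The idea can be sketched as follows. This is a conceptual model only: the `AdaptiveIndex` class and its fixed access-count threshold are hypothetical, a sorted list searched with `bisect` stands in for the B+ tree, and InnoDB's real heuristics are more involved.

```python
from bisect import bisect_left


class AdaptiveIndex:
    """Sorted index that promotes frequently read keys into a hash table."""

    def __init__(self, items, threshold=3):
        items = sorted(items)
        self.keys = [k for k, _ in items]    # stand-in for the B+ tree
        self.rows = [r for _, r in items]
        self.threshold = threshold           # assumed promotion threshold
        self.hits = {}                       # key -> number of tree lookups
        self.hash_index = {}                 # hot keys only

    def get(self, key):
        """Return (row, path), where path is 'hash' or 'tree'."""
        if key in self.hash_index:
            return self.hash_index[key], "hash"
        i = bisect_left(self.keys, key)      # ordinary ordered search
        if i == len(self.keys) or self.keys[i] != key:
            return None, "tree"
        self.hits[key] = self.hits.get(key, 0) + 1
        if self.hits[key] >= self.threshold:
            # The key is hot: later lookups skip the ordered search.
            self.hash_index[key] = self.rows[i]
        return self.rows[i], "tree"


index = AdaptiveIndex([(1, "a"), (2, "b"), (3, "c")], threshold=3)
for _ in range(4):
    print(index.get(2))
```

The first lookups of a key go through the ordered structure. Once the key has been read often enough, it is served from the hash table, while rarely read keys never occupy hash memory.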

