What is the implementation principle of MySQL index? 07/15 Update SLTechnology News&Howtos

What is the implementation principle of MySQL index?

2025-07-15 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Internet Technology >

Shulou(Shulou.com)06/02 Report--

What is the principle of MySQL index implementation, many novices are not very clear about this, in order to help you solve this problem, the following editor will explain in detail for you, people with this need can come to learn, I hope you can gain something.

MyISAM index implementation

The MyISAM engine uses B+Tree as the index structure, and the data domain of the leaf node stores the address of the data record. The following is a schematic diagram of the MyISAM index:

Here, the table has a total of three columns. Assuming that we have Col1 as the primary key, figure 8 shows the Primary key of a MyISAM table. You can see that MyISAM's index file only holds the address of the data record.

Auxiliary index

In MyISAM, there is no structural difference between the primary index and the secondary index (Secondary key), except that the primary index requires that the key is unique, while the key of the secondary index can be repeated. If we build a secondary index on Col2, the structure of the index is shown in the following figure

It is also the address where the data records are kept in the Bamboo Tree data domain. Therefore, the index retrieval algorithm in MyISAM is to first search the index according to the B+Tree search algorithm, if the specified Key exists, take out the value of its data domain, and then take the value of the data domain as the address to read the corresponding data record.

The index method of MyISAM is also called "nonclustered index". The reason for this is to distinguish it from InnoDB's clustered index.

InnoDB index implementation

Although InnoDB also uses B+Tree as the index structure, the implementation is quite different from that of MyISAM.

1. The first major difference is that InnoDB's data file itself is an index file. As you know from the above, the MyISAM index file and the data file are separate, and the index file only holds the address of the data record.

In InnoDB, the table data file itself is an index structure organized by B+Tree, and the data domain of the leaf point of this tree keeps the complete data record. The key of this index is the primary key of the data table, so the InnoDB table data file itself is the primary index.

The figure above is a schematic diagram of the InnoDB main index (which is also a data file), and you can see that the leaf node contains the complete data record. This kind of index is called clustered index. Because the InnoDB data file itself has to be aggregated by the primary key.

1. MyISAM requires that the table must have a primary key (InnoDB may not have it). If it is not explicitly specified, the MySQL system will automatically select a column that can uniquely identify the data record as the primary key. If such a column does not exist, MySQL automatically generates an implied field for the InnoDB table as the primary key, and the type is long shaping.

At the same time, try to use self-increasing fields as the primary key of the table on InnoDB. Because the InnoDB data file itself is a B+Tree, the non-monotonous primary key will cause frequent splitting and adjustment of the data file in order to maintain the characteristics of B+Tree when inserting new records, which is very inefficient, and using self-increasing fields as primary keys is a good choice. If the table uses a self-incrementing primary key, each time a new record is inserted, the record is sequentially added to the subsequent position of the current index node, and when a page is full, a new page is automatically opened. As shown in the following figure:

In this way, a compact index structure is formed, which is filled in approximate order. Because there is no need to move existing data each time you insert, it is very efficient and does not add much overhead to maintaining the index.

two。 The second difference from the MyISAM index is that InnoDB's secondary index data domain stores the value of the corresponding record primary key instead of the address. In other words, all secondary indexes of InnoDB refer to the primary key as the data domain.

For example, figure 11 is a secondary index defined on Col3:

The implementation of clustered index makes the search by primary key very efficient, but the secondary index search needs to retrieve the index twice: first retrieve the secondary index to obtain the primary key, and then use the primary key to retrieve the record in the primary index.

Extension: why not recommend using overly long fields as primary keys?

Because all secondary indexes refer to the primary index, an overly long primary index can make the secondary index too large.

Clustered index and unclustered index

InnoDB uses a clustered index to organize the primary key into a B+ tree, and the row data is stored on the leaf node. If you use the condition such as "where id = 14" to find the primary key, you can find the corresponding leaf node according to the B+ tree retrieval algorithm, and then get the row data. If you do a conditional search on the Name column, you need two steps:

The first step is to retrieve the Name in the secondary index B+ tree and get the corresponding primary key from its leaf node.

The second step is to use the primary key to perform another B+ tree retrieval operation in the primary index B+ tree species, and finally reach the leaf node to get the whole row of data.

MyISM uses a non-clustered index, and the two B + trees of the non-clustered index look no different, nodes

The structure is exactly the same, but the content stored is different. the nodes of the primary key index B + tree store the primary key and the secondary key index B + tree stores the secondary key. The table data is stored in a separate place, and the leaf nodes of both B+ trees use an address to point to the real table data, and there is no difference between the two keys for table data. Because the index tree is independent, the index tree that does not need to access the primary key is retrieved by the secondary key.

To better illustrate the difference between the two indexes, let's imagine that a table stores four rows of data as shown in the following figure. Id is the primary index and Name is the secondary index. The figure clearly shows the difference between clustered index and non-clustered index.

Joint Index and leftmost principle

Joint index storage data structure diagram:

Leftmost principle:

For example, a federated index has three index fields (Ameme BMague C)

Query criteria:

(ABI,)-can use index

(Amena B,)-can use the index

(AMagna BPRC)-can use the index

(, BQuery C)-will not use indexes

(, C)-will not use the index

Is it helpful for you to read the above content? If you want to know more about the relevant knowledge or read more related articles, please follow the industry information channel, thank you for your support.

Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.

*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.