Network Security Internet Technology Development Database Servers Mobile Phone Android Software Apple Software Computer Software News IT Information

In addition to Weibo, there is also WeChat

Please pay attention

WeChat public account

Shulou

Demonstrate the indexing principle of Mysql with pictures and texts

2025-03-29 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Database >

Share

Shulou(Shulou.com)06/01 Report--

This article is mainly about demonstrating the indexing principle of Mysql with pictures and text. If you are interested, let's take a look at this article. I believe it is of some reference value for everyone to demonstrate the indexing principle of Mysql with pictures and texts.

First, the common index in the data structure [students who know this data structure are advised to skip this section] 1. Binary tree

Speaking of binary trees, we all know that each node can only have at most two child nodes, for example:

It can be found that the binary tree is very regular, the left child node is smaller than the current node, and the right child node is larger than the current node. So isn't it convenient to inquire? The nature of the binary tree determines that its time complexity is Olog (n). Of course, the time complexity of the binary tree has the same time complexity as its insertion order. If the data is inserted in ascending or descending order, then the height h of the binary tree is equal to the number of nodes, and the complexity is increased to O (n).

If the database uses a binary tree for indexing, and 1000 pieces of data need to be inserted, let's calculate the height of the tree. (a binary tree with a depth of k has at least k nodes and a maximum of 2 ^ k-1 nodes)

2 ^ 10-1 ≈ 1000 the height of the tree is about 10 in the worst case, and the height of the tree is 1000.

The height of the tree determines the efficiency of the query, and the binary tree will have a height gap of 10-1000, so it is obviously no longer suitable for our index!

two。 Balanced tree

The problem has been put out before. the height of the binary tree is very unstable, so can we stabilize the height? This is the balanced tree, which dynamically adjusts the height of the binary tree according to the insertion (the difference between the height of the left and right subtrees is at most 1). For example, we insert from 10, 9, 8, and 1

Look, I am not lying to you, it will adjust the height of the tree according to the insertion situation, how to adjust, I will only briefly explain it, after all, it is not the focus of this article.

The adjustment of the balance tree can be divided into four situations:

LL 、 LR 、 RL 、 RR

This situation is easy to understand.

In this case, the 5 and 6 nodes are rotated first to the left, then to the LL type, and then to the right.

Well, let's not talk about the other two, which is just the opposite of the rotation of these two.

Ahem, back to the point, now that the balanced tree is enough to ensure the balance of the tree, is there a problem with using it for indexing?

If we have 100000 records, then according to the nature of the binary tree, the minimum height of the tree is about 2 ^ 16, that is, it takes 16 times to find an element, and some students may say, well, I can accept 16 queries. What if we insert or delete data? The biggest disadvantage of AVL tree is that when inserting nodes, the tree needs to rotate and adjust nodes frequently, so balanced tree is not suitable for indexing.

3. Red and black tree

I mentioned the requirements of the balanced tree for the binary tree. The height of the left and right subtrees is at most 1. A little carelessness in inserting data will result in the conversion operation of the balanced tree, which is very time-consuming.

The emergence of the red-black tree is to avoid the frequent transformation of nodes in the balanced tree. The red-black tree does not pursue complete balance, it only requires some nodes to achieve balance, which reduces the requirement of rotation, thus improving the performance. Let's take a look at the definition of red-black tree first.

* each node is either red or black. * the root node is black. * every leaf node (that is, the NIL pointer or NULL node at the end of the tree) is black. * if a node is red, then its two sons are black. * for any node, each path to the NIL pointer at the end of the leaf node tree contains the same number of black nodes.

When inserting or deleting elements, it is also necessary to maintain the red-black tree structure, and the reason why the index does not use the red-black tree is mainly when the binary tree saves a large number of nodes, which will lead to the continuous increase of the height of the tree. For example, with 100 million nodes, the height of the tree will reach about 27 layers, and the general index is saved to disk. If one IO is required for each query, that is, 27 IO operations are required, and each IO operation is very time-consuming.

4.B tree

Balanced trees and red-black trees are binary trees. When binary trees store a large number of elements, it will lead to a continuous increase in the height of the tree. Can multi-tree be used?

Let's first take a look at the definition of B-tree:

1. The composition keyword of the B-tree (which can be understood as data) points to the pointer to the child node.

2. The properties of B-tree: * if the root node is not a terminal node, then there are at least 2 sub-trees * all non-leaf nodes except the root node have at least 2 M sub-trees and at most M sub-trees (the number of keywords is minus one) * all leaf nodes are on the same layer 5.B + tree.

The main difference between B + tree and B tree is:

* the number of subtrees and keywords of the node is the same (B tree is one less than the number of subtrees) * the keyword of the node represents the maximum number of subtrees and also contains this data in the subtree * the leaf node contains all the data. At the same time, it accords with the order of left small and right big.

Compared with B-tree, B + tree has the advantages of lower level, linked list of leaf nodes and convenient range query.

B tree and B + tree in Mysql. Principle of disk reading

Index files are generally stored on the disk in the form of files, and the index retrieval operation requires the IO of the disk, but the efficiency of disk sequential reading is very high, so when reading, the disk is often not read on demand, and will be pre-read every time, even if only one byte is needed, the disk will start from this location and sequentially read a certain length of data back into memory. Because of the high efficiency of sequential disk reading, pre-reading can improve the efficiency of IO for programs with locality. The length of pre-reading is generally an integral multiple of the page (a page is a logical block of computer-managed memory, and the operating system often divides the main memory and disk storage into continuous blocks of equal size, each block is called a page. The size is usually 4K) main memory and disk exchange data on a page-by-page basis. When the data to be read by the program is not in main memory, a page fault exception will be triggered. At this time, the system will send a read signal to the disk. The disk will find the starting position of the data and continuously read one or more pages back into memory. Then the exception returns, and the program continues to run.

B+ Tree in 2.Innodb

The B + tree is used as the index in Innodb, such as the primary index in the following figure:

The leaf node contains all the nodes. Except for the leaf node, the other nodes do not contain values, while the leaf nodes contain specific values.

Secondary index

The secondary index in Innodb is also a B + tree, but its leaf node points to the primary key value in the primary index, and then looks for the specific value in the primary index.

B+ Tree in 3.myISAM

When the MyISAM engine uses the B+ tree as the index, the structure is roughly the same as that of Innodb, except that the leaf node stores not the specific record value, but the address of the record.

Secondary index

In the first-level index, the MyISAM index file only stores the address of the data record, while the secondary index has no difference in structure, and the secondary index also points directly to the address of the record.

The above details about using pictures and text to demonstrate the indexing principle of Mysql, are they helpful to you? If you want to know more about it, you can continue to follow our industry information section.

Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.

Views: 0

*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.

Share To

Database

Wechat

© 2024 shulou.com SLNews company. All rights reserved.

12
Report