The Difference Between MySQL B+ Trees and B-Trees

2025-07-06 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)06/02 Report--

This article explains the characteristics of B-trees and B+ trees, how the two differ, and why MySQL uses them for its indexes.

A B-tree is a multiway, self-balancing search tree. It resembles an ordinary binary search tree, but a B-tree allows each node to have more than two child nodes.

The characteristics of a B-tree:

(1) The keys are distributed across the whole tree, in both internal and leaf nodes.

(2) Every key appears in exactly one node.

(3) A search may end at a non-leaf node.

(4) Searching the full set of keys performs comparably to binary search.
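Points (1) to (3) can be illustrated with a minimal sketch. The `BTreeNode` class, the `btree_search` function, and the sample keys and values below are hypothetical names made up for illustration; they are not MySQL's implementation. Each node stores its values next to its keys, so a lookup can stop as soon as the key is found, even in the root.

```python
from bisect import bisect_left
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class BTreeNode:
    # Hypothetical in-memory node: values live next to their keys in every node.
    keys: List[int]
    values: List[str]
    children: List["BTreeNode"] = field(default_factory=list)  # empty for a leaf


def btree_search(node: BTreeNode, key: int) -> Optional[Tuple[str, int]]:
    """Return (value, depth at which the key was found), or None if absent."""
    depth = 0
    while True:
        i = bisect_left(node.keys, key)       # binary search inside the node
        if i < len(node.keys) and node.keys[i] == key:
            return node.values[i], depth      # may stop at a non-leaf node
        if not node.children:                 # reached a leaf without a match
            return None
        node = node.children[i]               # descend into the i-th subtree
        depth += 1


# A tiny two-level tree; every key appears in exactly one node.
root = BTreeNode(
    keys=[20, 40],
    values=["v20", "v40"],
    children=[
        BTreeNode([5, 10], ["v5", "v10"]),
        BTreeNode([25, 30], ["v25", "v30"]),
        BTreeNode([50, 60], ["v50", "v60"]),
    ],
)

print(btree_search(root, 40))  # found in the root, depth 0
print(btree_search(root, 25))  # found in a leaf, depth 1
print(btree_search(root, 7))   # not present
```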

A B+ tree is a variant of the B-tree, and it is also a multiway balanced search tree.

As the figure also shows, a B+ tree differs from a B-tree in two ways:

(1) All keys are stored in the leaf nodes; non-leaf nodes serve only as an index and do not store the actual data.

(2) Every leaf node carries a pointer to the next leaf, so the leaves form a linked list.

So the question is: why use a B/B+ tree structure to implement an index?

A: Structures such as red-black trees can also be used to implement indexes, but file systems and database systems generally use B/B+ trees. MySQL is a disk-based database, and its indexes are stored on disk as index files. Searching an index therefore incurs disk I/O (for why, see the supplementary section later in the article). Disk I/O is several orders of magnitude slower than memory access, so the index structure should be designed to minimize the number of disk I/O operations needed to find a key. The choice of B/B+ trees is tied to how disk storage works.

Locality principle and disk pre-reading

To improve efficiency, the number of disk I/O operations should be kept to a minimum. In practice, the disk does not read strictly on demand; it reads ahead every time. After reading the requested data, it goes on to read a certain amount of the data that follows into memory, in order. This is justified by the principle of locality in computer science:

When a piece of data is used, the data near it is usually used soon afterwards.

The data a program needs while running is usually clustered together.

(1) Sequential disk reads are very efficient (no seek time, only a little rotational delay), so for programs with good locality, read-ahead improves I/O efficiency. The read-ahead length is generally an integer multiple of the page size.

(2) In MySQL (which uses the InnoDB engine by default), records are managed in pages, and the page size defaults to 16 KB (this value can be changed). The default page size on Linux is 4 KB.
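The "integer multiple of the page size" rule in point (1) amounts to a little arithmetic, sketched below. The helper `round_up_to_pages` and the two constants are hypothetical names that simply encode the 16 KB and 4 KB figures quoted above; `mmap.PAGESIZE` from the Python standard library reports the page size of the machine running the script, which may differ.

```python
import mmap

INNODB_DEFAULT_PAGE = 16 * 1024   # 16 KB, the InnoDB default quoted in the text
LINUX_TYPICAL_PAGE = 4 * 1024     # 4 KB, the Linux default quoted in the text


def round_up_to_pages(nbytes: int, page_size: int) -> int:
    """Round a requested length up to the next whole number of pages."""
    pages = -(-nbytes // page_size)   # ceiling division with integers only
    return pages * page_size


# Asking for a single byte still costs one whole page.
print(round_up_to_pages(1, LINUX_TYPICAL_PAGE))
# One InnoDB page spans this many 4 KB operating-system pages.
print(INNODB_DEFAULT_PAGE // LINUX_TYPICAL_PAGE)
# Page size of the machine running this script (system dependent).
print(mmap.PAGESIZE)
```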

A B-tree implementation takes advantage of disk read-ahead with the following technique:

Each time a new node is created, a full page of space is requested for it. This guarantees that a node is physically stored within a single page, and because storage allocation is page-aligned, reading a node takes only one I/O.

Suppose the height of the B-tree is h. A single search in the B-tree then requires at most h-1 I/Os (the root node stays resident in memory), and the asymptotic complexity is O(h) = O(log_d N), where N is the number of keys and d is the out-degree of a node. In practice d is very large, usually more than 100, so h is very small (usually no more than 3; that is, the B+ tree of an index generally has no more than three levels, so searches are very efficient).
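The relation O(h) = O(log_d N) can be checked with a rough calculation. This is a sketch under simplifying assumptions: every node is assumed to be completely full with out-degree d, and the function names `btree_levels` and `max_disk_reads` are made up for illustration. Integer arithmetic is used to avoid floating-point rounding in the logarithm.

```python
def btree_levels(n_keys: int, out_degree: int) -> int:
    """Smallest h with out_degree**h >= n_keys, assuming completely full nodes."""
    levels, capacity = 1, out_degree
    while capacity < n_keys:
        capacity *= out_degree
        levels += 1
    return levels


def max_disk_reads(levels: int) -> int:
    """At most h-1 I/Os per search, because the root stays resident in memory."""
    return levels - 1


for n, d in [(1_000_000, 100), (1_000_000_000, 1000)]:
    h = btree_levels(n, d)
    print(f"N={n}, d={d}: h={h}, at most {max_disk_reads(h)} disk reads")
```

With d = 100, three levels are enough for one million keys; with d = 1000, three levels cover one billion.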

A red-black tree, by contrast, is much deeper for the same number of keys. Nodes that are logically close (parent and child) may be physically far apart, so the tree cannot benefit from locality. Although the asymptotic I/O complexity of a red-black tree is also O(h), its h is far larger, so it is clearly much less efficient than a B-tree.
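To put numbers on "much deeper", the sketch below compares the best-case depth of any binary tree with the depth of a B-tree over the same keys. It assumes full nodes, and the function names are hypothetical. A red-black tree is a binary tree, so it can never be shallower than the perfectly balanced case computed here, and it is often deeper.

```python
def binary_tree_min_levels(n_keys: int) -> int:
    """Levels in a perfectly balanced binary tree holding n_keys nodes.

    A tree with h levels holds at most 2**h - 1 nodes, so the minimum h is
    ceil(log2(n_keys + 1)), which equals n_keys.bit_length() for n_keys >= 1.
    """
    return n_keys.bit_length()


def btree_levels(n_keys: int, out_degree: int) -> int:
    """Smallest h with out_degree**h >= n_keys, assuming completely full nodes."""
    levels, capacity = 1, out_degree
    while capacity < n_keys:
        capacity *= out_degree
        levels += 1
    return levels


n = 1_000_000
print("binary tree levels (best case):", binary_tree_min_levels(n))
print("B-tree levels with d=100:", btree_levels(n, 100))
```

If each level visited costs one disk read because the nodes are scattered, that is roughly twenty reads against two or three for one million keys.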

Why do MySQL indexes use B+ trees instead of B-trees?

(1) A B+ tree is better suited to external storage (generally, disk storage). Because its internal (non-leaf) nodes do not store data, a single node can hold more keys and child pointers, so each node indexes a larger range more precisely. In other words, one disk I/O on a B+ tree brings in more index information than one disk I/O on a B-tree, so I/O is used more efficiently.

(2) MySQL is a relational database, and queries often access an index column by range. The pointers linking the leaf nodes of a B+ tree are in key order, which makes range access easy, so a B+ tree is very friendly to range queries on an index column. In a B-tree, by contrast, each node's keys and data are stored together and the leaves are not linked, so a range query cannot simply walk along the leaves; it has to traverse up and down the tree in order, which is far less efficient.
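Both points can be sketched in a few lines. Everything here is an assumption made for illustration: the byte sizes (8-byte key, 6-byte child pointer, 1 KB row), the class names `Leaf` and `Internal`, and the functions `entries_per_page`, `find_leaf` and `range_scan`. It is not InnoDB's on-disk format. The first part estimates how many entries fit in one 16 KB page with and without row data; the second part runs a range query by descending once and then following the leaf pointers.

```python
from bisect import bisect_right
from dataclasses import dataclass
from typing import List, Optional, Tuple, Union

PAGE_SIZE = 16 * 1024  # 16 KB page, as quoted in the text


def entries_per_page(key_bytes: int, ptr_bytes: int, row_bytes: int = 0) -> int:
    """How many (key, pointer[, row]) entries fit in one page (assumed sizes)."""
    return PAGE_SIZE // (key_bytes + ptr_bytes + row_bytes)


# Point (1): internal nodes without row data hold far more entries per page.
print("B+ internal node:", entries_per_page(8, 6))          # keys + pointers only
print("B-tree node:", entries_per_page(8, 6, 1024))         # rows stored inline


@dataclass
class Leaf:
    keys: List[int]
    rows: List[str]
    next: Optional["Leaf"] = None   # link to the next leaf in key order


@dataclass
class Internal:
    keys: List[int]                 # separator keys only, no row data
    children: List[Union["Internal", Leaf]]


def find_leaf(node: Union[Internal, Leaf], key: int) -> Leaf:
    """Descend from the root to the leaf that would contain key."""
    while isinstance(node, Internal):
        node = node.children[bisect_right(node.keys, key)]
    return node


def range_scan(root: Union[Internal, Leaf], lo: int, hi: int) -> List[Tuple[int, str]]:
    """Point (2): one descent, then follow leaf links until a key exceeds hi."""
    out: List[Tuple[int, str]] = []
    leaf: Optional[Leaf] = find_leaf(root, lo)
    while leaf is not None:
        for k, row in zip(leaf.keys, leaf.rows):
            if k > hi:
                return out
            if k >= lo:
                out.append((k, row))
        leaf = leaf.next
    return out


l3 = Leaf([20, 25], ["r20", "r25"])
l2 = Leaf([10, 15], ["r10", "r15"], next=l3)
l1 = Leaf([1, 5], ["r1", "r5"], next=l2)
tree = Internal(keys=[10, 20], children=[l1, l2, l3])

print(range_scan(tree, 5, 20))
```

Under these assumed sizes, a page holds 1170 index entries without row data but only 15 with 1 KB rows inline, which is why the B+ tree stays so shallow.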
