In addition to Weibo, there is also WeChat
Please pay attention
WeChat public account
Shulou
2025-02-27 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Development >
Share
Shulou(Shulou.com)06/01 Report--
This article will explain in detail about Mysql Innodb storage engine index and algorithm example analysis, Xiaobian think quite practical, so share to you as a reference, I hope you can read this article after harvest.
I. Overview
Too few indexes, query efficiency is low; too many indexes affect program performance, the use of indexes should fit the actual situation.
Innodb supports indexes including:
Full-text search using inverted indexing
Hash index, adaptive, can not be human intervention, according to the buffer pool aggregated index page creation, and will not hash the entire table index, so the index is very fast.
B+ tree index, the traditional index, is currently the most effective and commonly used index in relational databases.
The B+ tree does not locate the specific row record on the table, but returns the page where the row record is located; finally, it accurately locates in memory according to the slot information and the next record information in the row record header.
2. Data structure and algorithm 1. Binary search
Binary search can only be used to search for an ordered set of linear data, taking the median at a time, small forward, large backward. Time complexity: log N, as shown in the figure for the number 48 in the ordered array.
2. Binary search tree and balanced binary tree 1) Binary search tree
Binary search tree refers to, a binary tree, all meet: any node left child node is smaller than its own, any node right child node is larger than its own binary tree, that is, binary search tree.
A regular binary tree cannot guarantee an O(logN) access time, because in extreme cases it can even degenerate into a linked list.
When a binary tree is constructed from a set of ordered data, then a linked list is obtained, and the time complexity becomes O(N).
2) Balanced binary tree
Balanced binary tree is a binary search tree, but it has one more constraint: the tree height difference between two child nodes of any node cannot exceed 1. If this condition is violated during the construction of the binary tree, it can be solved by appropriate rotation.
Time complexity: O(logN)
Although it can guarantee O(logN) access time, it is not suitable for database indexing:
Binary tree height climbs very fast (1024 = 2 to the 10th power), and log(N) is also very considerable when the data volume is very large.
The worst part is that the leaf node of the binary tree can only store one data, and must perform multiple disk IO. In practice, however, frequent disk reads can be disastrous compared to CPU execution time. Therefore, binary trees are not suitable for database indexing.
For mechanical hard disks, the access time depends on the disk speed and head movement time, which are done by mechanical structure, compared with the electrical signal instructions executed in the cpu, the speed must be very different.
10 million data, if you use a balanced binary tree (the worst time bound is 1.44 * logN), even if you do not take the worst time bound, the final calculation by log(N) is about 24, then it means that you need to perform 24 disk IOs, which is obviously not possible.
[The tree height is rounded up to the logarithmic value, for example: log3 = 1.58, tree height is 2;]
B+ tree
Because of the limitations of balanced binary trees, B+ trees need to be introduced.
A B+ tree is a balanced lookup tree designed specifically for disks or other direct access auxiliary devices. In a B+ tree, all record nodes are key-sized, stored sequentially on leaf nodes at the same level, and linked by pointers to leaf nodes.
B+ tree complete definition
A B+ tree of order M needs to satisfy the following properties:
All of the following definitions refer to dividing two numbers, rounding up when they are not divisible, rather than discarding decimal places. (except for deductive inequalities in cases)
1) Data items must exist on leaf nodes
2) A non-leaf node stores M-1 keywords to indicate the search direction; keyword i represents the smallest keyword in the i +1th subtree of the non-leaf node; assuming a B+ tree of order 5, it has 5 - 1 = 4 keywords.
3) A B+ tree either has only one leaf node as root node (without any children); if it has children nodes, its number of nodes must belong to the set: {2~M};
4) Except for roots, the number of children of all non-leaf nodes must satisfy the set: { M/2, M } ;
5) All leaves are at the same depth, and the number of data items of leaf nodes must belong to the set: { L/2, L } ;
Selected cases of M and L
The following table is an example, simulation deduction B+ tree, primary key 50 bytes, calculate the space consumed by the uplink record itself, assuming that the total length of all fields does not exceed 500 bytes:
It is known that all row records consume some bytes of row information: variable length fields, row record headers, transaction IDs, rollback pointers, and so on.
create table context( id varchar(50) primary key, name varchar(50) not null, description varchar(360) );
A leaf node represents a data page, and the choice of M and L values is closely related to it, assuming that the data page size is: P/byte (in MySQL discussed in this article, for example, a data page size is 16K, that is, 16384 bytes).
On non-leaf nodes: the key of B+ tree is the primary key. In this example, it is assumed that the primary key is 50 bytes, and the key of M order B+ tree is M-1, occupying 50 *(M - 1) bytes of space;
Add in its branch pointers to M child nodes, assuming that each branch pointer occupies 4 bytes of storage; then all space consumed in a non-leaf node totals: 50 *(M - 1)+ 4 * M = 54M - 50 bytes.
When MySQL is used and the primary key is assumed to be 50 bytes, the inequality holds: 54M - 50
Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.
Views: 0
*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.
Continue with the installation of the previous hadoop.First, install zookooper1. Decompress zookoope
"Every 5-10 years, there's a rare product, a really special, very unusual product that's the most un
© 2024 shulou.com SLNews company. All rights reserved.