2025-04-02 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)05/31 Report--
What role do indexes play in MySQL? This article analyzes the question in detail, in the hope of helping readers who want to solve this problem find a simpler and more practical approach.
Common index types (implementation level)
Types of index (application level)
Clustered and non-clustered indexes
Covering index
The best index usage strategy
1. Common index types (implementation level)
Before looking at how MySQL implements indexes, consider the problem from first principles: if we were asked to design a database index ourselves, how would we do it?
First, think about what an index is meant to achieve. Its goal is to locate data quickly, so implementing an index is essentially choosing a search algorithm.
It differs from ordinary search, however, because the data has the following characteristics:
1. A large amount of data is stored.
2. The data changes constantly.
These two characteristics must therefore be taken into account when implementing the index: we need the data structure and algorithm best suited to searching under them.
Let's take a look at common search strategies, as shown in the following figure:
Because of the two characteristics mentioned above, we first rule out the static search algorithms.
For search trees, there are two options: binary trees and multiway trees.
Binary tree: with a large amount of data, a binary tree becomes very deep. The index turns into a towering tree, and each query causes many disk I/O operations.
Multiway tree: a multiway tree solves the depth problem. That leaves a choice between the B-tree and the B+ tree.
B-tree (figure taken from Wikipedia, zh.wikipedia.org/wiki/B%2B tree).
B+ tree (figure taken from Wikipedia, zh.wikipedia.org/wiki/B%2B tree).
From the figures above, we can see that the leaf nodes of the B+ tree store all the index values and are linked to one another as a linked list, so every value can be visited by walking the list from the leftmost leaf. The most common use of this is range lookup. A B-tree does not support range lookups in this way, or makes them particularly complex to implement, so MySQL chose the B+ tree.
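The linked leaf level described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the names Leaf, build_leaf_chain and range_scan are hypothetical, the upper levels of the tree are omitted, and the scan starts at the leftmost leaf as in the description, whereas a real B+ tree would first descend from the root to the leaf containing the lower bound.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Leaf:
    """One leaf node: a few sorted keys plus a link to the next leaf."""
    keys: List[int]
    next: Optional["Leaf"] = None


def build_leaf_chain(sorted_keys: List[int], leaf_size: int = 3) -> Optional[Leaf]:
    """Split sorted keys into leaves and link them left to right."""
    leaves = [Leaf(sorted_keys[i:i + leaf_size])
              for i in range(0, len(sorted_keys), leaf_size)]
    for left, right in zip(leaves, leaves[1:]):
        left.next = right
    return leaves[0] if leaves else None


def range_scan(first_leaf: Optional[Leaf], low: int, high: int) -> List[int]:
    """Collect every key in [low, high] by following the leaf links."""
    result = []
    leaf = first_leaf
    while leaf is not None:
        for key in leaf.keys:
            if key > high:      # keys are sorted, so nothing later can match
                return result
            if key >= low:
                result.append(key)
        leaf = leaf.next
    return result


head = build_leaf_chain([1, 3, 5, 7, 9, 11, 13])
print(range_scan(head, 4, 10))
```

Because the keys are ordered and the leaves are chained, the scan stops as soon as it passes the upper bound and never has to go back up the tree.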
1.1 B-Tree Index (B+ Tree)
To be clear: although MySQL officially calls this the B-Tree index, it uses the B+ tree data structure.
A B-Tree index speeds up data access because no full table scan is needed. Instead, the search proceeds layer by layer from the root of the index tree; the root and the other non-leaf nodes store index values together with pointers to nodes in the next layer.
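How many layers such a search passes through depends on how many children each node has. The following rough calculation compares a binary tree with a multiway tree; the row count and the fanout of 1000 are assumed figures chosen for illustration, not values taken from MySQL, and min_tree_height is a hypothetical helper.

```python
def min_tree_height(num_keys: int, fanout: int) -> int:
    """Smallest number of levels needed so that fanout**levels >= num_keys."""
    height = 1
    capacity = fanout
    while capacity < num_keys:
        capacity *= fanout
        height += 1
    return height


rows = 10_000_000                   # assumed table size
print(min_tree_height(rows, 2))     # binary tree
print(min_tree_height(rows, 1000))  # multiway tree with an assumed fanout of 1000
```

Under these assumptions the binary tree needs 24 levels while the multiway tree needs 3, which is why a wide tree means far fewer disk reads per lookup.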
Let's take a look at how the data of a single-column index is organized.
CREATE TABLE User (`name` varchar(50) NOT NULL, `uid` int(4) NOT NULL, `gender` int(2) NOT NULL, KEY (`uid`));
The User table above has an index on the uid column. How does the storage engine manage the index when uid (960102) is inserted into the table? Look at the index tree below:
1. All index values are stored in the leaf nodes; the values in non-leaf nodes are used only to locate the leaf node containing the target value more quickly.
2. The values in the leaf nodes are ordered.
3. The leaf nodes are linked together as a linked list.
Next, let's look at how the data of a multi-column (composite) index is organized.
CREATE TABLE User (`name` varchar(50) NOT NULL, `uid` int(4) NOT NULL, `gender` int(2) NOT NULL, KEY (`uid`, `name`));
The composite index key (uid, name) is created on the User table; its index tree is shown in the following figure.
Its characteristics are the same as those of a single-column index, except that entries with the same value in the first field are sorted by the second index field.
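The ordering rule for a composite index can be illustrated with Python tuples, which compare element by element in the same way. The sample rows and the helper name composite_index_order are made up for illustration.

```python
from typing import List, Tuple


def composite_index_order(rows: List[Tuple[int, str]]) -> List[Tuple[int, str]]:
    """Order (uid, name) entries as a key (uid, name) index would:
    by uid first, then by name among entries with the same uid."""
    return sorted(rows)


sample_rows = [(2, "Bob"), (1, "Carol"), (2, "Alice"), (1, "Adam")]
for entry in composite_index_order(sample_rows):
    print(entry)
```

Entries with uid 1 come before entries with uid 2, and within each uid the names are in alphabetical order.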
How is data found quickly through a B-Tree index?
For a B-Tree index in the InnoDB storage engine, row data is found through the index as follows:
If a clustered index (the primary key) is used, the leaf node contains the row data, which can be returned directly.
If a non-clustered index (an ordinary index) is used, the leaf node stores the primary key; the clustered index is then searched by that primary key, and finally the data is returned.
For a B-Tree index in the MyISAM storage engine, row data is found through the index as follows:
Besides the index value, a leaf node of a MyISAM index tree stores neither the primary key nor the row data, only a pointer to the row data; the row is then read from the table data file by following that pointer.
1.2 Hash index (hash table)
A hash index is based on a hash table and takes effect only when all indexed columns are matched exactly.
That is, given a hash index key (col1, col2), every lookup must supply both col1 and col2, because the hash value is computed by a hash function over all the index columns.
In the figure below, there is a hash index key (name).
How is the hash index used to find the row quickly when we execute mysql> SELECT * FROM User WHERE name = 'Zhang San'; ?
The first step is to compute the hash value, for example hash (Zhang San) = 1287.
The second step is to locate the row number; for example, the row number corresponding to key = 1287 is 3.
The third step is to fetch that row and check that the value of its name column really is Zhang San.
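The three steps above can be sketched as a small Python class. This is an illustrative model, not MySQL's implementation: HashIndex and the sample users are hypothetical, Python's built-in hash() stands in for the engine's hash function, and the hash values it produces will not be 1287.

```python
from collections import defaultdict
from typing import Any, Dict, List


class HashIndex:
    """Toy hash index over one column of an in-memory table."""

    def __init__(self, rows: List[Dict[str, Any]], column: str) -> None:
        self.rows = rows
        self.column = column
        # hash value -> row numbers whose column value has that hash
        self.buckets: Dict[int, List[int]] = defaultdict(list)
        for row_number, row in enumerate(rows):
            self.buckets[hash(row[column])].append(row_number)

    def lookup(self, value: Any) -> List[Dict[str, Any]]:
        matches = []
        # Step 1: compute the hash. Step 2: find the candidate row numbers.
        for row_number in self.buckets.get(hash(value), []):
            row = self.rows[row_number]
            # Step 3: compare the real column value, since different
            # values can share the same hash.
            if row[self.column] == value:
                matches.append(row)
        return matches


users = [
    {"name": "Li Si", "uid": 1, "gender": 0},
    {"name": "Zhang San", "uid": 2, "gender": 1},
    {"name": "Wang Wu", "uid": 3, "gender": 1},
]
name_index = HashIndex(users, "name")
print(name_index.lookup("Zhang San"))
```

The final comparison in step 3 is what keeps a hash collision from returning the wrong row. The same structure also shows why a hash index cannot help with range conditions: hash values carry no ordering.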
2. Common types of indexes (application level)
Primary key index
CREATE TABLE User (`name` varchar(50) NOT NULL, `uid` int(4) NOT NULL, `gender` int(2) NOT NULL, PRIMARY KEY (`uid`));
The primary key index is unique, and the table's ID column is usually chosen as the primary key. A table can have only one primary key index, which is what distinguishes it from a unique index.
Unique index
CREATE TABLE User (`name` varchar(50) NOT NULL, `uid` int(4) NOT NULL, `gender` int(2) NOT NULL, UNIQUE KEY (`name`));
Unique indexes are mainly used to enforce uniqueness constraints required by the business. Unlike the primary key index, a table can have multiple unique indexes.
Single-column index
CREATE TABLE User (`name` varchar(50) NOT NULL, `uid` int(4) NOT NULL, `gender` int(2) NOT NULL, KEY (`name`));
An index on a single field.
Composite (joint) index
CREATE TABLE User (`name` varchar(50) NOT NULL, `uid` int(4) NOT NULL, `gender` int(2) NOT NULL, KEY (`name`, `uid`));
Two or more fields are combined to form one index. When using it, take care to satisfy the leftmost-prefix matching principle!
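Why the leftmost field matters can be seen by modelling the index key (name, uid) as a sorted list of tuples. This is a simplified sketch: the sample entries and the function names lookup_by_name and lookup_by_uid are hypothetical, and a sorted Python list stands in for the index tree.

```python
import bisect
from typing import List, Tuple

# Entries of a key (name, uid) index, kept sorted by name and then uid.
name_uid_index: List[Tuple[str, int]] = sorted(
    [("carol", 2), ("bob", 7), ("alice", 3), ("bob", 1)]
)


def lookup_by_name(index: List[Tuple[str, int]], name: str) -> List[Tuple[str, int]]:
    """The leftmost column is given: binary-search to the first match,
    then read consecutive entries until the name changes."""
    start = bisect.bisect_left(index, (name,))
    matches = []
    for entry in index[start:]:
        if entry[0] != name:
            break
        matches.append(entry)
    return matches


def lookup_by_uid(index: List[Tuple[str, int]], uid: int) -> List[Tuple[str, int]]:
    """The leftmost column is missing: matching uids are scattered through
    the sorted order, so every entry has to be examined."""
    return [entry for entry in index if entry[1] == uid]


print(lookup_by_name(name_uid_index, "bob"))
print(lookup_by_uid(name_uid_index, 2))
```

A condition on name alone can jump straight to the right place in the sorted order, while a condition on uid alone cannot, which is the essence of the leftmost-prefix rule.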
Other index types are less commonly used and are not covered here.
3. Clustered and non-clustered indexes
What is a clustered index?
A clustered index stores the index and the row data together. In other words, a leaf node of the B+ tree holds not only the index value but also the data of the corresponding row. The figures later in this section illustrate this.
A clustered index is not a separate type of index, but a way of organizing data storage!
CREATE TABLE test (col1 int NOT NULL, col2 int NOT NULL, PRIMARY KEY (col1), KEY (col2));
As shown above, the table test has two indexes: the primary key col1 and the ordinary index col2. So what do these two indexes have to do with clustered and non-clustered indexes?
One clustered index and one non-clustered (secondary) index are created, that is, two index trees are built: the primary key produces the clustered index tree, and col2 produces a non-clustered index tree.
InnoDB implements the clustered index through the primary key. If there is no primary key, it selects a unique non-null index; if there is no such index either, it implicitly generates a primary key.
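The selection rule just described can be written down as a small decision function. This is only a restatement of the rule in Python; choose_clustered_key and the placeholder string for the implicitly generated key are hypothetical names, not InnoDB identifiers.

```python
from typing import List, Optional

HIDDEN_ROW_ID = "<implicit row id>"  # placeholder for the key InnoDB generates itself


def choose_clustered_key(primary_key: Optional[str],
                         unique_not_null_indexes: List[str]) -> str:
    """Pick the clustered key following the order described above."""
    if primary_key is not None:
        return primary_key                   # 1. an explicit primary key wins
    if unique_not_null_indexes:
        return unique_not_null_indexes[0]    # 2. otherwise the first unique non-null index
    return HIDDEN_ROW_ID                     # 3. otherwise an implicitly generated key


print(choose_clustered_key("col1", ["col2"]))
print(choose_clustered_key(None, ["col2"]))
print(choose_clustered_key(None, []))
```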
Let's look at how clustered and non-clustered indexes lay out data in the index tree. The figures are from "High Performance MySQL".
The following figure shows how the clustered index data is organized: the clustered index tree with col1 as the primary key.
The index column is the primary key col1.
You can see that, besides the values of the index column col1 (3, 99, 4700), the leaf nodes also store the values of the other columns, such as col2 (93, 8, 13), and of any further columns. In other words, each leaf entry of the clustered index tree stores the full row corresponding to an index value.
The following figure shows the data organization of a non-clustered index (secondary index).
The index column is col2.
Unlike the clustered index, the leaf nodes of a non-clustered index hold only the primary key value in addition to the index value, whereas the clustered index stores the whole row.
Suppose there is an SQL statement SELECT * FROM test WHERE col2 = 93
This statement goes through two index tree searches:
1. The first step finds the leaf node containing col2 = 93 in the non-clustered index tree and obtains the row's primary key, 3.
2. The second step uses the primary key 3 to locate the leaf node containing primary key = 3 in the clustered index and returns the full row data.
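The two searches can be mimicked with two Python dictionaries standing in for the two index trees. This is a sketch under stated assumptions: dictionaries replace the B+ trees, the names clustered_index, secondary_index and select_by_col2 are hypothetical, and apart from the pair col1 = 3, col2 = 93 used in the statement above, the sample rows are illustrative.

```python
from typing import Dict, List

# Clustered index: primary key (col1) -> the whole row.
clustered_index: Dict[int, Dict[str, int]] = {
    3: {"col1": 3, "col2": 93},
    99: {"col1": 99, "col2": 8},
    4700: {"col1": 4700, "col2": 13},
}

# Secondary index on col2: index value -> primary keys of the matching rows.
secondary_index: Dict[int, List[int]] = {93: [3], 8: [99], 13: [4700]}


def select_by_col2(value: int) -> List[Dict[str, int]]:
    """Model of: SELECT * FROM test WHERE col2 = value."""
    # Search 1: the secondary index yields primary keys only.
    primary_keys = secondary_index.get(value, [])
    # Search 2: each primary key is looked up in the clustered index,
    # which holds the full row.
    return [clustered_index[pk] for pk in primary_keys]


print(select_by_col2(93))
```

The secondary index maps to a list of primary keys because an ordinary KEY, unlike a unique one, may contain the same value in several rows.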
All of the above applies to the InnoDB storage engine. MyISAM does not support clustered indexes, because its data file and index file are stored independently of each other. A leaf node of a MyISAM index tree does not store the primary key value; it stores an address or pointer to the corresponding row, which is then read from the table data file, as shown in the following figure.
Conclusion:
Clustered index:
Usually implemented by the primary key or a non-null unique index; the leaf nodes store entire rows of data.
Non-clustered index:
Also known as a secondary index; this is the ordinary index we commonly create. The leaf nodes store index values and primary key values, and the row is then fetched from the clustered index by primary key.
4. Covering index
A covering index is an index that contains all the fields a query needs.
CREATE TABLE User (`name` varchar(50) NOT NULL, `uid` int(4) NOT NULL, `gender` int(2) NOT NULL, KEY (`name`, `uid`));
If the table User has three fields, User (name, uid, gender), and a composite index key (name, uid), then
a covering index is used when executing an SQL query such as the one below.
SELECT name, uid FROM User WHERE name IN ('axiomagery') AND uid >= 98 AND uid
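Whether an index covers a query comes down to a subset check: every column the query reads or filters on must be present in the index. The helper below is a hypothetical illustration of that check, using the index key (name, uid) from the text; it ignores the fact that InnoDB secondary indexes also carry the primary key columns.

```python
from typing import Iterable


def is_covering(index_columns: Iterable[str],
                selected_columns: Iterable[str],
                filter_columns: Iterable[str]) -> bool:
    """True if every column the query touches is stored in the index."""
    needed = set(selected_columns) | set(filter_columns)
    return needed <= set(index_columns)


name_uid_key = ["name", "uid"]
# Query that selects name, uid and filters on name, uid
print(is_covering(name_uid_key, ["name", "uid"], ["name", "uid"]))
# A query that also selects gender must go back to the row data
print(is_covering(name_uid_key, ["name", "uid", "gender"], ["name"]))
```

In MySQL itself, a query answered from a covering index is typically reported with "Using index" in the Extra column of EXPLAIN.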