How to Use MySQL Indexes Efficiently

2025-04-21 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)05/31 Report--

This article explains how to use MySQL indexes efficiently. It should serve as a useful reference for interested readers. Let's take a look.

Preface

Most people have used MySQL, and most of them have used indexes, but do you know how to create the right index? When the amount of data is small, an inappropriate index has little impact on performance, but as the data grows, performance declines sharply.

Index basis

Looking up a word in a dictionary takes two steps: find the word's page number in the index pages, then read the word's entry on that page. MySQL's indexes work in a similar way: MySQL first finds the matching value in the index, then finds the corresponding data row from the matching index record. Suppose you have the following SQL statement:

Select * from student where code='2333'

If there is an index on the code column, MySQL uses it to find the data rows whose code is '2333', and then reads those rows and returns all of their data.
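As a minimal sketch of this lookup, assuming the student table has a code column and using a hypothetical index name idx_code, the index could be created and checked with EXPLAIN roughly like this:

```sql
-- Assumption: the student table exists and has a code column.
-- idx_code is an illustrative index name, not one from the article.
ALTER TABLE student ADD KEY idx_code (code);

-- EXPLAIN reports which index, if any, the optimizer chooses for the query.
EXPLAIN SELECT * FROM student WHERE code = '2333';
```

If the index is chosen, the key column of the EXPLAIN output should name idx_code.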

Index type

B-Tree index

(Sometimes read as "B-minus tree", but properly "B-tree".) The vast majority of indexes are B-Tree indexes (or variants of B-Tree), and this is the type we usually use. MySQL's storage engines implement them as B-Tree variants (InnoDB, for example, uses a B+Tree); if you are not familiar with the difference between a B-tree and a B+ tree, it is worth looking up.

A tree-structured index speeds up data access: the storage engine no longer needs a full table scan to find the required data. Instead, it searches down from the root node of the tree, much like a binary search, which as everyone knows is quite fast. We can therefore use indexes to greatly improve query speed. B-Tree indexes support the following types of queries:

Assume the student table has a single multi-column index on (name, age, weight). The following kinds of query can take advantage of this index:

Full value matching

Matches all columns in the index. For example, the query name='abc' and age=12 and weight=50 uses all three columns of the index.

Match the leftmost prefix

Uses only the leading columns of the index. For example, the query name='ggg' uses only the first column of the index, while the query name='ggg' and age=12 uses the first and second columns.

Match column prefix

You can also match only the beginning of a column's value. For example, the query name like 'g%' finds the records whose name starts with g. Only the first column of the index is used here.

Match range values

Can be used to match range values, such as the query name > 'abc' and name < 'bcd'.

Match one column exactly and another column by range

Can be used to match multiple columns, such as the query name='abc' and age > 12.

Overall, the B-Tree index is suited to lookups based on the leftmost prefix: the queried fields must cover the index fields in order, starting with the first one. For example, the index can be used for queries on name; on name and age; or on name, age, and weight, but not for queries that omit name, such as a query on age alone. The order in which the conditions are written in the WHERE clause does not matter; what matters is which index columns they cover.
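The query types above can be sketched as follows. This is only an illustration: it assumes the student table has name, age, and weight columns, the index name idx_name_age_weight is hypothetical, and the optimizer may still choose a different plan depending on the data.

```sql
-- Assumption: student(name, age, weight, ...) exists; the index name is illustrative.
ALTER TABLE student ADD KEY idx_name_age_weight (name, age, weight);

-- Queries that can use the index (each starts at the leftmost column):
EXPLAIN SELECT * FROM student WHERE name = 'abc' AND age = 12 AND weight = 50;  -- full value match
EXPLAIN SELECT * FROM student WHERE name = 'ggg';                               -- leftmost prefix
EXPLAIN SELECT * FROM student WHERE name LIKE 'g%';                             -- column prefix
EXPLAIN SELECT * FROM student WHERE name > 'abc' AND name < 'bcd';              -- range
EXPLAIN SELECT * FROM student WHERE name = 'abc' AND age > 12;                  -- exact match plus range

-- A query that cannot use the index for a lookup (the leftmost column is missing):
EXPLAIN SELECT * FROM student WHERE age = 12;
```

Comparing the key and rows columns of the EXPLAIN output across these statements is a quick way to see which ones actually benefit from the index.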

Hash indexing

A hash index is implemented on top of a hash table, and it takes effect only when all columns of the index are matched exactly. In MySQL, only the Memory engine explicitly supports hash indexes, and hash is also its default index type.
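A minimal sketch of a hash index on a Memory table follows. The table and index names are hypothetical and chosen only for illustration.

```sql
-- Illustrative table; student_cache and idx_code are hypothetical names.
CREATE TABLE student_cache (
  code VARCHAR(16) NOT NULL,
  name VARCHAR(64) NOT NULL,
  KEY idx_code (code) USING HASH   -- HASH is already the default for MEMORY tables
) ENGINE = MEMORY;

-- Equality lookup on the whole indexed column: the hash index can be used.
SELECT * FROM student_cache WHERE code = '2333';

-- Range lookup: a hash index cannot help, because hash values carry no ordering.
SELECT * FROM student_cache WHERE code > '2333';
```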

InnoDB cannot create a hash index explicitly, but it has a feature called the adaptive hash index. When some index values are used very frequently, the engine builds a hash index in memory on top of the B-Tree index, which gives the B-Tree index some of the advantages of a hash index. This behavior is fully automatic and internal: you cannot control which hash indexes are built, although the feature as a whole can be switched off with the innodb_adaptive_hash_index system variable.

High performance indexing strategy

Here are some common indexing strategies.

Independent column

This one is simple. If an indexed column is not isolated in the query, that is, if it is part of an expression or a function argument, the index cannot be used. For example:

Select * from student where age+1=12

Even if the age column has an index, the above query statement cannot take advantage of the index.
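One way to let the index be used is to move the arithmetic to the constant side so that the column stands alone. A minimal sketch, assuming the same student table:

```sql
-- Same condition as age + 1 = 12, but with the column isolated on one side.
SELECT * FROM student WHERE age = 11;

-- Equivalent form that keeps the arithmetic visible on the constant side.
SELECT * FROM student WHERE age = 12 - 1;
```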

Prefix index and index selectivity

If you need to index long string columns, indexing them directly makes the index larger and slower. One optimization strategy is to simulate a hash index: compute a hash value for the column, store it in a separate column, and index that hash column.
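The simulated hash index could look roughly like the sketch below. It assumes the person table has a long string column called name; the name_crc column and idx_name_crc index are hypothetical names, and CRC32 is just one possible hash function.

```sql
-- Assumption: person(name ...) exists; name_crc and idx_name_crc are illustrative.
ALTER TABLE person ADD COLUMN name_crc INT UNSIGNED NOT NULL DEFAULT 0;

-- Fill the hash column; it must also be kept in sync on later inserts and updates.
UPDATE person SET name_crc = CRC32(name);

ALTER TABLE person ADD KEY idx_name_crc (name_crc);

-- Look up by hash first, then compare the full string to rule out hash collisions.
SELECT * FROM person
WHERE name_crc = CRC32('a very long string value')
  AND name = 'a very long string value';
```

The extra comparison on name is needed because different strings can produce the same CRC32 value.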

Another way is to build a prefix index: index only the first few characters of the field. This greatly reduces the space used and makes the index much faster to build. But it also has the following disadvantages:

It reduces index selectivity. Strings that share the same prefix cannot be distinguished by the index alone, so an additional string comparison is needed.

ORDER BY and GROUP BY cannot use a prefix index, for the obvious reason that only part of each string is indexed, so the index cannot fully order or distinguish the values.

The key here is to decide how many characters to index. The prefix must not be too long, yet it must still be selective enough. There are two ways to help determine the number of characters:

Check that the prefixes are evenly distributed, that is, that the number of strings starting with each indexed prefix is roughly even. For example, take the first three characters of the name field. The following result would be reasonable (only the top 8 prefixes are shown):

Count    First three characters
500      abc
465      asd
455      acd
431      zaf
430      aaa
420      vvv
411      asv
512      pdf

If the count for each prefix is large, the prefix does not discriminate well, and the number of indexed characters needs to be increased until the selectivity of the prefix is close to that of the complete column, that is, until the counts above are as small as possible.

Calculate the selectivity of the complete column and make the selectivity of the prefix close to it. The following statement calculates the full-column selectivity:

-- number of distinct strings / total number of rows = full-column selectivity
select count(distinct name) / count(*) from person

The following statement calculates the selectivity of the first three characters of the column:

-- number of distinct three-character prefixes / total number of rows
select count(distinct left(name,3)) / count(*) from person

Keep increasing the number of indexed characters until the prefix selectivity is close to the full-column selectivity and adding more characters improves it only slightly.

Creation method

-- assume the best length is 4
alter table person add key (name(4))
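The two ways of choosing a prefix length described in this section can be sketched as follows, assuming the person table has a name column. The aliases (cnt, pref, sel_full, sel_3, and so on) are illustrative.

```sql
-- Way 1: the most common 3-character prefixes and how many rows share each one.
SELECT COUNT(*) AS cnt, LEFT(name, 3) AS pref
FROM person
GROUP BY pref
ORDER BY cnt DESC
LIMIT 8;

-- Way 2: full-column selectivity side by side with several prefix lengths.
SELECT COUNT(DISTINCT name) / COUNT(*)          AS sel_full,
       COUNT(DISTINCT LEFT(name, 3)) / COUNT(*) AS sel_3,
       COUNT(DISTINCT LEFT(name, 4)) / COUNT(*) AS sel_4,
       COUNT(DISTINCT LEFT(name, 5)) / COUNT(*) AS sel_5
FROM person;
```

Computing several lengths in one statement makes it easy to see the point at which a longer prefix stops adding much selectivity.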

Multi-column index

Many people have the misconception that if a query combines conditions on several fields with and, indexing each field separately gives the best efficiency. This is not the case: in general MySQL picks only one of those indexes for the query. In this situation you should create a multi-column index (also known as a composite index) so that the query can use several indexed columns at once, and make sure the query conditions cover the index columns starting from the leftmost one.

MySQL 5.0 and later versions introduced an "index merge" strategy, which to some extent lets a single query use multiple single-column indexes, as in the following query:

-- MySQL uses the name index and the age index separately to find rows, and then merges the results
-- with and, it finds the rows from each index and then takes their intersection
select * from person where name = "bob" or age=12

However, this is not recommended, because with many and or or conditions the caching, sorting, and merging operations of the algorithm can consume a lot of CPU and memory.
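As a rough sketch of the alternative, assuming the person table has name and age columns and using a hypothetical index name idx_name_age, a single multi-column index can serve the and query directly:

```sql
-- Assumption: person(name, age, ...) exists; idx_name_age is an illustrative name.
ALTER TABLE person ADD KEY idx_name_age (name, age);

-- Both conditions can be resolved within the one multi-column index.
EXPLAIN SELECT * FROM person WHERE name = 'bob' AND age = 12;

-- For comparison: when only separate single-column indexes on name and age exist,
-- EXPLAIN may report type = index_merge for a query like this one.
EXPLAIN SELECT * FROM person WHERE name = 'bob' OR age = 12;
```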

Select the appropriate index column order

In a multi-column B-Tree index, the order of the index columns means that the index is sorted first by the leftmost column, then by the second column, and so on. A good multi-column index generally puts the most selective column first and the remaining columns in decreasing order of selectivity, so that the index filters as effectively as possible. For how to calculate selectivity, see the prefix index section above.
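A minimal sketch of comparing column selectivity before choosing the column order, assuming the person table has name and age columns (the aliases are illustrative):

```sql
-- Selectivity of each candidate column: distinct values / total rows.
SELECT COUNT(DISTINCT name) / COUNT(*) AS name_selectivity,
       COUNT(DISTINCT age)  / COUNT(*) AS age_selectivity,
       COUNT(*)                        AS total_rows
FROM person;

-- If name turns out to be the more selective column, it would go first, for example:
-- ALTER TABLE person ADD KEY idx_name_age (name, age);
```

Selectivity is only a rule of thumb here; the column order also has to match the leftmost-prefix pattern of the queries that will use the index.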

Clustered index

A clustered index is not a separate type of index but a way of storing data, and the details depend on the implementation.

InnoDB's clustered index actually stores the index values and the data rows in the same structure. Because a data row cannot be stored in two different places at the same time, a table can have only one clustered index. InnoDB clusters the data by the primary key column.

If no primary key is defined, InnoDB chooses a unique non-null index instead. If there is no such index either, InnoDB implicitly defines a primary key to use as the clustered index.

The main advantage of a clustered index is that it stores related data together, which reduces disk I/O and improves query efficiency. But there are also drawbacks:

Insertion speed depends heavily on insertion order. Inserting rows in primary key order is fastest; otherwise page splits may occur, which take up more disk space and slow down scans. The table can be reorganized with OPTIMIZE TABLE.

Updating the clustered index column is expensive because the index value changes and the row data moves to a new location with the index.

The secondary index (non-clustered index) needs two index lookups to access the row data, because the secondary index leaf node stores not the physical location of the row data, but the primary key value of the row, and then fetches the row data from the clustered index through the primary key value.
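The points above can be illustrated with a small sketch. The table person_demo and the index idx_name are hypothetical; an AUTO_INCREMENT primary key is one common way to get inserts in primary key order.

```sql
-- Illustrative InnoDB table; all names are hypothetical.
CREATE TABLE person_demo (
  id   INT UNSIGNED     NOT NULL AUTO_INCREMENT,  -- increasing key, so new rows are appended in key order
  name VARCHAR(64)      NOT NULL,
  age  TINYINT UNSIGNED NOT NULL,
  PRIMARY KEY (id),      -- clustered index: leaf nodes hold the full rows
  KEY idx_name (name)    -- secondary index: leaf nodes hold name plus the id value
) ENGINE = InnoDB;

-- Finds the matching id values in idx_name first,
-- then fetches each row from the clustered index by id.
SELECT * FROM person_demo WHERE name = 'abc';
```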

Covering index

Put simply, a covering index is an index that contains all the columns a query needs, so there is no need for a second lookup in the clustered index by primary key; the required data can be read from the secondary index alone.

InnoDB indexes store the indexed values in their leaf nodes, so if all the fields to be queried are contained in one index and that index is used, query speed can improve greatly. For example, consider the following queries:

-- if name has an index, the name value is taken directly from the index leaf node and returned; no second lookup is needed
select name from person where name='abc'
-- if there is a composite index on (name, age), the data is also returned directly, without a second lookup
select name,age from person where name='abc' and age=12
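Whether a query is served from the index alone can be checked with EXPLAIN. The sketch below assumes the person table has name and age columns plus at least one other column, and the index name idx_cover_name_age is hypothetical.

```sql
-- Assumption: person(name, age, ...) exists; the index name is illustrative.
ALTER TABLE person ADD KEY idx_cover_name_age (name, age);

-- All selected columns are in the index, so the Extra column of EXPLAIN
-- is expected to show "Using index".
EXPLAIN SELECT name, age FROM person WHERE name = 'abc' AND age = 12;

-- SELECT * needs columns outside the index, so the rows must still be
-- fetched from the clustered index.
EXPLAIN SELECT * FROM person WHERE name = 'abc' AND age = 12;
```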

Sort using an index

MySQL can also use indexes for sorting. An index can be used for sorting only when the column order of the index is exactly the same as that of the ORDER BY clause and all columns are sorted in the same direction (ascending or descending). Note: the ORDER BY clause may use fewer columns than the index has, but they must form a leftmost prefix of the index, in the same order. For example:

-- suppose there is a composite index on (name, age, sex)
-- the index can be used for sorting
select ... order by name desc, age desc
select ... order by name desc, age desc, sex desc
-- the index cannot be used for sorting
select ... order by name desc, sex desc
select ... order by name desc, age asc

Thank you for reading. I hope this article on how to use MySQL indexes efficiently is helpful.
