How to create an efficient and appropriate index in MySQL

2025-07-12 Update. From: SLTechnology News&Howtos

Shulou(Shulou.com)05/31 Report--

This article introduces how to create efficient and appropriate indexes in MySQL. It is intended as a practical reference for readers who design tables and write queries.

1. When querying on indexed columns, avoid wrapping the column in an expression; do the calculation in the business tier rather than the database tier.

In the following figure, the two SQL statements return the same result, but their execution plans differ. An access type of index is much less efficient than an access type of const. It is the actor_id+4 expression in the where condition that changes the execution plan. For the meaning of the type column, refer to a detailed explanation of explain.
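The effect can be reproduced in a small, runnable sketch. This is an illustration under stated assumptions, not the article's test: it uses Python's built-in sqlite3 module as a stand-in for MySQL, and the actor table contents are made up. SQLite reports SEARCH (a direct seek) versus SCAN (reading every row), where MySQL's explain would report const versus index in the type column.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL so the sketch runs anywhere.
# The actor table and its 200 rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE actor (actor_id INTEGER PRIMARY KEY, first_name TEXT)")
conn.executemany(
    "INSERT INTO actor VALUES (?, ?)",
    [(i, f"name{i}") for i in range(1, 201)],
)

def plan(sql):
    """Return the planner's description of how the query will read the table."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)

# Bare column: the planner can seek directly on the primary key.
plain = plan("SELECT * FROM actor WHERE actor_id = 1")

# Column wrapped in an expression: the planner has to examine every row.
wrapped = plan("SELECT * FROM actor WHERE actor_id + 4 = 5")

# Same arithmetic done by the caller (the business tier) before the query.
target = 5 - 4
moved = plan(f"SELECT * FROM actor WHERE actor_id = {target}")

print(plain)
print(wrapped)
print(moved)
```

Doing the subtraction in application code keeps the indexed column bare in the where condition, so the direct lookup stays available.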

2. Prefer primary key queries over other indexes where possible; a primary key query does not need a back-to-table lookup.

Almost all of our tables have a primary key, so in development use an index whenever one can be used, and use the primary key index whenever the primary key can be used.

3. Use prefix indexes

In many cases the indexed column is a string, and some strings are inevitably long, which makes the index large and less efficient. This is especially true for long column types such as blob, text, and varchar. Instead of indexing the full value, you can index only a leading prefix of the column, chosen so that the selectivity of the prefix is close to that of the entire column. This can greatly reduce index size and improve efficiency; the drawback is that the selectivity of the index may drop.

Index selectivity: the ratio of distinct index values to the total number of records in the table (#T), ranging from 1/#T to 1. The higher the selectivity of an index, the more efficient the query, because highly distinct data lets MySQL filter out more rows during the lookup. A unique index has a selectivity of 1, which is the best possible.
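As a worked example of the definition, the sketch below computes selectivity for three made-up columns of eight rows each. The column names and values are assumptions chosen only to show the two ends of the range and a value in between.

```python
def selectivity(values):
    """Distinct values divided by the total number of rows (#T)."""
    values = list(values)
    return len(set(values)) / len(values)

# Hypothetical columns, eight rows each.
status = ["active"] * 8               # every row holds the same value
gender = ["m", "f"] * 4               # two distinct values
employee_no = list(range(1001, 1009)) # every row is different

print(selectivity(status))       # 1 distinct value over 8 rows: the lower bound 1/#T
print(selectivity(gender))       # 2 distinct values over 8 rows
print(selectivity(employee_no))  # 8 distinct values over 8 rows: a unique column
```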

For example, take the mailbox field of a company's staff table. Every mailbox has the same suffix, as in xxxx@qq.com, so only the xxxx part does any useful work in an index; the @qq.com part is identical in every row and adds nothing. Indexing only xxxx gives the same selectivity as indexing the whole value, while clearly taking less index space.

Below we use the employee table as an example (see the table structure and data at the end of the article).

Let's take the email field as an example:

Each mailbox in this data is a mobile phone number followed by @qq.com, so rows differ only in their first 11 characters and everything after that is identical. The following SQL calculates the selectivity of the first 10, 11, and 12 characters respectively.

With a prefix length of 11 the selectivity reaches 1, and it does not change as the prefix grows longer:

select count(distinct left(email,10))/count(*) as e10, count(distinct left(email,11))/count(*) as e11, count(distinct left(email,12))/count(*) as e12 from employee;

From the figure above, the selectivity of the first 10, 11, and 12 characters is 0.14, 1.0, and 1.0 respectively. Selectivity already reaches its maximum at a prefix length of 11, so there is no need to index the entire column, which would only enlarge the index.

-- create prefix index
alter table employee add key (email(11));
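The procedure for picking a prefix length can be sketched as follows. This is a minimal illustration, assuming made-up data (100 invented 11-digit numbers plus a shared suffix) rather than the article's employee table, and using Python's built-in sqlite3 module instead of MySQL. SQLite has no left() function, so substr(email, 1, n) is used, and the count is multiplied by 1.0 because SQLite would otherwise do integer division. The helper names are hypothetical.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; table contents are invented.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (id INTEGER PRIMARY KEY, email TEXT)")
conn.executemany(
    "INSERT INTO employee (email) VALUES (?)",
    [(f"138000000{i:02d}@qq.com",) for i in range(100)],
)

def prefix_selectivity(length):
    """Selectivity of the first `length` characters of email."""
    sql = "SELECT COUNT(DISTINCT substr(email, 1, ?)) * 1.0 / COUNT(*) FROM employee"
    return conn.execute(sql, (length,)).fetchone()[0]

def shortest_full_prefix(max_length):
    """Smallest prefix length whose selectivity matches the whole column."""
    full = conn.execute(
        "SELECT COUNT(DISTINCT email) * 1.0 / COUNT(*) FROM employee"
    ).fetchone()[0]
    for length in range(1, max_length + 1):
        if prefix_selectivity(length) >= full:
            return length
    return max_length

e10, e11, e12 = (prefix_selectivity(n) for n in (10, 11, 12))
best = shortest_full_prefix(18)  # 18 = length of the longest invented email

print(e10, e11, e12)
print(best)
```

On this invented data the search stops at 11, the length of the phone number, which is the same reasoning the article applies to its own table.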

We can also use count to check how often each prefix occurs (the fewer times each prefix appears, the lower the repetition rate and the higher the selectivity).

-- find the frequency of prefixes
select count(*) as cnt, left(email,11) as pref from employee group by pref order by cnt desc limit 10;
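The same frequency check can be sketched outside the database. The email values and the helper name top_prefixes below are assumptions for illustration; the function mirrors the group by, order by cnt desc, and limit steps of the query.

```python
from collections import Counter

# Hypothetical sample values; not the article's data.
emails = [
    "13800000001@qq.com",
    "13800000002@qq.com",
    "13800000013@qq.com",
    "13900000001@qq.com",
    "13900000002@qq.com",
]

def top_prefixes(values, length, limit=10):
    """Count rows per prefix and return the most frequent prefixes first."""
    return Counter(v[:length] for v in values).most_common(limit)

short = top_prefixes(emails, 3)   # few distinct prefixes, high counts
full = top_prefixes(emails, 11)   # one row per prefix

print(short)
print(full)
```

A prefix length is a good candidate once the highest count in the list drops to, or close to, the highest count seen for the full column.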

4. Use index scans for sorting

We often need sorted results and use order by, but order by can hurt performance. Without a usable index, the data is sorted by loading it into memory, and if the data set is too large to fit, it has to be processed in multiple passes. An index, however, is already ordered, so it is cheaper to produce sorted output by reading directly through the index.

Scanning the index itself is fast because you only need to move from one index record to the next. But if the index does not cover all the columns the query needs, MySQL has to go back to the table and fetch the corresponding row for each index record scanned, which is basically random IO. In that case, reading data in index order is usually slower than a sequential full table scan.

MySQL can use the same index both to find rows and to satisfy sorting. Consider designing such an index where you can.

MySQL can use an index to sort the results only if the order of the index columns exactly matches the order of the order by clause and all columns are sorted in the same direction (all descending or all ascending). If the query joins multiple tables, index sorting can be used only when every field in the order by clause belongs to the first table. The order by clause must also satisfy the leftmost prefix of the composite index; otherwise index sorting cannot be used.

In practice, there are two main points to watch in development:

The fields in the where condition and the fields in order by can together form a composite index, as long as they satisfy the leftmost prefix.

The sort direction of the fields in order by must be consistent: all asc or all desc, not a mixture of the two.
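These rules can be checked with a small sketch. It is an illustration only: Python's built-in sqlite3 module stands in for MySQL, and the rental table, its columns, and the index name are hypothetical. SQLite reports an extra "TEMP B-TREE" step when it has to sort separately; in MySQL the comparable signal is "Using filesort" in the Extra column of explain.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; table and index are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE rental ("
    "id INTEGER PRIMARY KEY, rental_date TEXT, "
    "inventory_id INTEGER, customer_id INTEGER)"
)
conn.execute("CREATE INDEX idx_date_inv ON rental (rental_date, inventory_id)")

def needs_sort(sql):
    """True if the planner adds a separate sorting step for this query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return any("TEMP B-TREE" in row[3] for row in rows)

cols = "SELECT rental_date, inventory_id FROM rental "

# order by follows the index columns in order and in one direction.
in_order = needs_sort(cols + "ORDER BY rental_date, inventory_id")

# where fixes the first index column, order by continues with the second.
where_plus_order = needs_sort(
    cols + "WHERE rental_date = '2005-05-25' ORDER BY inventory_id"
)

# order by skips the first index column, breaking the leftmost prefix.
skips_prefix = needs_sort(cols + "ORDER BY inventory_id")

# Mixed directions against an index whose columns are all ascending.
mixed_direction = needs_sort(
    cols + "ORDER BY rental_date ASC, inventory_id DESC"
)

print(in_order, where_plus_order, skips_prefix, mixed_direction)
```

One caveat for the direction rule: MySQL 8.0 supports descending index columns, so a mixed-direction order by can be served by an index declared with matching directions.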

5. union all, in, and or can all use indexes, but in is recommended.

As shown above, union all is executed as two queries, while in and or are each executed as one. We can also see that the execution plans of or and in are the same.

But let's take a look at their execution times. Turn profiling on with set profiling=1 and view the time of each statement with show profiles. The following figure shows that or took 0.00612000 while in took 0.00022800, which is still a very large gap (the test table has only 200 rows).

union all: the query is executed in two phases. There is also plain union. If you must use a union in development, union all is recommended, because union adds an extra distinct step to remove duplicates. So use union all whenever duplicates do not need to be removed.
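The sketch below illustrates the relationships described in this section: in, or, and union all return the same rows for the same conditions, and union differs from union all only by its deduplication step. It is an illustration under assumptions: Python's built-in sqlite3 module stands in for MySQL, the actor table is made up, and no timings are measured.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; the 200-row actor table is invented.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE actor (actor_id INTEGER PRIMARY KEY, first_name TEXT)")
conn.executemany(
    "INSERT INTO actor VALUES (?, ?)",
    [(i, f"name{i}") for i in range(1, 201)],
)

def ids(sql):
    """Run the query and return the selected ids as a sorted list."""
    return sorted(row[0] for row in conn.execute(sql))

in_rows = ids("SELECT actor_id FROM actor WHERE actor_id IN (1, 2)")
or_rows = ids("SELECT actor_id FROM actor WHERE actor_id = 1 OR actor_id = 2")
union_all_rows = ids(
    "SELECT actor_id FROM actor WHERE actor_id = 1 "
    "UNION ALL "
    "SELECT actor_id FROM actor WHERE actor_id = 2"
)

# Overlapping branches: both sides match actor_id = 2.
overlap = (
    "SELECT actor_id FROM actor WHERE actor_id <= 2 "
    "{op} "
    "SELECT actor_id FROM actor WHERE actor_id BETWEEN 2 AND 3"
)
kept = ids(overlap.format(op="UNION ALL"))  # duplicate row is kept
deduped = ids(overlap.format(op="UNION"))   # duplicate row is removed

print(in_rows, or_rows, union_all_rows)
print(kept, deduped)
```

When the branches cannot overlap, as in the first union all query, the deduplication done by union is pure overhead, which is the reason for preferring union all.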

6. Range conditions on a column can use an index

Range conditions: >, >=
