What are the rules for creating string indexes in MySQL

2025-04-26 Update From: SLTechnology News&Howtos

Shulou (Shulou.com) 05/31 Report

What are the rules for creating string indexes in MySQL? Many newcomers are unclear about this, so this article walks through the available options and their trade-offs in detail.

How to create a better string index

In MySQL (with the InnoDB engine), both table data and indexes are stored in B+ trees. The less space an index tree takes up, the more entries fit in each page and the faster lookups tend to be. Some varchar values are very long, however, so how can we index strings efficiently?

Suppose a table has an email field and we want to create an index on it. The values have the format zhangsan@qq.com.

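There are two ways to build the index:

1. Index the whole email field: alter table t add index index1 (email);

In this index tree, each entry stores the complete email value together with the primary key of its row.

2. Index only a prefix of email: alter table t add index index2 (email(6));

In this index tree, each entry stores only the first 6 characters of email together with the primary key, so the index is smaller but several different emails can share the same entry value.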
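As a concrete starting point, the two index definitions above can be tried on a small table. This is only a sketch: the columns id, name and email come from the query used in this article, but their types, lengths and the InnoDB engine are assumptions made for illustration.

```sql
-- Hypothetical definition of table t; column types and sizes are assumed.
CREATE TABLE t (
  id    BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
  name  VARCHAR(64)     NOT NULL,
  email VARCHAR(255)    NOT NULL,
  PRIMARY KEY (id)
) ENGINE=InnoDB;

-- Option 1: index the full column.
ALTER TABLE t ADD INDEX index1 (email);

-- Option 2: index only the first 6 characters of the column.
ALTER TABLE t ADD INDEX index2 (email(6));

-- The Sub_part column reports the prefix length:
-- NULL for a full-column index, the number of characters for a prefix index.
SHOW INDEX FROM t;
```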

Now consider the query: select id,name,email from t where email='zhangsh223@xxx.com';

With index1, MySQL performs the following steps:

1. Search the index1 tree for the entry whose value is zhangsh223@xxx.com and read its primary key value, ID1.

2. Go back to the table (the primary key index), fetch the row whose primary key is ID1, confirm that its email is zhangsh223@xxx.com, and add the row to the result set.

3. Move to the next entry in the index1 tree. If its value is no longer zhangsh223@xxx.com, the query ends; otherwise repeat step 2.

With index2, MySQL performs the following steps:

1. Search the index2 tree for the first entry whose value is zhangs and read its primary key value, ID1.

2. Go back to the table, fetch the row whose primary key is ID1, and check its email. If it is zhangsh223@xxx.com, add the row to the result set; otherwise discard it.

3. Move to the next entry in the index2 tree. If its value is still zhangs, go back to the table again and check the full email in the same way.

4. Repeat until the index value is no longer zhangs.
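To observe the two access paths described above, one option is to force each index in turn and inspect the execution plan. This is a sketch that assumes table t and the indexes index1 and index2 from this article already exist; the exact plan output depends on the MySQL version and the data.

```sql
-- Plan when the full-column index is used.
EXPLAIN
SELECT id, name, email
FROM t FORCE INDEX (index1)
WHERE email = 'zhangsh223@xxx.com';

-- Plan when the 6-character prefix index is used.
-- Compare the key, key_len and rows columns with the plan above.
EXPLAIN
SELECT id, name, email
FROM t FORCE INDEX (index2)
WHERE email = 'zhangsh223@xxx.com';
```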

From the analysis above, the full-field index needs fewer lookups back to the table than the prefix index, because with the prefix index every row whose email merely starts with zhangs has to be fetched and checked. If we lengthen the prefix from 6 to 7 or 8 characters, fewer rows share the same prefix and the number of table lookups drops. In other words, with a well-chosen prefix length we can save space without adding much query cost.
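The effect of the prefix length on a single lookup can be estimated directly. The sketch below counts how many rows share the first 6, 7 and 8 characters with the target value, which approximates how many rows a prefix index of that length would fetch from the table. It assumes table t from this article; the column aliases are made up for illustration.

```sql
-- Each comparison evaluates to 1 or 0, so SUM() counts the matching rows.
SELECT
  SUM(LEFT(email, 6) = LEFT('zhangsh223@xxx.com', 6)) AS candidates_len6,
  SUM(LEFT(email, 7) = LEFT('zhangsh223@xxx.com', 7)) AS candidates_len7,
  SUM(LEFT(email, 8) = LEFT('zhangsh223@xxx.com', 8)) AS candidates_len8,
  SUM(email = 'zhangsh223@xxx.com')                   AS exact_matches
FROM t;
```

The closer a candidates column is to exact_matches, the fewer wasted table lookups a prefix of that length causes for this value. The query scans the whole table, so it is meant for offline analysis.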

So the question is: how do we choose the prefix length?

1. Use select count(distinct email) as L from t; to count the distinct values of the field.

2. Count the distinct values for several candidate prefix lengths:

select count(distinct left(email,4)) as L4, count(distinct left(email,5)) as L5, count(distinct left(email,6)) as L6, count(distinct left(email,7)) as L7 from t;

Then pick the shortest prefix length whose distinct count is close enough to L, according to how much loss of distinctness is acceptable in practice.
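The two steps above can also be combined into a single query that reports each prefix length as a fraction of L. This is a sketch, assuming table t from this article; the ratio aliases are hypothetical, and any cut-off (for instance accepting the shortest prefix with a ratio of at least 0.95) is a choice to make per workload, not a rule stated here.

```sql
-- Ratio of distinct prefixes to distinct full values; 1.0 means no loss.
SELECT
  COUNT(DISTINCT email)                                  AS L,
  COUNT(DISTINCT LEFT(email, 4)) / COUNT(DISTINCT email) AS ratio4,
  COUNT(DISTINCT LEFT(email, 5)) / COUNT(DISTINCT email) AS ratio5,
  COUNT(DISTINCT LEFT(email, 6)) / COUNT(DISTINCT email) AS ratio6,
  COUNT(DISTINCT LEFT(email, 7)) / COUNT(DISTINCT email) AS ratio7
FROM t;
```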

That settles the prefix length, but prefix indexes have another cost: queries can no longer benefit from a covering index.

With a full-field index, the query select id,email from t where email='zhangsh223@xxx.com'; can be answered from index1 alone, because each index entry already contains both the email value and the primary key id. No lookup back to the table is needed.

With a prefix index, MySQL cannot tell from the index entry whether the stored prefix is the complete email value. Even if the prefix happens to contain the whole value, it still goes back to the table by primary key to check.

So although a prefix index saves space while keeping lookups efficient, it cannot act as a covering index. Whether to use one depends on the workload.
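The loss of the covering-index optimization can be checked with EXPLAIN. This is a sketch, assuming table t with index1 and index2 as defined in this article. In the Extra column, "Using index" indicates that the query is answered from the index alone; one would expect it for index1 and not for index2.

```sql
-- Only id and email are selected, both of which are stored in index1.
EXPLAIN
SELECT id, email
FROM t FORCE INDEX (index1)
WHERE email = 'zhangsh223@xxx.com';

-- Same query through the prefix index; check whether Extra still
-- reports "Using index" or only a filter on the fetched rows.
EXPLAIN
SELECT id, email
FROM t FORCE INDEX (index2)
WHERE email = 'zhangsh223@xxx.com';
```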

Other string index creation methods

Not every string can be indexed well by taking a prefix. ID card numbers and IP strings are two examples. The first few digits of an ID card number are usually identical for people from the same region, so a prefix index on it distinguishes rows poorly. IP values are generally converted to numbers for storage in practice.

For the ID card number, we can store the value in reverse order and create a prefix index on it, or use the crc32() function to compute a checksum (an integer value) and index that instead.

Reverse order: select field_list from t where id_card = reverse('input_id_card_string');

Crc32: select field_list from t where id_card_crc = crc32('input_id_card_string') and id_card = 'input_id_card_string';

The two methods are similar in efficiency. Neither supports range queries, but both support equality lookups.

The reverse-order method needs a call to the reverse function whenever the value is written or queried, and because it is still a prefix index it may go back to the table more often than the hash method.
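A minimal sketch of the reverse-order method follows. The table name t_id_reversed, the column sizes, the prefix length of 6 and the sample ID card number are all made up for illustration; the article itself only specifies the id_card column and the reverse function.

```sql
-- Hypothetical table: id_card holds the number in reverse order.
CREATE TABLE t_id_reversed (
  id      BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
  id_card VARCHAR(18)     NOT NULL,
  PRIMARY KEY (id),
  KEY idx_id_card_prefix (id_card(6))   -- prefix index on the reversed value
) ENGINE=InnoDB;

-- Reverse the value on write (sample number is fictitious).
INSERT INTO t_id_reversed (id_card)
VALUES (REVERSE('110101199003071234'));

-- Reverse the search value on read, and reverse the stored value for display.
SELECT id, REVERSE(id_card) AS id_card_original
FROM t_id_reversed
WHERE id_card = REVERSE('110101199003071234');
```

Because the trailing digits of the number vary far more than the leading region digits, the first characters of the reversed string tend to distinguish rows better.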

The hash method needs an extra indexed field and a call to the crc32() function. The value returned by crc32() is not guaranteed to be unique, so two different ID card numbers can occasionally collide, which is why the query also compares id_card itself; the probability of a collision is small, though. The number of lookups back to the table is lower, close to one per query.
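A minimal sketch of the hash method follows. The table name t_id_hashed, the column sizes and the sample ID card number are assumptions for illustration; id_card and id_card_crc are the column names used in this article's query.

```sql
-- Hypothetical table: id_card_crc stores CRC32(id_card) and carries the index.
CREATE TABLE t_id_hashed (
  id          BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
  id_card     VARCHAR(18)     NOT NULL,
  id_card_crc INT UNSIGNED    NOT NULL,  -- CRC32() returns a 32-bit unsigned value
  PRIMARY KEY (id),
  KEY idx_id_card_crc (id_card_crc)
) ENGINE=InnoDB;

-- Compute the checksum on write (sample number is fictitious).
INSERT INTO t_id_hashed (id_card, id_card_crc)
VALUES ('110101199003071234', CRC32('110101199003071234'));

-- The indexed checksum narrows the search; the second condition
-- filters out rows that merely collide on the checksum.
SELECT id, id_card
FROM t_id_hashed
WHERE id_card_crc = CRC32('110101199003071234')
  AND id_card = '110101199003071234';
```

The application has to keep id_card_crc in sync whenever id_card changes, which is the price of the smaller integer index.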

Summary

Generally speaking, there are several ways to create a string index:

1. The string is short: index the whole field directly.

2. The string is long and its prefix distinguishes values well: create a prefix index.

3. The string is long and its prefix distinguishes values poorly: store it in reverse order or index a hash of it (neither supports range queries).

4. Special strings get special treatment depending on the actual situation, for example IP addresses.
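For the IP case in item 4, the conversion to a number mentioned earlier can be done with MySQL's built-in INET_ATON() and INET_NTOA() functions, which handle IPv4 addresses. The sketch below uses a hypothetical table t_access; its name and columns are assumptions, not part of the original example.

```sql
-- Hypothetical table: the IPv4 address is stored as an unsigned integer.
CREATE TABLE t_access (
  id BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
  ip INT UNSIGNED    NOT NULL,
  PRIMARY KEY (id),
  KEY idx_ip (ip)
) ENGINE=InnoDB;

-- Convert the dotted string to a number on write.
INSERT INTO t_access (ip) VALUES (INET_ATON('192.168.1.10'));

-- Equality lookup; INET_NTOA() converts back to dotted form for display.
SELECT id, INET_NTOA(ip) AS ip_text
FROM t_access
WHERE ip = INET_ATON('192.168.1.10');

-- Unlike the reverse-order and hash methods, the numeric form
-- still supports range queries, here over one /24 network.
SELECT id, INET_NTOA(ip) AS ip_text
FROM t_access
WHERE ip BETWEEN INET_ATON('192.168.1.0') AND INET_ATON('192.168.1.255');
```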
