2025-02-24 Update From: SLTechnology News&Howtos shulou
This article walks through the common scenarios in which a MySQL index fails to be used, with an example for each one.
Database and index preparation
Create a table structure
To verify the usage of the index item by item, we first prepare a table t_user:
CREATE TABLE `t_user` (
  `id` int(11) unsigned NOT NULL AUTO_INCREMENT COMMENT 'ID',
  `id_no` varchar(18) CHARACTER SET utf8mb4 COLLATE utf8mb4_bin DEFAULT NULL COMMENT 'ID number',
  `username` varchar(32) CHARACTER SET utf8mb4 COLLATE utf8mb4_bin DEFAULT NULL COMMENT 'username',
  `age` int(11) DEFAULT NULL COMMENT 'age',
  `create_time` datetime DEFAULT CURRENT_TIMESTAMP COMMENT 'creation time',
  PRIMARY KEY (`id`),
  KEY `union_idx` (`id_no`, `username`, `age`),
  KEY `create_time_idx` (`create_time`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_bin;
There are three indexes in the above table structure:
id: the primary key of the table
union_idx: a composite index on id_no, username, and age
create_time_idx: an ordinary index on create_time
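The index list above can be checked against the live table. The following is a minimal sketch, assuming the t_user table has been created as described; it only reads metadata and changes nothing.

```sql
-- List every index defined on t_user.
show index from t_user;

-- Narrow the output to the composite index; the Seq_in_index column
-- shows the position of each column inside the index (1 = leftmost).
show index from t_user where Key_name = 'union_idx';
```

The column order reported by Seq_in_index is what the leftmost matching principle in the later sections depends on.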
Initialization data
Initialization data is divided into two parts: basic data and bulk import data.
The basic data consists of 4 rows. The fourth row has a create_time in the future, which is used later to verify some special scenarios:
INSERT INTO `t_user` (`id`, `id_no`, `username`, `age`, `create_time`) VALUES (null, '1001', 'Tom1', 11, '2022-02-27 09:04:23');
INSERT INTO `t_user` (`id`, `id_no`, `username`, `age`, `create_time`) VALUES (null, '1002', 'Tom2', 12, '2022-02-26 09:04:23');
INSERT INTO `t_user` (`id`, `id_no`, `username`, `age`, `create_time`) VALUES (null, '1003', 'Tom3', 13, '2022-02-25 09:04:23');
INSERT INTO `t_user` (`id`, `id_no`, `username`, `age`, `create_time`) VALUES (null, '1004', 'Tom4', 14, '2023-02-25 09:04:23');
In addition to the basic data, a stored procedure and the statement that calls it make it easy to bulk-insert rows for scenarios that need a larger data set:
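-- Drop any previous version of the stored procedure
DROP PROCEDURE IF EXISTS `insert_t_user`;
-- Create the stored procedure
DELIMITER $
CREATE PROCEDURE insert_t_user(IN limit_num int)
BEGIN
  DECLARE i INT DEFAULT 10;
  DECLARE id_no varchar(18);
  DECLARE username varchar(32);
  DECLARE age TINYINT DEFAULT 1;
  WHILE i < limit_num DO
    SET id_no = CONCAT('NO', i);
    SET username = CONCAT('Tom', i);
    SET age = FLOOR(10 + RAND() * 2);
    INSERT INTO `t_user` VALUES (NULL, id_no, username, age, NOW());
    SET i = i + 1;
  END WHILE;
END $
DELIMITER ;
-- Call the stored procedure
CALL insert_t_user(100);
You do not need to create and call the stored procedure yet; run it when it is needed.
Database version and execution plan
Check the current database version:
select version();
8.0.18
The examples in this article were tested on version 8.0.18. All of them can of course be run and verified on other versions.
To inspect the execution plan of a SQL statement we generally use the explain keyword and judge index usage from its output.
Example:
explain select * from t_user where id = 1;
Explain results:
The statement uses the primary key index (PRIMARY), and key_len is 4.
key_len is the number of bytes of the index that are used. This value shows how the index is being used; with a composite index in particular, it is important for telling how many parts of the index are used.
With the data and background in place, we can now turn to the concrete index failure cases.
1 The composite index does not satisfy the leftmost matching principle
A composite index follows the leftmost matching principle: as the name suggests, the leftmost column of the index is matched first. So when creating a composite index, put the column used most often in where clauses in the leftmost position.
For a query to use the index, the leftmost column must appear in the query conditions.
In our example, the composite index union_idx is defined as:
KEY `union_idx` (`id_no`, `username`, `age`)
The leftmost column is id_no. In general, as long as id_no appears in the query conditions, the composite index is used.
Example 1:
explain select * from t_user where id_no = '1002';
Explain results:
The explain output shows that the statement uses the union_idx index.
A note on how key_len is calculated:
id_no is of type varchar(18) with character set utf8mb4 (collation utf8mb4_bin), which uses up to 4 bytes per character. At this point, key_len = 18 * 4 = 72.
Because varchar is a variable-length type, 2 more bytes are added. At this point, key_len = 72 + 2 = 74.
Because the column allows NULL (default NULL), 1 more byte is added. At this point, key_len = 74 + 1 = 75.
This shows the key_len calculation for one case. Later examples do not repeat the derivation; knowing the components and the principle is enough, and other cases can be worked out in the same way.
Example 2:
explain select * from t_user where id_no = '1002' and username = 'Tom2';
Explain results:
Clearly the union_idx index is still used. From the key_len analysis above, we can infer that the query uses not only the id_no column of the index but also the username column.
Example 3:
explain select * from t_user where id_no = '1002' and age = 12;
Explain results:
The union_idx index is used, but as in Example 1, only the id_no column is used.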
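The key_len arithmetic described above can be reproduced as a plain query. This is a sketch under two assumptions: that a nullable int column contributes 4 bytes plus 1 byte for the NULL flag (the article only works through the varchar case), and that the index parts are simply summed.

```sql
-- key_len contribution per index part, following the rules above:
--   varchar(n), utf8mb4, nullable: n * 4 + 2 (length prefix) + 1 (NULL flag)
--   int, nullable (assumption):    4 + 1 (NULL flag)
select
  18 * 4 + 2 + 1                                 as id_no_only,
  (18 * 4 + 2 + 1) + (32 * 4 + 2 + 1)            as id_no_and_username,
  (18 * 4 + 2 + 1) + (32 * 4 + 2 + 1) + (4 + 1)  as all_three_columns;
-- returns 75, 206, 211
```

Comparing the key_len reported by explain with these three values indicates how many parts of union_idx a query uses.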
A query whose conditions include all three columns also uses the index, so no separate example is given for it. All of the above are positive examples: they satisfy the leftmost matching principle and use the index. Now let's look at counterexamples that do not.
Counterexample:
Explain select * from t_user where username = 'Tom2' and age = 12
Explain results:
Here no index is used, that is, the index fails.
Likewise, the index fails for any combination of conditions that does not include the leftmost column:
explain select * from t_user where age = 12;
explain select * from t_user where username = 'Tom2';
So the first index failure scenario is: with a composite index, the query conditions do not satisfy the leftmost matching principle.
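Instead of working out key_len by hand, the used index parts can be read from the JSON form of the execution plan. The following is a minimal sketch, assuming MySQL 5.7 or later, where the JSON output lists the columns in a used_key_parts array when an index is chosen; the exact output depends on the data and the optimizer.

```sql
-- Positive example: the plan is expected to name union_idx and list
-- id_no and username under "used_key_parts".
explain format=json
select * from t_user where id_no = '1002' and username = 'Tom2';

-- Counterexample: the leftmost column id_no is missing from the conditions,
-- so no "used_key_parts" entry for union_idx is expected.
explain format=json
select * from t_user where username = 'Tom2' and age = 12;
```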
2 Using select *
There is a mandatory specification in the ORM mapping section of the Alibaba Development Manual:
[Mandatory] In table queries, never use * as the field list of the query; the required fields must be listed explicitly. Notes: 1) It increases the parsing cost for the query analyzer. 2) Adding or removing fields easily becomes inconsistent with the resultMap configuration. 3) Unneeded fields increase network consumption, especially fields of text type.
Although the manual does not mention indexes, a side benefit of banning select * is that a covering index can be used in some cases.
For example, with the composite index above, if the query condition is age or username and select * is used, the index is not used.
However, if the query only needs id_no, username and age (all of them index columns), then listing those columns explicitly allows the query to be served by a covering index, even when the condition is on username or age:
explain select id_no, username, age from t_user where username = 'Tom2';
explain select id_no, username, age from t_user where age = 12;
Explain results:
Whether the query condition is username or age, the index is used, and key_len shows that all columns of the index are used.
The second index failure scenario: with a composite index, prefer an explicit column list so that the query can be served by a covering index.
Strictly speaking, this case is an optimization rather than a failure: when the business scenario allows it, selecting only indexed columns lets the SQL statement use the index. The rule in Alibaba's development manual simply happens to coincide with it; the rule itself was not written with this index behavior in mind.
3 Index column takes part in a calculation
Let's look directly at the example:
Explain select * from t_user where id + 1 = 2
Explain results:
As you can see, even though the id column has an index, the index cannot be used because the column takes part in a calculation.
This is not only an index problem; it also adds computational load on the database. For the statement above, the database has to read the id value of every row in the table, apply the calculation, and then compare the result with the parameter. Repeating these steps on every execution makes the performance cost easy to imagine.
The recommended approach is to compute the expected value in application memory first, or to move the calculation to the parameter side, on the right of the condition.
The optimizations for the above example are as follows:
-- Computed in application memory: the id to query is 1
explain select * from t_user where id = 1;
-- Computed on the parameter side
explain select * from t_user where id = 2 - 1;
The third index failure scenario: the index column takes part in a calculation, which leads to a full table scan and index failure.
4 Function applied to an index column
Example:
explain select * from t_user where SUBSTR(id_no, 1, 3) = '100';
Explain results:
In the example above, a function (SUBSTR, which extracts a substring) is applied to the index column, so the index fails.
The reason is the same as in the third case: the database has to scan the full table and apply the function to each row before it can compare, so the index cannot be used, and performance suffers as well.
The example uses only SUBSTR, but CONCAT and similar functions behave the same way. The solution follows the third scenario: do the processing in application memory or in some other way, so that the database has less to compute.
The fourth index failure scenario: a function is applied to the index column, which leads to a full table scan and index failure.
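When the function cannot be moved out of the query, another option is to index the expression itself. The following is a sketch only: it assumes MySQL 8.0.13 or later (functional key parts do not exist in earlier versions), and id_no_prefix_idx is a hypothetical index name that is not part of the article's schema.

```sql
-- Index the result of the expression rather than the raw column.
-- Note the double parentheses required around the expression.
alter table t_user add index id_no_prefix_idx ((substr(id_no, 1, 3)));

-- The where clause must use the same expression as the index definition
-- for the optimizer to consider it.
explain select * from t_user where substr(id_no, 1, 3) = '100';
```

Whether the optimizer actually picks the new index should be confirmed from the explain output, and the extra index adds write cost, so it is worth it only for expressions that are queried often.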
5 Incorrect use of like
Example:
Explain select * from t_user where id_no like'%'
Explain results:
like is used very frequently, but improper use often causes the index not to be used. The common forms of like are:
Method 1: like '%abc'
Method 2: like 'abc%'
Method 3: like '%abc%'
Of these, methods 1 and 3 cannot use the index because the wildcard appears at the start of the pattern. The reason is easy to understand: an index works like a directory whose entries are sorted by comparing characters from left to right. A wildcard at the left end of the pattern leaves nothing to look up in that sorted order, so it is natural that the index fails.
The fifth index failure scenario: in a fuzzy query (like statement), the wildcard is at the start of the pattern.
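The three forms can be compared directly on the example table. This is a minimal sketch; the pattern value '100' is chosen here only because the basic data contains id_no values such as '1001', and the actual plans depend on the data volume and the optimizer.

```sql
-- Method 2: trailing wildcard. The fixed prefix '100' can be located in
-- the sorted index, so a range lookup on union_idx is possible.
explain select * from t_user where id_no like '100%';

-- Methods 1 and 3: leading wildcard. There is no fixed prefix to look up,
-- so an index range lookup is not possible.
explain select * from t_user where id_no like '%100';
explain select * from t_user where id_no like '%100%';
```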
6 Implicit type conversion
Example:
Explain select * from t_user where id_no = 1002
Explain results:
The id_no column is of type varchar, but the SQL statement compares it with an int value, which results in a full table scan.
The index fails because varchar and int are different types: to compare them, MySQL has to convert the string value of every row to a number, so the sorted order of the strings in the index cannot be used for the lookup.
The solution is to enclose the parameter 1002 in single or double quotes.
The sixth index failure scenario: the parameter type does not match the column type, which causes an implicit type conversion and index failure.
There is one special case. If the column type is int and the query condition is written in single or double quotes, MySQL converts the parameter to int even though it is quoted:
Explain select * from t_user where id ='2'
The statement above still uses the index.
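Why the conversion direction matters can be seen from how MySQL compares strings with numbers. The sketch below uses literal values only, so it does not depend on the table; the string values are made up for illustration.

```sql
-- Several different strings all compare equal to the number 1002, because
-- each string is converted to a number before the comparison.
select '1002'    = 1002 as plain,
       '01002'   = 1002 as leading_zero,
       '1002.0'  = 1002 as decimal_form,
       '1002abc' = 1002 as trailing_text;
-- each column returns 1 (the last one also raises a truncation warning)
```

Because many distinct strings map to the same number, a varchar index cannot be searched for a numeric value, whereas a quoted parameter compared with an int column converts to exactly one number and can still be looked up in the index.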
7 Using OR
OR is one of the most frequently used keywords in everyday SQL, but improper use can also cause index failure.
Example:
Explain select * from t_user where id = 2 or username = 'Tom2'
Explain results:
Is the result surprising? The id column clearly has an index, yet the index is not used because of the or keyword.
Look at it from another angle: used alone as a condition, the username column obviously requires a full table scan, since it is not the leftmost column of any index. Once the whole table is being scanned anyway, looking up the id condition through the index as well would be wasted work. So when using or, remember that both conditions need a usable index; otherwise the index fails.
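The advice above can be sketched as follows. username_idx is a hypothetical index name that is not part of the article's schema, and whether the optimizer then merges the two indexes or still prefers a scan depends on the data, so the result should be confirmed with explain.

```sql
-- Give the second condition its own index (hypothetical name).
alter table t_user add index username_idx (username);

-- With both sides indexed, the optimizer can consider combining
-- the primary key and username_idx for the or condition.
explain select * from t_user where id = 2 or username = 'Tom2';

-- Alternative rewrite: one indexed lookup per branch. union (not union all)
-- removes the duplicate when a row matches both conditions.
explain
select * from t_user where id = 2
union
select * from t_user where username = 'Tom2';
```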
But if both sides of the or use ">" and "and"