How to understand MySQL Index cardinality

2025-02-23 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)05/31 Report--

This article focuses on how to understand MySQL index cardinality: what the Cardinality value reported for an index means, how InnoDB estimates it, and how it relates to index selectivity.

View the indexes of a table:

mysql> show index from rank_item;
+-----------+------------+--------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table     | Non_unique | Key_name           | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+-----------+------------+--------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| rank_item |          0 | PRIMARY            |            1 | id          | A         |     5665508 |     NULL | NULL   |      | BTREE      |         |               |
| rank_item |          1 | idx_city_category  |            1 | city        | A         |        2713 |     NULL | NULL   |      | BTREE      |         |               |
| rank_item |          1 | idx_city_category  |            2 | category    | A         |        3798 |     NULL | NULL   | YES  | BTREE      |         |               |
| rank_item |          1 | idx_artisan_id     |            1 | artisan_id  | A         |       33916 |     NULL | NULL   | YES  | BTREE      |         |               |
| rank_item |          1 | index_weight       |            1 | weight      | A         |       11680 |     NULL | NULL   | YES  | BTREE      |         |               |
| rank_item |          1 | product_id_plan_id |            1 | product_id  | A         |     1480432 |     NULL | NULL   |      | BTREE      |         |               |
| rank_item |          1 | product_id_plan_id |            2 | plan_id     | A         |     5590288 |     NULL | NULL   | YES  | BTREE      |         |               |
| rank_item |          1 | idx_cat_ci_art     |            1 | category    | A         |        3170 |     NULL | NULL   | YES  | BTREE      |         |               |
| rank_item |          1 | idx_cat_ci_art     |            2 | city        | A         |       11417 |     NULL | NULL   |      | BTREE      |         |               |
| rank_item |          1 | idx_cat_ci_art     |            3 | artisan_id  | A         |       46514 |     NULL | NULL   | YES  | BTREE      |         |               |
| rank_item |          1 | idx_ca_ci_pid_wei  |            1 | category    | A         |        3187 |     NULL | NULL   | YES  | BTREE      |         |               |
| rank_item |          1 | idx_ca_ci_pid_wei  |            2 | city        | A         |       10869 |     NULL | NULL   |      | BTREE      |         |               |
| rank_item |          1 | idx_ca_ci_pid_wei  |            3 | plan_id     | A         |       17403 |     NULL | NULL   | YES  | BTREE      |         |               |
| rank_item |          1 | idx_ca_ci_pid_wei  |            4 | weight      | A         |      659306 |     NULL | NULL   | YES  | BTREE      |         |               |
+-----------+------------+--------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+

One attribute in this output, Cardinality, can be used to evaluate whether an index is reasonable. It is an estimate of the number of distinct (non-duplicate) values in the index. If it is small relative to the number of rows in the table, it is worth evaluating whether the index is meaningful.

View the total number of rows in the table:

mysql> select count(*) as total from rank_item;
+---------+
| total   |
+---------+
| 5581872 |
+---------+

Observe the following information:

id column: Cardinality/total = 5608506/5581872 = 1.005

city column: Cardinality/total = 2713/5581872 = 0.000486

category column: Cardinality/total = 3170/5581872 = 0.000568

Because id is the primary key, Cardinality/total is close to 1 (it can be slightly above 1 because Cardinality is an estimate), while for the other two indexed columns the ratio is close to 0. We call this ratio, Cardinality/total, the relative value.
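The relative values are simple divisions and can be rechecked with a few lines of Python. This is only an arithmetic sketch using the figures quoted in this article; the variable names are illustrative.

```python
# Figures quoted in this article: Cardinality is an estimate, total is count(*).
total_rows = 5581872
cardinality = {"id": 5608506, "city": 2713, "category": 3170}

# Relative value = Cardinality / total rows.
# A value near 1 means almost every row has its own distinct key.
relative = {col: card / total_rows for col, card in cardinality.items()}

for col, ratio in relative.items():
    print(f"{col}: {ratio:.6f}")
```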

As a rough guess from the figures above, queries on the id column should be fast, while queries on the other two columns should be slow. Now let's look at the corresponding execution plans.

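mysql> explain select * from rank_item where id=2419;
+----+-------------+-----------+------------+-------+---------------+---------+---------+-------+------+----------+-------+
| id | select_type | table     | partitions | type  | possible_keys | key     | key_len | ref   | rows | filtered | Extra |
+----+-------------+-----------+------------+-------+---------------+---------+---------+-------+------+----------+-------+
|  1 | SIMPLE      | rank_item | NULL       | const | PRIMARY       | PRIMARY | 4       | const |    1 |   100.00 | NULL  |
+----+-------------+-----------+------------+-------+---------------+---------+---------+-------+------+----------+-------+
1 row in set, 1 warning (0.00 sec)

mysql> explain select * from rank_item where city=4967;
+----+-------------+-----------+------------+------+-------------------+-------------------+---------+-------+--------+----------+-------+
| id | select_type | table     | partitions | type | possible_keys     | key               | key_len | ref   | rows   | filtered | Extra |
+----+-------------+-----------+------------+------+-------------------+-------------------+---------+-------+--------+----------+-------+
|  1 | SIMPLE      | rank_item | NULL       | ref  | idx_city_category | idx_city_category | 4       | const | 556680 |   100.00 | NULL  |
+----+-------------+-----------+------------+------+-------------------+-------------------+---------+-------+--------+----------+-------+
1 row in set, 1 warning (0.04 sec)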

However, both queries turn out to use an index, and ref is const in both. Is that because Cardinality is inaccurate? Yes: it is only an estimate.

How is cardinality estimated?

As mentioned above, cardinality is an estimate of the number of non-duplicate records in an index. So how is it computed? Since MySQL's B+ tree indexes are implemented differently in each storage engine, cardinality is computed at the storage engine level.

For InnoDB, cardinality is recalculated when either of the following two conditions is met:

1. 1/16 of the data in the table has changed

2. stat_modified_counter > 2,000,000,000

Why these conditions? In a real environment, index updates can be very frequent, because rows are inserted, updated, and deleted all the time, and recomputing cardinality on every change would be a heavy burden. For a large table, a full count could also be very time-consuming. For these reasons, cardinality is recomputed only when one of the two conditions above is met, and it is computed by sampling.

What is the difference between the two conditions?

The first condition covers the case where the amount of data in the table changes. The second covers a special case: if the same rows (say, a thousand of them) are updated frequently while the amount of data does not grow, the first condition never applies. So stat_modified_counter records the number of changes, and when it exceeds 2,000,000,000 the statistics are updated as well.
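The two conditions can be written as a small predicate. This is a simplified model of the rule as this article states it, not InnoDB's actual code; apart from stat_modified_counter, which the article mentions, the function and parameter names are hypothetical, and treating "data changed" as a count of added rows is an assumption.

```python
# Simplified model of the two recalculation conditions described in the text.
# Only stat_modified_counter is a name taken from the article; the rest is illustrative.
MODIFIED_COUNTER_LIMIT = 2_000_000_000


def needs_recalculation(total_rows: int, rows_changed: int, stat_modified_counter: int) -> bool:
    """Return True if either condition from the text is met."""
    one_sixteenth_changed = rows_changed > total_rows / 16
    counter_exceeded = stat_modified_counter > MODIFIED_COUNTER_LIMIT
    return one_sixteenth_changed or counter_exceeded


# A growing table: more than 1/16 of the rows changed, so condition 1 applies.
print(needs_recalculation(1_600, 101, 101))
# A hot table: the same rows are rewritten repeatedly, so only the counter grows.
print(needs_recalculation(1_000, 0, 2_000_000_001))
# Neither condition is met yet.
print(needs_recalculation(1_000, 0, 500))
```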

How exactly is the sampling done?

1. Get the number of leaf nodes (pages) in the B+ tree index, denoted A.

2. Randomly pick 8 leaf nodes of the B+ tree index and count the number of distinct records on each page, denoted P1, P2, ..., P8.

3. Calculate cardinality = (P1 + P2 + ... + P8) * A / 8

This gives the estimated number of distinct records in the index. From the steps above, two problems are apparent.

1. Because the sampling is random, two consecutive runs may produce different values. The values will be identical only when the table is very small, with no more than 8 leaf nodes, so that every run samples the same pages.

2. Because the statistics are refreshed only when one of the two conditions above is met, after the system has been running for a while the data may have changed a lot while the statistics have not. The statistics then deviate considerably from reality, and the efficiency of the index decreases.
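The sampling steps, and the first problem in particular, can be illustrated with a toy simulation. This is a sketch under stated assumptions, not InnoDB's implementation: each leaf page is modeled as a plain list of key values, estimate_cardinality is a hypothetical helper, and when the index has fewer than 8 pages the sketch samples all of them and divides by that smaller count.

```python
import random

SAMPLE_PAGES = 8  # number of leaf pages sampled, as in the steps described in the text


def estimate_cardinality(leaf_pages, rng):
    """Estimate distinct keys as (P1 + ... + Pn) * A / n, with n = min(8, A)."""
    page_count = len(leaf_pages)              # A: number of leaf pages
    if page_count == 0:
        return 0
    n = min(SAMPLE_PAGES, page_count)
    sampled = rng.sample(leaf_pages, n)       # pick n leaf pages at random
    distinct_per_page = [len(set(page)) for page in sampled]  # P1..Pn
    return round(sum(distinct_per_page) * page_count / n)


# Toy index with 100 leaf pages of 10 keys each: the first 50 pages hold one
# repeated key per page, the last 50 pages hold 10 distinct keys per page.
pages = [[i] * 10 for i in range(50)]
pages += [list(range(1000 + i * 10, 1010 + i * 10)) for i in range(50)]

rng = random.Random()
print(estimate_cardinality(pages, rng))  # depends on which pages were sampled
print(estimate_cardinality(pages, rng))  # may differ from the previous run

# With at most 8 leaf pages every page is sampled, so the result is stable.
small = pages[:4] + pages[-4:]
print(estimate_cardinality(small, random.Random()))
```

On the skewed toy index the estimate depends heavily on which pages happen to be drawn, which is the run-to-run variation described in problem 1.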

How should problem 2 be handled?

Update statistics manually

If the system has been running for a while, we can recalculate the cardinality values by executing the following SQL:

mysql> analyze table tablename;

However, if the table is large, re-counting can be very time-consuming. For core tables, it is recommended to operate during off-peak hours.

Selectivity

Now back to the earlier example. The execution plans showed that the index is used regardless of how large or small the cardinality and the relative value are. So why is it not recommended to index columns with very small relative values? This is a question of selectivity.

For example, suppose there is a user table with a sex column, and you want to query all male users (assuming only two values, male and female, and no unknown genders). A possible SQL statement:

select * from user where sex = 'M';

For this statement, even though there is an index on sex, execution may read more than half of the table, and in extreme cases (such as a website for programmers) it reads most of it. Going through the index is then no better than a full table scan. This is called low selectivity. Conversely, if a column is highly selective, building an index on it is recommended, for example a user name column in the user table, whose values are rarely repeated.
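The difference between low and high selectivity can be made concrete by measuring how much of a table an equality condition has to read. This is a minimal sketch on a hypothetical 10-row table; fraction_read and the sample data are illustrative assumptions, not part of MySQL.

```python
def fraction_read(column_values, wanted):
    """Fraction of rows that an equality condition on this column matches."""
    if not column_values:
        return 0.0
    matches = sum(1 for value in column_values if value == wanted)
    return matches / len(column_values)


# Hypothetical 10-row user table: 7 men, 3 women, and unique user names.
sex = ["M"] * 7 + ["F"] * 3
names = [f"user{i}" for i in range(10)]

print(fraction_read(sex, "M"))        # low selectivity: most rows match
print(fraction_read(names, "user3"))  # high selectivity: a single row matches
```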
