What is the principle of indexing in MySQL?

2025-07-06 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)05/31 Report--

This article explains the principles of indexing in MySQL: what an index is for, how it narrows a search, and how to apply it when optimizing slow queries.

Index purpose

The purpose of an index is to improve query efficiency, much like the ordering of a dictionary. To look up the word "mysql", we first locate the letter m, then move down to find the letter y, and then find the remaining sql. Without an index, you might have to look through every word to find the one you want. What if I want the words that start with m? Or the words that start with ze? Without an index, that would feel almost impossible.

Index principle

Besides dictionaries, examples of indexes are everywhere in daily life, such as the train timetable at a railway station or the table of contents of a book. They all work on the same principle: filter out the final result by repeatedly narrowing the range of data to search, and turn random events into sequential ones, so that we always locate data in the same way.
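The dictionary analogy can be sketched in a few lines of Python. This is only an illustration of "narrowing the range" on sorted data, not how MySQL stores an index; the word list and the function names `prefix_range` and `prefix_scan` are made up for the example.

```python
from bisect import bisect_left


def prefix_range(sorted_words, prefix):
    """Return the words starting with `prefix` from an already sorted list.

    Two binary searches narrow the list to one contiguous slice,
    instead of comparing the prefix against every word.
    """
    lo = bisect_left(sorted_words, prefix)
    # chr(0x10FFFF) is the largest code point, so prefix + that character
    # sorts after every ordinary word that starts with the prefix.
    hi = bisect_left(sorted_words, prefix + chr(0x10FFFF))
    return sorted_words[lo:hi]


def prefix_scan(words, prefix):
    """The 'no index' equivalent: look at every single word."""
    return [w for w in words if w.startswith(prefix)]


# Hypothetical word list, kept sorted like the pages of a dictionary.
words = sorted(["apple", "mango", "mysql", "moon", "zebra", "zero", "zinc", "yak"])

print(prefix_range(words, "m"))   # words that start with m
print(prefix_range(words, "ze"))  # words that start with ze
```

Both functions return the same words, but `prefix_range` only works because the list is kept in order, which is the property an index maintains for the database.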

A database works the same way, but it is obviously much more complicated, because it has to handle not only equality queries but also range queries (such as > and <) and other kinds of conditions. When building indexes, keep the following principles in mind.

1. Leftmost-prefix matching. MySQL keeps matching index columns from left to right until it meets a range condition, and stops there. For example, for a = 1 and b = 2 and c > 3 and d = 4, an index built in the order (a, b, c, d) cannot use d to narrow the search, whereas an index built as (a, b, d, c) can use all four columns, and the order of a, b, and d can be adjusted at will.

2. = and in can appear in any order. For example, for a = 1 and b = 2 and c = 3, an index on (a, b, c) can be built in any column order, and MySQL's query optimizer will rewrite the conditions into a form the index can use.

3. Choose highly selective columns for indexes. Selectivity is count(distinct col) / count(*), the proportion of distinct values in the column. The larger the proportion, the fewer records we scan. A unique key has a selectivity of 1, while status or gender fields may have a selectivity close to 0 on large data sets. Is there a rule of thumb for this ratio? It depends on the scenario, so a fixed value is hard to give. Generally we require more than 0.1 for columns used in joins, that is, about 10 records scanned on average per lookup.

4. Keep indexed columns out of calculations, so that the column stays "clean". For example, from_unixtime(create_time) = '2014-05-29' cannot use the index. The reason is simple: the B+ tree stores the column values from the table, so to evaluate the condition the function would have to be applied to every element before comparing, which is obviously too costly. The statement should instead be written as create_time = unix_timestamp('2014-05-29').

5. Extend existing indexes where possible rather than creating new ones. For example, if the table already has an index on a and you now want an index on (a, b), you only need to modify the original index.

Back to the initial slow query

According to the leftmost-matching principle, the index for the initial SQL statement should be a composite index on status, operator_id, type, and operate_time, in which the order of status, operator_id, and type can be interchanged. That is why I said that all queries touching this table should be found and analyzed together. For example, consider the following queries:

select * from task where status = 0 and type = 12 limit 10;
select count(*) from task where status = 0;

In that case the index (status, type, operator_id, operate_time) is the right choice, because it covers all of these cases. This is the leftmost-matching principle of indexes at work.
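Why this column order covers every query can be checked with a small Python sketch. It is a deliberately simplified model of leftmost-prefix matching and not what MySQL's optimizer actually runs; the function name `usable_prefix` is hypothetical. The shape assumed for the initial query (equality on status, type, and operator_id, a range on operate_time) is also an assumption.

```python
def usable_prefix(index_columns, equality_columns, range_columns=()):
    """Return the leading index columns that can narrow the search.

    Simplified model: walk the index from left to right, continue through
    equality conditions, stop after the first range condition, and stop at
    the first column that has no condition at all.
    """
    used = []
    for col in index_columns:
        if col in equality_columns:
            used.append(col)
        elif col in range_columns:
            used.append(col)
            break  # columns after a range condition cannot narrow further
        else:
            break  # gap in the prefix: the remaining columns are unusable
    return used


index = ("status", "type", "operator_id", "operate_time")

# select * from task where status = 0 and type = 12 limit 10
print(usable_prefix(index, {"status", "type"}))

# select count(*) from task where status = 0
print(usable_prefix(index, {"status"}))

# Assumed shape of the initial query: three equalities plus a range.
print(usable_prefix(index, {"status", "type", "operator_id"}, {"operate_time"}))

# A different column order cannot serve the status-only query in this model.
print(usable_prefix(("type", "status", "operator_id", "operate_time"), {"status"}))
```

In this model, putting status first is what lets a single index serve the status-only count as well as the more specific queries.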

A powerful tool for query optimization: the explain command

You are probably familiar with the explain command. For specific usage and the meaning of each field, refer to the explain-output page of the official documentation. The point to emphasize here is that rows is the core indicator: most statements with a small rows value execute quickly (there are exceptions, which are discussed below). So optimizing a statement is basically a matter of reducing rows.

Basic steps of slow query optimization

0. Run the query first to see whether it is really slow, taking care to set SQL_NO_CACHE.
1. Query each table on its own with the where conditions and lock onto the table that returns the fewest records. That is, apply the where conditions of the statement starting from the table that returns the smallest number of records, and query each field of that single table separately to see which field has the highest selectivity.
2. Use explain to view the execution plan and check whether it matches the expectation from step 1 (starting the query from the table with fewer records).
3. For SQL statements of the form order by limit, let the sorted table be queried first.
4. Understand the usage scenarios of the business side.
5. When adding indexes, follow the principles of indexing.
6. Observe the results; if they do not match expectations, continue the analysis from step 0.

Several slow query cases

The following examples explain in detail how to analyze and optimize slow queries.

Writing complex statements

In many cases we write SQL only to make a feature work, which is just the first step. Different ways of writing the same statement often differ fundamentally in efficiency, which requires a clear understanding of MySQL's execution plans and indexing principles. Consider the following statement:

select distinct cert.emp_id
from cm_log cl
inner join (
    select emp.id as emp_id, emp_cert.id as cert_id
    from employee emp
    left join emp_certificate emp_cert on emp.id = emp_cert.emp_id
    where emp.is_deleted = 0
) cert
on (cl.ref_table = 'Employee' and cl.ref_oid = cert.emp_id)
or (cl.ref_table = 'EmpCertificate' and cl.ref_oid = cert.cert_id)
where cl.last_upd_date >= '2013-11-07 15:03:00'
and cl.last_upd_date ...
