MySQL Index Principles and Slow Query Optimization

2025-07-01 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)06/01 Report--

With its excellent performance, low cost and rich resources, MySQL has become the first-choice relational database for most Internet companies. But as the saying goes, a good horse deserves a good saddle: knowing how to use it well has become a required skill for development engineers. Job descriptions often list requirements such as "proficient in MySQL", "SQL statement optimization" and "understanding of database principles". In a typical application system the read-to-write ratio is about 10:1. Inserts and ordinary updates rarely cause performance problems, while complex queries are what we encounter most and what most often goes wrong. Optimizing query statements is therefore the top priority.

Since July 2013, I have been optimizing slow queries in Meituan's core business systems department, covering more than ten systems and accumulating hundreds of slow query cases. As business complexity grows, the problems encountered become ever stranger and more varied. This article explains, from a development engineer's point of view, the principles of database indexes and how to optimize slow queries.

Thoughts prompted by a slow query

select count(*) from task where status=2 and operator_id=20839 and operate_time>1371169729 and operate_time [...]

1. [...] 3 and d = 4: if you build an index in the order (a,b,c,d), d cannot use the index, but if you build the index as (a,b,d,c), all four columns can use it, and the order of a, b and d can be adjusted freely.

2. = and in conditions can appear in any order. For example, with a = 1 and b = 2 and c = 3, an (a,b,c) index can be built in any column order, and mysql's query optimizer will rearrange the conditions into a form the index can recognize.

3. Try to choose highly selective columns for indexes. Selectivity is computed as count(distinct col)/count(*), the proportion of non-repeated values; the larger it is, the fewer records we scan. A unique key has a selectivity of 1, while fields such as status or gender may approach 0 on large tables. Is there an empirical threshold for this ratio? It depends on the scenario and is hard to fix. In general we require more than 0.1 for fields used in joins, that is, an average of 10 records scanned per lookup.

4. Index columns must not take part in calculations; keep the column "clean". For example, from_unixtime(create_time) = '2014-05-29' cannot use the index. The reason is simple: the B+ tree stores the field values from the data table, so to evaluate the condition the function would have to be applied to every element before comparison, which is obviously too costly. The statement should instead be written as create_time = unix_timestamp('2014-05-29').

5. Extend existing indexes where possible instead of creating new ones. For example, if the table already has an index on a and you now want an index on (a,b), you only need to modify the original index.
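The reasoning behind principle 4 can be sketched with a sorted Python list standing in for the index. This is only a model, not MySQL itself: the sample timestamps and the helper names (ts, scan_with_function, seek_with_range) are hypothetical, UTC is assumed for the date conversion, and the whole-day match is expressed as a half-open range, which is an assumption about the intent of the example.

```python
import bisect
from datetime import datetime, timedelta, timezone


def ts(y, m, d, hh=0):
    """Unix seconds for a UTC date and hour (hypothetical helper)."""
    return int(datetime(y, m, d, hh, tzinfo=timezone.utc).timestamp())


# Hypothetical create_time values, kept sorted the way an index keeps its keys.
index_keys = sorted([
    ts(2014, 5, 28, 9), ts(2014, 5, 28, 23),
    ts(2014, 5, 29, 0), ts(2014, 5, 29, 12), ts(2014, 5, 29, 23),
    ts(2014, 5, 30, 1),
])


def scan_with_function(keys, day):
    """Model of from_unixtime(create_time) = day: the function has to be
    applied to every stored key, so every key is examined."""
    examined = 0
    hits = []
    for k in keys:
        examined += 1
        if datetime.fromtimestamp(k, tz=timezone.utc).strftime("%Y-%m-%d") == day:
            hits.append(k)
    return hits, examined


def seek_with_range(keys, day):
    """Convert the constant once, then binary-search the sorted keys for
    the half-open range [start of day, start of next day)."""
    start = datetime.strptime(day, "%Y-%m-%d").replace(tzinfo=timezone.utc)
    lo = bisect.bisect_left(keys, int(start.timestamp()))
    hi = bisect.bisect_left(keys, int((start + timedelta(days=1)).timestamp()))
    # Only the keys inside the range are read.
    return keys[lo:hi], hi - lo


full_hits, full_examined = scan_with_function(index_keys, "2014-05-29")
range_hits, range_examined = seek_with_range(index_keys, "2014-05-29")
print(full_hits == range_hits, full_examined, range_examined)
```

Both functions return the same rows, but the first touches every key while the second reads only the matching slice, which is the cost difference the principle describes.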

Back to the initial slow query

According to the leftmost matching principle, the index for the first sql statement should be a composite index on status, operator_id, type and operate_time. The order of status, operator_id and type can be swapped, which is why I say you should find all the queries against this table and analyze them together.

For example, there are also the following queries

select * from task where status = 0 and type = 12 limit 10;
select count(*) from task where status = 0;

Then an index on (status, type, operator_id, operate_time) is the right choice, because it covers all of these cases. This makes use of the leftmost matching principle of the index.
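The choice of column order can be checked with a small model of leftmost matching. This is a simplified sketch, not how the MySQL optimizer is implemented: the function usable_prefix and the "eq"/"range" labels are hypothetical, and the predicates assumed for the first query (equality on status, operator_id and type, a range on operate_time) are inferred from the index columns named above.

```python
def usable_prefix(index_cols, predicates):
    """Return the index columns a query can use under leftmost matching.

    predicates maps column -> "eq" (= or in) or "range" (>, <, between).
    Matching walks the index left to right, stops before the first column
    that has no predicate, and stops after the first range column.
    """
    used = []
    for col in index_cols:
        kind = predicates.get(col)
        if kind is None:
            break
        used.append(col)
        if kind == "range":
            break
    return used


index = ["status", "type", "operator_id", "operate_time"]

# Predicates are keyed by column, so the order they appear in the where
# clause does not matter, matching the rule for = and in.
queries = {
    "initial slow query (assumed predicates)": {
        "status": "eq", "operator_id": "eq",
        "type": "eq", "operate_time": "range",
    },
    "status = 0 and type = 12": {"status": "eq", "type": "eq"},
    "status = 0": {"status": "eq"},
}

for name, preds in queries.items():
    print(name, "->", usable_prefix(index, preds))

# The abstract case, assuming the condition on c is a range:
abcd = {"a": "eq", "b": "eq", "c": "range", "d": "eq"}
print(usable_prefix(["a", "b", "c", "d"], abcd))
print(usable_prefix(["a", "b", "d", "c"], abcd))
```

Under this model every query uses a non-empty prefix of the single index, and the (a,b,d,c) ordering is the one that lets all four columns participate.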

The query optimization tool: the explain command

I believe you are familiar with the explain command. For specific usage and the meaning of each field, refer to explain-output on the official website. What needs emphasizing is that rows is the core indicator: most statements with a small rows value execute quickly (there are exceptions, discussed below). So optimizing a statement is basically a matter of reducing rows.

Basic steps of slow query optimization

0. Run the query first to see whether it is really slow, and remember to set SQL_NO_CACHE.

1. Query each table on its own with the where conditions and lock onto the table that returns the fewest records. That is, apply the where clauses of the statement starting from the table that returns the fewest records, and query each field of that table separately to see which field has the highest selectivity.

2. Use explain to check whether the execution plan matches the expectation from step 1 (starting with the table that locks onto fewer records).

3. For sql statements of the form order by ... limit, have the sorted table queried first.

4. Understand how the business side uses the query.

5. When adding an index, refer to the indexing principles above.

6. Observe the result; if it does not meet expectations, go back to step 0 and analyze again.
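The per-field comparison in step 1 amounts to computing count(distinct col)/count(*) for each candidate column. A minimal sketch in Python, using made-up rows (the column names and value distributions are hypothetical, chosen only to show a unique key, a mid-range field and a status-like field):

```python
def selectivity(values):
    """count(distinct col) / count(*) for one column's values."""
    if not values:
        return 0.0
    return len(set(values)) / len(values)


# Hypothetical table: 1000 rows, a unique id, 200 operators, 3 statuses.
rows = [
    {"id": i, "operator_id": i % 200, "status": i % 3}
    for i in range(1000)
]

scores = {
    col: selectivity([row[col] for row in rows])
    for col in ("id", "operator_id", "status")
}

# Highest selectivity first: these are the better index candidates.
ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
for col, score in ranked:
    print(f"{col}: {score:.3f}")
```

On this sample the unique id scores 1, operator_id clears the 0.1 rule of thumb mentioned earlier, and status falls close to 0.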

Several slow query cases

The following examples explain in detail how to analyze and optimize slow queries.

Complex statement writing

In many cases we write SQL just to make a feature work, but that is only the first step. Different ways of writing a statement often differ fundamentally in efficiency, which requires a clear understanding of mysql's execution plans and indexing principles. Consider the following statement.

select distinct cert.emp_id
from cm_log cl
inner join (
  select emp.id as emp_id, emp_cert.id as cert_id
  from employee emp
  left join emp_certificate emp_cert on emp.id = emp_cert.emp_id
  where emp.is_deleted=0
) cert
  on (cl.ref_table='Employee' and cl.ref_oid=cert.emp_id)
  or (cl.ref_table='EmpCertificate' and cl.ref_oid=cert.cert_id)
where cl.last_upd_date >= '2013-11-07 15 [...]
