Many beginners are not very clear about how MySQL slow queries work or how to govern them. To help with that, the following explains the topic in detail; those who need it are welcome to read on, and I hope you gain something from it.
1 What is slow SQL?
This refers to MySQL slow queries, specifically SQL statements whose execution time exceeds the long_query_time value.
We often hear about MySQL's various logs, such as the binary log (binlog), the relay log (relaylog), and the redo and undo logs (redolog, undolog). For slow queries there is also a slow query log (slowlog), which records statements in MySQL whose response time exceeds a threshold.
Do not be misled by the name "slow query" into thinking the slow query log only records SELECT statements. It also records INSERT, UPDATE and other DML statements whose execution time exceeds the threshold set by long_query_time.
# Check whether the slow query log is enabled
show variables like "slow_query_log%";
# Check the slow query threshold (unit: seconds)
show variables like "long_query_time";
For the AliSQL-X-Cluster we use (XDB), the slow query log is enabled by default and long_query_time is set to 1 second.
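As a minimal sketch on a stock MySQL instance (XDB manages these settings itself, and the file path below is purely illustrative), the slow query log can also be enabled and tuned at runtime:
# Enable the slow query log and set the threshold for the current server run;
# persist the same settings in my.cnf so they survive a restart
SET GLOBAL slow_query_log = 'ON';
SET GLOBAL slow_query_log_file = '/var/lib/mysql/slow.log';  # illustrative path
SET GLOBAL long_query_time = 1;  # seconds; applies to connections opened after this change
SET GLOBAL log_queries_not_using_indexes = 'ON';  # optional: also log queries that use no index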
2 Why do slow queries cause failures?
Truly slow SQL is usually accompanied by large numbers of row scans, temporary-file sorts, or frequent disk flushes, which directly drive up disk IO; normal SQL statements then turn into slow SQL as well, and timeouts spread across the board.
After last year's Singles' Day, in response to the problems exposed on the technical side, Cainiao launched a number of special governance projects along the CTO-D line. Each CTO-D took charge of one, and my larger team was responsible for slow SQL governance.
Second, to what extent should it be governed?
1 How to measure the slow SQL severity of an application?
Micro average
micro average = sum(slow SQL executions of the Aone application) / sum(total SQL executions of the Aone application)
We believe that the higher the value, the greater the impact; the smaller the value, the smaller the impact.
In the extreme cases, if every SQL statement the application executes is slow SQL, the value is 1; if none of them is slow SQL, the value is 0.
The problem with this indicator, however, is poor discrimination: when the SQL QPS is very high and most statements are not slow queries, the occasional slow SQL is drowned out.
Another question: is an occasionally slow SQL statement really slow SQL? Many statements recorded in the slow query log are actually affected by other slow SQL, MySQL disk jitter, optimizer choices and so on; they show slow-SQL performance even though under normal conditions they are not slow at all.
Macro average
macro average = ( sum(slow SQL 1 executions) / sum(SQL 1 executions) + ... + sum(slow SQL n executions) / sum(SQL n executions) ) / n
This algorithm relies on the fact that a captured slow SQL statement has been executed a certain number of times, which reduces the impact of pseudo slow SQL.
However, when an application's QPS is very low, that is, it executes very few SQL statements per day, a single pseudo slow SQL statement can skew the statistics.
Number of executions
execution count = sum(slow SQL executions of the Aone application over the last 7 days) / 7
Counting the average number of slow SQL executions per day over the last week eliminates the pseudo slow SQL problem of the macro average.
Number of slow SQL templates
All of the dimensions above are bounded in time. To trace the history of slow SQL handling, we also introduce a global slow SQL template count dimension.
slow SQL template count = count(distinct(slow SQL templates of the Aone application))
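As a rough sketch of how these metrics could be computed, assuming slow-log statistics were already aggregated into a hypothetical table sql_stat(app_name, sql_template, is_slow, exec_count, stat_date):
# Micro average: slow executions / total executions per application
SELECT app_name,
       SUM(CASE WHEN is_slow = 1 THEN exec_count ELSE 0 END) / SUM(exec_count) AS micro_avg
FROM sql_stat
GROUP BY app_name;

# Average daily slow SQL executions over the last week
SELECT app_name,
       SUM(CASE WHEN is_slow = 1 THEN exec_count ELSE 0 END) / 7 AS daily_slow_exec
FROM sql_stat
WHERE stat_date >= CURRENT_DATE - INTERVAL 7 DAY
GROUP BY app_name;

# Global slow SQL template count per application
SELECT app_name, COUNT(DISTINCT sql_template) AS slow_template_cnt
FROM sql_stat
WHERE is_slow = 1
GROUP BY app_name;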
2 Goals
Core applications: eliminate all slow SQL.
General applications: reduce the micro-average metric by 50%.
3 CTO report
Taking each CTO-D as a unit, we aggregate a weighted average of their applications across the multi-dimensional metrics above, rank them from low to high, highlight the top 3 at the head and at the tail, and broadcast the results weekly.
Three, why should I do it?
I guess it may have something to do with my background: at my previous company I was responsible for the design and rollout of the company-level multi-site active-active architecture, and I know a thing or two about MySQL.
It may also be that there was no conflict of interest: my small team had just started its business and had no slow SQL of its own, so I could step into the various business lines.
Four, action support
1 Group MySQL specification
Excerpt from the index specification:
[Mandatory] Joins of more than three tables are prohibited. The data types of fields used in a join must be exactly the same; when joining multiple tables, make sure the join fields are indexed.
Note: even a two-table join requires attention to table indexes and SQL performance.
[Mandatory] When building an index on a varchar field, the index length must be specified; there is no need to index the whole field. The index length is determined by the actual distinctiveness of the text.
Note: index length and distinctiveness are in tension. Generally, for a string column, an index of length 20 can reach a distinctiveness of over 90%, which can be verified with count(distinct left(column_name, index_length)) / count(*).
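For example (reusing the table_holder table and store_code column from the case later in this article, purely for illustration), prefix selectivity can be compared before choosing the index length:
# Compare the selectivity of two prefix lengths against the full column
SELECT COUNT(DISTINCT LEFT(store_code, 10)) / COUNT(*) AS sel_10,
       COUNT(DISTINCT LEFT(store_code, 20)) / COUNT(*) AS sel_20,
       COUNT(DISTINCT store_code) / COUNT(*)           AS sel_full
FROM table_holder;

# If a prefix of 20 is selective enough, index only that prefix
ALTER TABLE table_holder ADD INDEX idx_store_code (store_code(20));  # index name is illustrative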
[Mandatory] Left-fuzzy and full-fuzzy matching in page searches is strictly forbidden. If it is required, use a search engine instead.
Note: the index file has the leftmost-prefix matching property of a B-Tree; if the left edge of the value is not fixed, this index cannot be used.
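A minimal illustration (again borrowing store_code from the later case) of why a leading wildcard defeats the index:
# Can use an index on store_code: the left edge of the value is fixed
SELECT * FROM table_holder WHERE store_code LIKE 'ZJJH%';

# Cannot use that index: the left edge is unknown, so B-Tree prefix matching fails
SELECT * FROM table_holder WHERE store_code LIKE '%TJJ0001';
SELECT * FROM table_holder WHERE store_code LIKE '%TJJ%';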
[Recommended] Prevent implicit conversions caused by mismatched field types, which invalidate the index.
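A common example of such an implicit conversion, assuming store_code is an indexed varchar column: comparing it against a numeric literal forces MySQL to cast the column value on every row, so the index cannot be used.
# Implicit conversion: varchar column vs. numeric literal -> index is ignored, full scan
SELECT * FROM table_holder WHERE store_code = 123456;

# Matching the column type keeps the index usable
SELECT * FROM table_holder WHERE store_code = '123456';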
[Reference] Avoid the following extreme misconceptions when creating indexes:
1) Better too many indexes than too few.
Believing that every query needs its own index.
2) Being stingy with indexes.
Believing that indexes consume space and severely slow down updates and inserts.
3) Resisting unique indexes.
Believing that uniqueness should always be handled in the application layer with a "check first, then insert" approach.
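On that third point, a unique index plus an idempotent write is usually both safer and cheaper than "check first, then insert" in application code; a hypothetical sketch (the column combination and values are made up for illustration):
# Let the database enforce uniqueness instead of racing SELECT-then-INSERT in the application
ALTER TABLE table_holder
  ADD UNIQUE INDEX uk_user_item_store (user_id, sc_item_id, store_code);

# Idempotent write: a duplicate key updates in place instead of inserting twice
INSERT INTO table_holder (user_id, sc_item_id, store_code, quantity)
VALUES (3405569954, 1001, 'ZJJHBHYTJJ0001', 10)
ON DUPLICATE KEY UPDATE quantity = VALUES(quantity);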
2 DB change standard
DDL changes must control the speed of change, pay attention to grayscale rollout and concurrency control, and be released within the prescribed change window.
Five, some examples of optimization I took part in
1 Uneven data distribution
1) Unreasonable database and table sharding
The business data was split into 8 databases with 16 tables each. Looking at the table spaces, almost all of the data was concentrated in two tables of each database. The sharding strategy was flawed and the business growth had been overestimated; I will reserve judgment on that here.
2) Unreasonable indexes
A composite index idx_logistics_corp_id_special_id had been created on the single table, but even so its selectivity was too low. Experiments combined with business feedback showed that the field combination (logistics_corp_id, transport_type_id) is highly selective, and the business also has a query scenario on transport_type_id alone.
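A plausible reconstruction of the adjustment (the table name and new index names below are hypothetical): replace the low-selectivity index with the better combination, and add a separate index for the standalone transport_type_id query, since the composite index's leftmost prefix cannot serve it alone.
ALTER TABLE logistics_detail  # hypothetical table name
  DROP INDEX idx_logistics_corp_id_special_id,
  ADD INDEX idx_corp_transport (logistics_corp_id, transport_type_id),
  ADD INDEX idx_transport_type (transport_type_id);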
2 Index problem
SELECT COUNT(0) AS `tmp_count` FROM (
  SELECT `table_holder`.`user_id`,
         `table_holder`.`sc_item_id`,
         SUM(CASE `table_holder`.`inventory_type` WHEN 1 THEN `table_holder`.`quantity` ELSE 0 END) AS `saleable_quantity`,
         SUM(CASE `table_holder`.`inventory_type` WHEN 1 THEN `table_holder`.`lock_quantity` ELSE 0 END) AS `quantity`,
         SUM(CASE `table_holder`.`inventory_type` WHEN 401 THEN `table_holder`.`quantity` ELSE 0 END) AS `transfer_on_way_quantity`,
         `table_holder`.`store_code`,
         MAX(`table_holder`.`gmt_modified`) AS `gmt_modified`
  FROM `table_holder`
  WHERE (`table_holder`.`is_deleted` = 0)
    AND (`table_holder`.`quantity` > 0)
    AND `table_holder`.`user_id` IN (3405569954)
    AND `table_holder`.`store_code` IN ('ZJJHBHYTJJ0001', '... more than 1000 values')
  GROUP BY `table_holder`.`user_id`, `table_holder`.`sc_item_id`
  ORDER BY `table_holder`.`user_id` ASC, `table_holder`.`sc_item_id` ASC
) `a`
The table behind this case already had an index on store_code, so at first glance there seemed to be nothing wrong and nothing to optimize. But the execution plan showed that MySQL chose a full table scan. Working through this case, we found that once the number of values in the range (IN) query exceeds 200, the optimizer no longer uses the index on that field.
Finally, after pulling the related query SQL from the most recent period and combining it with the business data distribution, we found the problem could be solved with an index on (is_deleted, quantity).
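A sketch of that change and how to verify it (the index name is illustrative). The optimizer abandoning the store_code index once the IN list grows past a couple of hundred values is likely related to MySQL's eq_range_index_dive_limit (default 200 in 5.7+), beyond which row estimates come from index statistics rather than index dives, so it is worth re-checking the plan with EXPLAIN after the change:
ALTER TABLE table_holder ADD INDEX idx_deleted_quantity (is_deleted, quantity);

# Confirm that the optimizer picks the new index instead of a full table scan
EXPLAIN SELECT COUNT(0)
FROM table_holder
WHERE is_deleted = 0
  AND quantity > 0
  AND user_id IN (3405569954)
  AND store_code IN ('ZJJHBHYTJJ0001' /* , ... the long list */);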
To determine how much of an index the execution plan actually uses, check key_len. The calculation rules (MySQL >= 5.6.4) are as follows:
char(10), NULL allowed      = 10 * (charset bytes: utf8mb4=4, utf8=3, gbk=2, latin1=1) + 1 (NULL flag)
char(10), NOT NULL          = 10 * (charset bytes)
varchar(10), NULL allowed   = 10 * (charset bytes) + 1 (NULL flag) + 2 (variable-length field)
varchar(10), NOT NULL       = 10 * (charset bytes) + 2 (variable-length field)
int, NULL allowed           = 4 + 1 (NULL flag)
int, NOT NULL               = 4
timestamp, NULL allowed     = 4 + 1 (NULL flag)
timestamp, NOT NULL         = 4
datetime, NULL allowed      = 5 + 1 (NULL flag)
datetime, NOT NULL          = 5
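As a worked example under assumed column definitions (a hypothetical composite index on (user_id BIGINT NOT NULL, store_code VARCHAR(20) NULL) with the utf8mb4 character set):
# user_id:    BIGINT NOT NULL            -> 8 bytes
# store_code: VARCHAR(20) NULL, utf8mb4  -> 20 * 4 + 1 (NULL flag) + 2 (length bytes) = 83 bytes
# EXPLAIN key_len = 8  -> only user_id is used from the index
# EXPLAIN key_len = 91 -> both columns (8 + 83) are used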
3 Affected by others
Even though the query used an index, the execution plan still showed 20 million rows scanned:
The index field is highly selective:
During the same period, previously normal SQL statements also became slow queries:
DB data-disk access:
Checking the other instances sharing the same physical machine, we found that around the time of the problem one database had a lot of slow SQL doing sorts, and its temporary files had just grown to 2GB:
Leader nodes of multiple MySQL instances are co-deployed on the same physical machine. Although CPU, memory and other resources are isolated via Docker, buffer IO isolation has not yet been achieved.
4 What indexes cannot solve
Summarizing and analyzing high-frequency queries and, together with the business, settling on suitable indexes can usually resolve everyday slow queries, but it is not a panacea.
For example, indexes may keep piling up, until the table ends up looking like this:
In some scenarios, such as supporting combination queries over many fields with no required field, it is clearly unreasonable to cover every combination with indexes.
In such query scenarios, it is a good habit to make the highly selective fields required; when there are many query combinations, consider search-oriented storage or a search engine with better support.