How to Analyze SQL Optimization in MySQL Performance Optimization


2025-07-15 Update From: SLTechnology News&Howtos

Many less experienced developers are unsure how to approach SQL optimization as part of MySQL performance tuning. This article summarizes the common causes of problems and their solutions.

Note: the article uses MySQL as its background, but much of the content also applies to other relational databases. Some knowledge of indexes is assumed.

Optimization goal

Reduce the number of IO operations

IO is always the most likely bottleneck in a database, which follows from what a database is responsible for. In most database operations, IO accounts for more than 90% of the time. Reducing the number of IO operations is therefore the first priority in SQL optimization, and also the most effective method.

Reduce CPU computation

Besides the IO bottleneck, SQL optimization also has to consider CPU work. order by, group by, and distinct are all heavy CPU consumers (these operations essentially have the CPU compare data in memory). Once IO optimization has reached a certain stage, reducing CPU computation becomes an important goal of SQL optimization.

Optimization method

Change the SQL execution plan

With the goals defined, we need to decide how to reach them. For a SQL statement there is only one way to achieve both goals: change its execution plan so that it takes as few detours as possible and finds the data we need through "shortcuts", thereby reducing both the number of IO operations and the CPU computation.

Common misunderstandings

count(1) and count(primary_key) are better than count(*)

To count records, many people use count(1) or count(primary_key) instead of count(*) because they believe it performs better. This is a misunderstanding. In some scenarios it can even perform worse, because databases apply special optimizations to the count(*) operation.

count(column) and count(*) are the same

This misunderstanding is common even among senior engineers and DBAs, and many take it for granted. In fact, count(column) and count(*) are different operations with different meanings.

count(column) returns the number of records in the result set whose column field is not NULL.

count(*) returns the number of records in the entire result set.
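The difference can be seen with a minimal sketch. The table t_count_demo and its columns are hypothetical and exist only for illustration:

```sql
-- Hypothetical table: note is nullable on purpose.
CREATE TABLE t_count_demo (
    id   INT PRIMARY KEY,
    note VARCHAR(20) NULL
);

INSERT INTO t_count_demo (id, note) VALUES (1, 'a'), (2, NULL), (3, 'c');

SELECT COUNT(*)    AS all_rows,        -- counts every row: 3
       COUNT(note) AS non_null_notes   -- skips the NULL in row 2: 2
FROM t_count_demo;
```

The two counts agree only when the counted column contains no NULL values.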

select a,b from ... is better than select a,b,c from ... because it lets the database access less data

This misunderstanding is widespread among developers, mainly because they know little about how a database stores data.

Most relational databases store data by row, while data access works on a fixed-size IO unit (called a block or page), usually 4KB or 8KB (InnoDB's default page size is 16KB). In most cases each IO unit holds multiple rows, and each row stores all the fields of that row (except special types such as lob).

So whether we fetch one field or several, the amount of data the database has to access in the table is the same.

There is an exception: a query that can be completed entirely in an index. If only the fields a and b are fetched and both are in the index used, there is no need to go back to the table. Field c is not in that index, so fetching it requires going back to the table for its data. In that case the amount of IO differs greatly between the two queries.
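A minimal sketch of this exception, assuming a hypothetical InnoDB table t_cover_demo with a secondary index on (a, b). The plans depend on data volume and MySQL version, so the comments describe what is typically seen rather than guaranteed output:

```sql
-- Hypothetical table: idx_a_b contains a and b, but not c.
CREATE TABLE t_cover_demo (
    id INT PRIMARY KEY,
    a  INT NOT NULL,
    b  INT NOT NULL,
    c  VARCHAR(100),
    KEY idx_a_b (a, b)
);

-- Both selected columns are in idx_a_b, so the query can be answered
-- from the index alone; Extra typically shows "Using index".
EXPLAIN SELECT a, b FROM t_cover_demo WHERE a = 10;

-- c is not in idx_a_b, so each matching index entry needs an extra
-- lookup of the table row; "Using index" is no longer shown.
EXPLAIN SELECT a, b, c FROM t_cover_demo WHERE a = 10;
```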

order by always requires a sort operation

Index data is stored in order. If the order we need matches the order of an index and the query is executed through that index, the database generally skips the sort and returns the data directly, because it knows the data already satisfies the requested order.

Using indexes to optimize SQL that needs sorted results is in fact a very important optimization technique.
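As a hedged sketch of this idea, assume a hypothetical table t_sort_demo with an index on (user_id, created_at). Actual plans depend on data and MySQL version:

```sql
-- Hypothetical table for illustration only.
CREATE TABLE t_sort_demo (
    id         INT PRIMARY KEY,
    user_id    INT NOT NULL,
    created_at DATETIME NOT NULL,
    amount     DECIMAL(10,2) NOT NULL,
    KEY idx_user_created (user_id, created_at)
);

-- Within one user_id, entries of idx_user_created are already ordered by
-- created_at, so the plan is expected to show no "Using filesort".
EXPLAIN SELECT id, created_at
FROM t_sort_demo
WHERE user_id = 42
ORDER BY created_at;

-- No index provides the order of amount, so a sort step is expected
-- and Extra should show "Using filesort".
EXPLAIN SELECT id, amount
FROM t_sort_demo
WHERE user_id = 42
ORDER BY amount;
```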

Extended reading: the author's articles on the implementation of MySQL ORDER BY, the basic implementation principle of GROUP BY in MySQL, and the basic implementation principle of MySQL DISTINCT analyze this in more depth, especially the first one.

If filesort appears in the execution plan, the sort is done in disk files

We are not really to blame for this misunderstanding; it comes from the wording the MySQL developers chose. filesort is what we may see in the "Extra" column when we use the explain command to view the execution plan of a SQL statement.

In fact, "Using filesort" is displayed whenever a SQL statement requires a sort operation. It does not mean that the sort is carried out in files.
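One rough way to check whether a sort actually needed merge passes over temporary files is sketched below. It assumes the standard sort_buffer_size variable and the Sort_merge_passes status counter are available; the table t_filesort_demo is hypothetical:

```sql
-- Hypothetical table: score is deliberately not indexed.
CREATE TABLE t_filesort_demo (
    id    INT PRIMARY KEY,
    score INT NOT NULL
);

INSERT INTO t_filesort_demo (id, score) VALUES (1, 30), (2, 10), (3, 20);

-- No index provides the order, so Extra is expected to show "Using filesort".
EXPLAIN SELECT id, score FROM t_filesort_demo ORDER BY score;

-- Size of the in-memory sort buffer for this session.
SHOW VARIABLES LIKE 'sort_buffer_size';

-- Run the sort, then read the session counter of merge passes.
SELECT id, score FROM t_filesort_demo ORDER BY score;
SHOW SESSION STATUS LIKE 'Sort_merge_passes';
```

With only three rows the sort should fit in the sort buffer, so the counter is expected to stay at 0 even though the plan reports "Using filesort".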

Extended reading: the author's article on understanding filesort in the output of the MySQL Explain command covers this in more detail.

Basic principles

The following two cases concern the "Avoid type conversion" principle below:

Manual conversion through a conversion function on column_name

This directly prevents MySQL from using the index (other databases have the same problem). If a conversion is unavoidable, apply it to the parameter being passed in rather than to the column.

Conversion by the database itself

If the type of the value we pass differs from the type of the field and we do no conversion ourselves, MySQL may convert the data itself or hand it to the storage engine unconverted. Either way the index cannot be used, which leads to execution plan problems.
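Both cases can be sketched as follows. The table t_conv_demo, its columns, and the literal values are hypothetical, and the comments describe typical MySQL behavior rather than guaranteed plans:

```sql
-- Hypothetical table for illustration only.
CREATE TABLE t_conv_demo (
    id         INT PRIMARY KEY,
    order_no   VARCHAR(20) NOT NULL,
    created_at DATETIME NOT NULL,
    KEY idx_order_no (order_no),
    KEY idx_created_at (created_at)
);

-- Function applied to the column: idx_created_at generally cannot be
-- used for a range lookup.
EXPLAIN SELECT id FROM t_conv_demo
WHERE DATE(created_at) = '2024-01-15';

-- Conversion moved to the parameter side: a plain range on the column,
-- which idx_created_at can serve.
EXPLAIN SELECT id FROM t_conv_demo
WHERE created_at >= '2024-01-15 00:00:00'
  AND created_at <  '2024-01-16 00:00:00';

-- Implicit conversion: a string column compared with a number, so
-- idx_order_no generally cannot be used for the lookup.
EXPLAIN SELECT id FROM t_conv_demo WHERE order_no = 20240115;

-- Types match: idx_order_no can be used.
EXPLAIN SELECT id FROM t_conv_demo WHERE order_no = '20240115';
```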

The following ways of reducing sorting concern the "Sort as little as possible" principle below:

Sort through an index, the optimization mentioned in the misunderstandings above

Reduce the number of records participating in the sort

Do not sort data unless it is necessary

...

Reduce joins as much as possible

MySQL's strength is its simplicity, which in some respects is also its weakness. The MySQL optimizer is efficient, but because it works with a limited amount of statistics, its estimates are more likely to be off. For complex multi-table joins the optimizer is limited, and comparatively little effort has gone into join processing, so performance still lags well behind older relational databases such as Oracle. For simple single-table queries, however, the gap is very small, and in some scenarios MySQL even does better.

Sort as little as possible

Sorting consumes considerable CPU resources, so when the cache hit rate is high and IO capacity is sufficient, reducing sorting can noticeably improve SQL response time.

For MySQL, there are several ways to reduce sorting, such as sorting through an index, reducing the number of records being sorted, and not sorting unless it is necessary.

Try to avoid select *

Many readers will find this puzzling. Didn't the misunderstandings above say that the number of fields in the select clause does not affect the amount of data read?

Yes, most of the time it does not affect the amount of IO. But when the query also has an order by, the fields in the select clause affect sorting efficiency considerably, as described in more detail in my earlier article on the implementation of MySQL ORDER BY.

In addition, as noted above, the amount of IO is unaffected only most of the time. When the query can be answered entirely from an index, selecting fewer fields greatly reduces the amount of IO.

Replace subqueries with join as much as possible

Although join performance is not great, it still has a significant advantage over MySQL's subqueries. The execution plans MySQL generates for subqueries have long been a serious problem. At the time of writing, the problem was present in every stable release and had not improved much, even though the MySQL developers had acknowledged it and promised a fix. (MySQL 5.6 and later added semijoin and materialization strategies that improve many subquery cases, so it is worth checking the plan with explain on the version in use.)
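A minimal sketch of such a rewrite, using hypothetical customers and orders tables. The two queries return the same rows here only because customers.id is unique:

```sql
-- Hypothetical tables for illustration only.
CREATE TABLE customers (
    id      INT PRIMARY KEY,
    country CHAR(2) NOT NULL,
    KEY idx_country (country)
);

CREATE TABLE orders (
    id          INT PRIMARY KEY,
    customer_id INT NOT NULL,
    total       DECIMAL(10,2) NOT NULL,
    KEY idx_customer (customer_id)
);

-- Subquery form.
SELECT o.id, o.total
FROM orders o
WHERE o.customer_id IN (SELECT c.id FROM customers c WHERE c.country = 'DE');

-- Join form. Because customers.id is the primary key, each order matches
-- at most one customer, so no duplicate rows are introduced.
SELECT o.id, o.total
FROM orders o
JOIN customers c ON c.id = o.customer_id
WHERE c.country = 'DE';
```

Running explain on both forms shows whether the rewrite makes a difference on a given MySQL version.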

Use or as little as possible

When the where clause contains several conditions joined by "or", the MySQL optimizer does not optimize the execution plan very well. Combined with MySQL's distinctive layered architecture of SQL layer and storage engine layer, performance is relatively low, so it is often more effective to use union all (or union, when necessary) instead of "or".
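A sketch of the rewrite, assuming a hypothetical table t_or_demo with separate indexes on a and b. Because a row can satisfy both conditions, the second branch excludes rows already returned by the first; this simple exclusion is valid here only because a is declared NOT NULL:

```sql
-- Hypothetical table for illustration only.
CREATE TABLE t_or_demo (
    id INT PRIMARY KEY,
    a  INT NOT NULL,
    b  INT NOT NULL,
    KEY idx_a (a),
    KEY idx_b (b)
);

-- Original form with "or".
SELECT id FROM t_or_demo WHERE a = 1 OR b = 2;

-- Rewritten form: each branch can use its own index.
SELECT id FROM t_or_demo WHERE a = 1
UNION ALL
SELECT id FROM t_or_demo WHERE b = 2 AND a <> 1;
```

If the exclusion condition is awkward to write, union gives the same result at the cost of a deduplication step.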

Use union all instead of union where possible

The main difference between union and union all is that union has to merge the two (or more) result sets and then filter out duplicates. That involves sorting, which adds a large amount of CPU work and increases resource consumption and latency. So when we know that duplicate rows cannot occur, or we do not care about them, we should use union all instead of union.
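The semantic difference is easy to see with constant rows, which need no table:

```sql
-- union removes the duplicate 1 and returns two rows: 1 and 2.
SELECT 1 AS n UNION SELECT 1 UNION SELECT 2;

-- union all keeps every row and returns three rows: 1, 1 and 2.
SELECT 1 AS n UNION ALL SELECT 1 UNION ALL SELECT 2;
```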

Filter as early as possible

This strategy is most commonly seen in index design (placing the more selective fields first).

The same principle can be applied when writing SQL, to optimize some join statements. For example, when paging through data from several tables, it is best to filter and page the data on one table first and then join the paged result set with the other table. This avoids unnecessary IO as far as possible and greatly reduces the time spent on IO.
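A hedged sketch of paging first and joining afterwards. The tables articles and authors are hypothetical, and the two queries return the same page only under the assumption that every article has a matching author:

```sql
-- Hypothetical tables for illustration only.
CREATE TABLE authors (
    id   INT PRIMARY KEY,
    name VARCHAR(100) NOT NULL
);

CREATE TABLE articles (
    id           INT PRIMARY KEY,
    author_id    INT NOT NULL,
    title        VARCHAR(200) NOT NULL,
    published_at DATETIME NOT NULL,
    KEY idx_published (published_at)
);

-- Join first, page afterwards: the join may process many rows that the
-- limit then discards.
SELECT a.id, a.title, u.name
FROM articles a
JOIN authors u ON u.id = a.author_id
ORDER BY a.published_at DESC
LIMIT 20 OFFSET 1000;

-- Page on one table first, then join only the 20 rows of the page.
SELECT p.id, p.title, u.name
FROM (
    SELECT id, author_id, title, published_at
    FROM articles
    ORDER BY published_at DESC
    LIMIT 20 OFFSET 1000
) AS p
JOIN authors u ON u.id = p.author_id
ORDER BY p.published_at DESC;
```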

Avoid type conversion

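The "type conversion" here refers to the conversion that occurs when the type of a field does not match the type of the parameter passed in the where clause. It takes the two forms described at the beginning of this section: manual conversion through a function on the column, and conversion by the database itself.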

Optimize high-concurrency SQL first, rather than some "large" SQL that runs infrequently

In terms of destructive potential, high-concurrency SQL always outweighs low-frequency SQL. If a high-concurrency statement has a problem, it can bring the system down without giving us any breathing room. A statement that consumes a lot of IO and responds slowly but runs rarely will, at worst, slow the whole system down; the system can still hold on for a while, which gives us time to react.

Optimize with the whole system in view rather than adjusting one statement in isolation

SQL optimization should not be done for a single statement alone. It should take all the SQL in the system into account, especially when execution plans are optimized by adjusting indexes.

Run explain on as many of the SQL statements running in the database as possible

To optimize SQL, we need to know its execution plan in order to judge whether there is room for optimization and whether the plan has a problem. After a period of optimization, few obviously problematic statements remain; most have to be dug out, so a large number of explain operations are needed to collect execution plans and decide whether optimization is required.
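As a small illustration of collecting a plan, the sketch below runs explain on a query against a hypothetical table t_explain_demo. The columns named in the comments are standard columns of MySQL's explain output:

```sql
-- Hypothetical table for illustration only.
CREATE TABLE t_explain_demo (
    id     INT PRIMARY KEY,
    status TINYINT NOT NULL,
    title  VARCHAR(200) NOT NULL,
    KEY idx_status (status)
);

-- Columns worth checking in the output:
--   type  : access method (for example ALL means a full table scan)
--   key   : the index actually chosen, or NULL if none
--   rows  : estimated number of rows examined
--   Extra : notes such as "Using index" or "Using filesort"
EXPLAIN SELECT id, title FROM t_explain_demo WHERE status = 1;
```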

Author: Sky.Jian | can be reprinted at will, but be sure to indicate the original origin of the article and the author's information and copyright notice in the form of hyperlink when reprinting.

Link: http://isky000.com/database/mysql-performance-tuning-sql

Attached by blogger:

Point 8 (avoid type conversion) is also mentioned in eygle's dba-notes, and it holds for Oracle DB as well.

The main problem was that opening news pages through links was very slow, usually taking tens of seconds to return.

Query the v$session view for process information:

SQL> select sid,serial#,username from v$session where username is not null;

Enable sql_trace tracing for the appropriate session:

SQL> exec dbms_system.set_sql_trace_in_session(7,284,true)

SQL> exec dbms_system.set_sql_trace_in_session(11,214,true)

SQL> exec dbms_system.set_sql_trace_in_session(16,1042,true)

After a period of execution, close:

SQL> exec dbms_system.set_sql_trace_in_session(7,284,false)

SQL> exec dbms_system.set_sql_trace_in_session(11,214,false)

SQL> exec dbms_system.set_sql_trace_in_session(16,1042,false)

Check the trace file to find the sql calls executed by the foreground during the trace. The following statements are found to be suspicious performance bottlenecks:

select auditstatus,categoryid,auditlevel from
categoryarticleassign a,category b where b.id=a.categoryid
and articleId=20030700400141 and auditstatus>0

This query reads a news item by articleId, but note that it takes 3892 logical reads, which is high enough to catch my attention.

The execution plan traced for a similar query shows that a full table scan is performed.

Then check whether the table has a usable index. The query below lists the indexed columns: the index idx_articleid was created on articleid, but the query above does not use it.

SQL> select index_name,table_name,column_name from user_ind_columns
  2  where table_name=upper('categoryarticleassign');

Checking the table structure shows that articleid is of type varchar2, while the value given in articleid=20030700400141 is a number. Oracle performs an implicit data type conversion, which makes the index unusable.

The solution is simple: enclose the articleid value in single quotes.
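Applied to the query found in the trace, the fix would look roughly like this. It is a sketch that reuses the table and column names from the statement above and changes only the literal:

```sql
-- articleId is varchar2, so the value is passed as a string literal.
-- No implicit conversion is needed and idx_articleid can be used.
select auditstatus,categoryid,auditlevel from
categoryarticleassign a,category b where b.id=a.categoryid
and articleId='20030700400141' and auditstatus>0;
```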
