What is the principle of MySQL optimization? 07/04 Update SLTechnology News&Howtos

What is the principle of MySQL optimization?

2025-07-04 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Database >

Shulou(Shulou.com)05/31 Report--

This article mainly introduces "what is the principle of MySQL optimization". In daily operation, I believe many people have doubts about what the principle of MySQL optimization is. The editor consulted all kinds of data and sorted out simple and easy-to-use methods of operation. I hope it will be helpful for you to answer the doubt of "what is the principle of MySQL optimization?" Next, please follow the editor to study!

The Server layer mainly consists of connectors, retrieval memory, analyzers, optimizers, executors, etc. All cross-storage engine functions are built on this layer, such as stored procedures, triggers, views, functions, etc., and there is a standardized binglog log module.

Storage engine is responsible for data storage and access, using replaceable plug-in architecture, with InnoDB, MyISAM, Memory and other storage engines, of which InnoDB engine has redo log log module. As shown in the following figure

Experimental environment

Operating system kernel version: Tencent tlinux release 2.2

MySQL database version: 5.7.10

Create a new table tb_article and create two indexes: index_title and index_author_id. The table structure is as follows:

Let's try to insert some data:

Now execute the SQL statement, and select * from tb_article where author_id=20 and title='b'; analyzes the execution process and optimization strategy of the SQL statement.

MySQL executes the SQL statement procedure

Communication between MySQL client and server

The client sends the SQL to the server according to the MySQL communication protocol. When the SQL arrives at the server, the server will run a single thread to execute the SQL. The communication protocol between MySQL client and server is "half-duplex".

Second, query status

For MySQL connections, there is a state at any time that indicates what MySQL is currently doing. Use the show full processlist command to view the current status. The state changes many times in a query lifecycle, and here is an explanation of these states:

1. Sleep: the thread is waiting for a new request from the client

2. Query: the thread is executing the query or sending the result to the client

3. Locked: at the MySQL server layer, the thread is waiting for a table lock. Locks implemented at the storage engine level, such as InnoDB's row locks, are not reflected in thread state. This is a typical state for MyISAM.

4. Analyzing and statistics: the thread is collecting statistics for the storage engine and generating an execution plan for the query

5. Copying to tmp table: the thread is executing the query and copying its result set to a temporary table, which is usually either a group by operation, a file sorting operation, or a union operation. If this state is followed by an on disk flag, it means that MySQL is putting a temporary memory table on disk

6. Sorting result: thread is sorting the result set

7. Sending data: a thread may transfer data between multiple states, or generate a result set, or want the client to return data.

3. Query cache

The main function of MySQL cache is to improve the efficiency of queries. Cache is stored in the form of hash tables of key and value. Key is a specific sql statement, and value is a collection of results. If you fail to hit the cache, go on to the parser and return it directly to the client if the cache is hit.

If you use query cache, it will lead to additional resource consumption during read and write operations, and cache will be added and invalidated frequently in an environment with more writes and less reads. The MySQL8.0 version begins to cancel the query cache.

Fourth, query optimization processing

The next step in the lifecycle of the query is to transform an SQL into an execution plan according to which MySQL interacts with the storage engine. This includes multiple sub-phases: parsing SQL, preprocessing, and optimizing SQL execution plan. Any error in this process may terminate the query.

1. Syntax parser and preprocessing: first, MySQL parses SQL statements through keywords and generates a corresponding "parsing tree". The MySQL parser validates and parses the query using MySQL syntax rules; the preprocessor further checks whether the parsed number is legal based on some MySQL rules.

two。 Query optimizer: when the syntax tree is considered legal, and the optimizer converts it into an execution plan. A query can be executed in many ways, all of which return the same result. The role of the optimizer is to find the best execution plan.

3. Execution plan: MySQL does not generate query bytecode to execute the query, MySQL generates an instruction tree of the query, and then completes the instruction tree through the storage engine execution and returns the results. The final execution plan contains all the information about the refactoring query.

Query execution engine

In the parsing and optimization phase, MySQL generates the corresponding execution plan for the query, and MySQL invokes the API of the storage engine to execute the query according to the execution plan generated by the optimizer.

Return the result to the client

To understand the performance and optimization strategy of select * from tb_article where author_id=20 and title='b';, the explain command is generally used for analysis.

MySQL explain

MySQL Query Optimizer gets an execution plan of Query in the database in the current state by executing the explain command. There are 10 columns of information from expain, which are id, select_type, table, type, possible_keys, key, key_len, ref, rows, Extra

The possible occurrence of these fields is explained below:

1. Id

The number of the id column is the serial number of the select, there are several id for every select, and the order of the id is increased in the order in which the select appears. MySQL divides select queries into simple queries and complex queries. Complex queries are divided into three categories: simple subqueries, derived tables (subqueries in from statements), and union queries.

2.select_type

(1) SIMPLE (simple SELECT, no UNION or subquery, etc.)

(2) PRIMARY (if the query contains any complex subsections, the outermost select is marked PRIMARY)

(3) UNION (the second or subsequent SELECT statement in UNION)

(4) DEPENDENT UNION (the second or subsequent SELECT statement in UNION, depending on the external query)

(5) UNION RESULT (result of UNION)

(6) SUBQUERY (the first SELECT in the subquery)

(7) DEPENDENT SUBQUERY (the first SELECT in the subquery, depending on the external query)

(8) DERIVED (SELECT of derived table, subquery of FROM clause)

(9) UNCACHEABLE SUBQUERY (the result of a subquery cannot be cached and the first line of the outer link must be reevaluated).

3. Table

This column indicates which table the row of explain is accessing. When there is a subquery in the from clause, the table column is in format, indicating that the current query depends on the query of id=N, so execute the query of id=N first. When there is union, the value of the table column of UNION RESULT is, and 1 and 2 represent the select row id that participates in union.

4. Type

Represents the way in which MySQL finds the required rows in the table, also known as "access type".

Common types are: ALL, index, range, ref, eq_ref, const, system, NULL (from left to right, poor performance to good)

ALL:Full Table Scan, MySQL will traverse the entire table to find matching rows

Index: Full Index Scan,index differs from ALL in that index only traverses the index tree

Range: only retrieve a given range of rows, using an index to select rows

Ref: indicates the join matching condition of the above table, that is, which columns or constants are used to find the value on the index column. Compared to eq_ref, instead of using a unique index, but using a partial prefix of a normal index or a unique index, the index compares to a value and may find multiple rows that match the criteria.

Eq_ref: similar to ref, the difference is that the index used is the unique index. For each index key value, only one record in the table matches. To put it simply, primary key or unique key is used as the association condition in multi-table joins.

Const, system: use these types of access when MySQL optimizes some part of the query and converts it to a constant. If you put the primary key in the where list, MySQL can convert the query to a constant. System is a special case of const type. Use system when the query table has only one row.

NULL: MySQL breaks up statements during optimization and executes without even accessing a table or index. For example, selecting a minimum value from an index column can be done through a separate index lookup.

5. Possible_keys

This column shows which indexes the query might use to find it.

When explain, it is possible that possible_keys has columns, while key displays NULL, which is because there is not much data in the table. MySQL thinks that the index is not very helpful to this query and chooses a full table query.

If the column is NULL, there is no associated index. In this case, you can improve query performance by checking the where clause to see if an appropriate index can be created, and then use explain to see the effect.

6. Key

The key column shows the key (index) that MySQL actually decides to use.

If no index is selected, the key is NULL. To force MySQL to use or ignore indexes in possible_keys columns, use FORCE INDEX, USE INDEX, or IGNORE INDEX in the query.

7. Key_len

Represents the number of bytes used in the index, which can be used to calculate the length of the index used in the query (the value displayed by key_len is the maximum possible length of the index field, not the actual used length, that is, the key_len is calculated based on the table definition, not retrieved within the table).

Without losing accuracy, the shorter the length, the better.

8. Ref

Represents the join matching criteria for the above table, that is, which columns or constants are used to find values on index columns.

9. Rows

Indicates that MySQL estimates the number of rows that need to be read to find the required records based on table statistics and index selection, which is not the number of rows in the result set.

10. Extra

This column contains the details of the MySQL resolution query, and there are several situations:

Using index: this occurs when the request columns on the table are all parts of the same index, and the returned column data only uses the information in the index without accessing the row records in the table, which is a high performance performance.

Using where: column data is returned from a table that only uses the information in the index and does not read the actual action, which occurs when all the request columns on the table are part of the same index, indicating that the mysql server will filter after the storage engine retrieves the rows.

Using temporary: indicates that MySQL needs to use temporary tables to store result sets, which are common in sorting and grouping queries.

The sort operation in Using filesort:MySQL that cannot be done with an index is called "file sort," and the results are sorted using an external index instead of reading rows from the table in index order. At this point, mysql browses all eligible records according to the join type, saves the sort keywords and row pointers, and then sorts the keywords and retrieves the row information in order. In this case, it is generally necessary to consider using indexes to optimize.

Using join buffer: the change emphasizes that the index is not used when getting the join condition and that a connection buffer is required to store the intermediate results. If this value occurs, it should be noted that indexes may need to be added to improve energy depending on the specific circumstances of the query.

Impossible where: this value emphasizes that the where statement results in no rows that match the criteria.

Execute explain statement

Explain select * from tb_article where author_id=20 and title='b'

As you can see, the execution of this SQL statement does not actually walk the index_title index, but chooses to take the index_author_id index.

Turn on the optimizer trace feature:

SET optimizer_trace= "enabled=on"; select * from information_schema.optimizer_trace\ G

The execution plan finally selects the index_author_id index because the cost of index_author_id is less than index_title. Here we need to introduce the cost calculation model of MySQL.

MySQL cost model

Total cost model: COST = CPU Cost + IO Cost

The cost model of MySQL divided into IO, CPU and Memory,MySQL5.7 in cost type is still being perfected. Although the cost of Memory has been collected, it has not been calculated in the final cost.

MySQL5.7 has done a lot of refactoring of the cost model in the source code, and the cost is divided into server layer and engine layer. The server layer is mainly the CPU cost, while the engine layer is mainly the IO cost. MySQL5.7 introduces two system tables, mysql.server_cost and mysql.engine_cost, to configure the cost of these two layers respectively.

The following analysis is based on MySQL5.7.10

Server_cost

1. Row_evaluate_cost (default 0.2) calculates the cost of qualified rows. The more rows, the greater the cost.

2. Memory_temptable_create_cost (default 2.0) the cost of creating temporary tables in memory

3. Row cost of memory_temptable_row_cost (default 0.2) memory temporary table

4. Cost of key_compare_cost (default 0.1) key comparison, such as sorting

5. Creation cost of internal myisam or innodb temporary tables in disk_temptable_create_cost (default 40.0)

6. Row cost of disk_temptable_row_cost (default 1. 0) internal myisam or innodb temporary tables

You can see that the cost of creating temporary tables is high, especially internal myisam or innodb temporary tables.

Engine_cost

1. Io_block_read_cost (default 1. 0) cost that reads data from disk. For innodb, it means reading the cost of a page from disk.

2. Memory_block_read_cost (default 1.0)

The cost of reading data from memory, for innodb, means reading a cost of page from buffer pool.

Currently, the default values of both io_block_read_cost and memory_block_read_cost are 1. In actual production, it is recommended to increase the memory_block_read_cost as appropriate, especially for ordinary hard disk scenarios.

Create a composite index index_title_author on table tb_article

ALTER TABLE tb_article ADD KEY index_title_author (`title`, `author_ id`); select * from tb_article where author_id=20 and title='b'

The cost of index_author_id is equal to that of index_title_author, and MySQL will give priority to indexes with fewer leaf blocks.

For SQL statements: select title, author_id from tb_article where author_id=20 and title='b'

MySQL will prefer the composite index index_title_author because index_title_author is an index coverage scan and does not need to return to the table, resulting in high performance.

At this point, the study of "what is the principle of MySQL optimization" is over. I hope to be able to solve your doubts. The collocation of theory and practice can better help you learn, go and try it! If you want to continue to learn more related knowledge, please continue to follow the website, the editor will continue to work hard to bring you more practical articles!

Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.

*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.