Network Security Internet Technology Development Database Servers Mobile Phone Android Software Apple Software Computer Software News IT Information

In addition to Weibo, there is also WeChat

Please pay attention

WeChat public account

Shulou

Detailed explanation of explain command example of execution plan in MySQL

2025-03-26 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Database >

Share

Shulou(Shulou.com)06/01 Report--

Preface

The explain command is the main way to see how the query optimizer decides to execute the query.

This feature is limited and does not always tell the truth, but its output is the best information available and is worth taking the time to understand, because you can learn how the query is executed.

Call EXPLAIN

Adding explain,mysql before select sets a flag on the query that returns information about each step in the execution plan when the query plan is executed instead of executing it.

It returns one or more lines of information, showing each part of the execution plan and the order of execution.

This is a simple explain effect:

Each table in the query has only one row in the output, and if the query is a join of two tables, there will be two rows in the output.

The alias form counts as a table, so if you join a table with yourself, there will be two rows in the output.

The meaning of "table" is quite broad here, and it can be a subquery, an union result, and so on.

At the same time, there are two varieties of explain

EXPLAIN EXTENDED tells the server to "reverse compile" the execution plan as a select statement.

You can see this generated statement by running show warnings immediately thereafter, which comes directly from the execution plan rather than the original SQL statement, which has become a data structure at this point.

In most scenarios, it is different from the original sentence, and you can check how Citigroup converts the sentence.

EXPLAIN EXTENDED is available in mysql version 5. 0 and an filtered column is added in 5. 1.

EXPLAIN PARTITIONS displays the partition that the query will access, if the query is based on the partition table.

It will exist in versions of mysql 5.1 or later.

EXPLAIN restrictions:

Explain won't tell you how triggers, stored procedures, or UDF affect queries at all.

Stored procedures are not supported, although queries can be extracted manually and explain them separately

It doesn't tell you about the specific optimizations that mysql made in the execution plan.

It does not display all the information about the execution plan of the query

It does not distinguish between things with the same name. For example, it uses "filesort" for both memory arrangement and temporary files, and displays "Using temporary" for both temporary tables on disk and in memory.

It can be misleading, for example, it displays a full index scan for a query with a small limit (explain of mysql 5.1 shows more accurate information about the number of rows checked, but earlier versions did not take into account limit)

Rewrite non-SELECT queries

Mysql explain can only interpret select queries, not stored program calls and insert, update, delete, or other statements. However, you can rewrite some non-select queries to take advantage of explain. To do this, you only need to convert the statement to an equivalent select that accesses all the same columns, and any column of any size must be in a select list, association clause, or where clause.

Suppose you want to rewrite the following update statement so that it can take advantage of explain

UPDATE sakila.actorINNER JOIN sakila.film_actor USING (actor_id) SET actor.last_update=film_actor.last_update

The following explain statement is not equivalent to the above update, because it does not require the server to get the last_update column from any table

This difference is very important. For example, the output shows that mysql will use the override index, but when retrieving and updating the last_updated column, the override index cannot be used, and the following rewriting is closer to the original statement:

Rewriting a query like this is not very scientific, but it is good enough to help understand how the query is done.

(MySQL 5.6will allow interpretation of non-SELECT queries)

When displaying a query plan, it is important to understand that there are no "equivalent" read queries for writing queries. A SELECT query only needs to find a copy of the data and return it. Any query that modifies data must find and modify all its copies on all indexes, which is often much more expensive than a seemingly equivalent SELECT query.

Columns in EXPLAIN

The meaning of each column in the explain result is shown in the next section.

The lines in the output appear in the order of the parts of the query actually executed by the mysql, which is not always the same as they were in the original SQL.

[ID column]

This column always contains a number that identifies the row to which the select belongs. If there is no subquery or union in the statement, then there will only be a unique select, so each row will display a 1 in this column, otherwise, the inner select statements will generally be numbered sequentially, corresponding to their position in the original statement.

Mysql classifies select queries into simple and complex types, and complex types can be divided into three categories: simple subqueries, so-called derived tables (subqueries in the from clause), and union queries.

Here is a simple subquery:

The subqueries and unions in the from clause add more complexity to the id column.

Here is a basic subquery in the from clause:

As you know, this query executes with an anonymous temporary table, which is referenced internally by the mysql alias (der) in the outer query, and the ref column can be seen in more complex queries.

Finally, here is a union query:

Notice the third additional row, where the results of union are always placed in an anonymous temporary table, and then mysql reads the results out of the temporary table, which does not appear in the original SQL, so its id is listed as null.

Compared to the previous example (in the from clause that demonstrates the subquery), the temporary table generated from this query appears in the last row of the result, not the first row.

So far these are pretty straightforward, but the mixing of these three types of statements can make the output very complex, as we'll see later.

[select_type column]

This column shows whether the corresponding row is a simple or complex select (if the latter, which of the three complex types it is). The simple value means that the query does not include subqueries and union, and if the query has any responsible subsections, the outermost part is marked as primary, and the other parts are marked as follows:

SUBQUERY

The select contained in the subquery in the select list (in other words, not in the from clause) is marked SUBQUERY

DERIVED

The DERIVED value is used to indicate that the select,mysql contained in the subquery of the FROM clause executes recursively and puts the results in a temporary table. The server internally calls it a derived table because the temporary table is derived from a subquery.

UNION

The second and subsequent select in UNION are marked unoin, and the first select is marked as if it were executed as a partial external query. This is why the first select in union in the previous example appears as primary. If union is included by a subquery in the from clause, its first select is marked derived.

UNION RESULT

The select used to retrieve the results from union's anonymous temporary table is marked UNION RESULT.

In addition to these values, SUBQUERY and UNION can also be marked as DEPENDENT and UNCACHEABLE.

DEPENDENT means that select relies on the data found in the outer query.

UNCACHEABLE means that some features in select prevent the result from being cached in an Item_cache.

Item_cache is not documented and is not the same thing as query caching, although it can be denied by some of the same types of artifacts, such as the RAND () function. )

[table column]

This column shows which table the corresponding row is accessing, and in general, it is quite clear: which table it is, or the list of that table (if an alias is defined in SQL).

You can observe the association order selected by mysql's association optimizer for the query from top to bottom in this column. For example, you can see that the association order selected by mysql in the following query is different from that specified in the statement:

The execution plan of mysql is always the depth-first tree on the left. If you put this plan down, you can read out the leaf nodes in order, which directly correspond to the rows in explain. The previous query plan looks like the following figure:

Derived tables and unions

When there is a subquery or union in the from clause, the table column becomes much more complex, and in these scenarios, there is really no "table" to refer to, because the anonymous temporary tables created by mysql only exist during query execution.

When there is a subquery in the from clause, the table column is the form of, where N is the id of the subquery. This is always a "forward reference"-in other words, N points to the later line in the explain output.

When there is a union, the table column of union result contains a list of id participating in the union. This is always a "backward reference" because union result appears after all participating lines in union, and if more than 20 id,table Ricanos in the list are truncated to prevent it from being too long, it is not possible to see all the values. Fortunately, you can still speculate about which lines are included, because you can see the id of the first line, and everything that occurs between this line and union result will be included in some way.

An example of a complex select type

Here is a meaningless query that we use here as a compact example of a complex select type

The limit clause is just for convenience, in case you plan to see the result without explain execution.

Here are some of the results of explain:

We deliberately let each query section access a different table so that we can figure out what the problem is, but it is still difficult to solve, starting from the above:

Line 1 refers to der_1 forward, and this query is marked as line 2 in the original SQL. To see which lines in the output refer to the select statement in, read on.

Line 2, whose id is 3 because it is part of the third select in the query, is classified as derived because it is nested inside the subquery in the from clause and is row 4 in the original sql.

Line 3 has an id of 2, which is line 3 in the original sql. Note that it is reasonable to imply execution after the line with a higher id. It is classified as DEPENDENT SUBQUERY, meaning that its results depend on the outer query (that is, a related subquery). The external query in this example starts at line 2, retrieving the select of the data from the der_1.

Row 4 is classified as union, meaning it is the second or later select in union, and its table is, which means that data is retrieved from a subquery of the clause from and attached to the temporary table of union. As before, to find the explain line that shows the query plan for this subquery, read on.

Line 5 is the der_ 2 subquery in line 8 of the original sql, which explain calls it.

Line 6 is a normal subquery in the select list with an id of 7, which is very important.

…… Because it is larger than 5, and 5 is the id of line 7. Why is it important? Because it shows the boundaries of the subquery. When explain outputs a line with a select type of derived, it indicates the beginning of a "nested range". If the id of subsequent lines is smaller (in this case, 5 is less than 6), it means that the nesting range has been turned off. This lets us know that line 7 is the part of the select list from which the data is retrieved-- for example, the first part of the select list on line 4 (line 7 in the original sql). This example is quite easy to understand, and there is no need to know the meaning and rules of nested ranges, although sometimes it is not so easy. Another thing to note about this line in the output is that it is listed as UNCACHEABLE SUBQUERY because of the user variable.

The last line, union result, represents the phase in which rows are read from the temporary table of union. You can start with this line and reverse it, and if you like, it returns the row results where id is 1 and 4, which refer to and, respectively.

As you can see, these complex combinations of select types can make the output of explain quite difficult to understand, and understanding the rules will make it easier, but it will still take a long time.

Reading the output of explain often requires jumping around in the list. For example, if you look at the first line of output and just stare at it, you can't know that it is part of union. You won't understand it until you see the last line.

[type column]

The mysql user manual says that this column shows the "association type", but we think the more accurate term is the access type-in other words, mysql decides how to look up the rows in the table. Here are the most important access methods, from the worst to the best:

ALL:

This is called a full table scan, which means that mysql must scan the entire table, from beginning to end, to find the desired rows. (there is an exception, such as using limit in a query, or displaying "Using distinct/not exists" in the extra column.

Index:

This is the same as a full table scan, except that mysql scans tables in index order rather than rows, and its main advantage is to avoid sorting; the biggest disadvantage is that it has to bear the overhead of reading the entire table in index order. This usually means that if the rows are accessed in random order, it will be very expensive.

If you see "Using index" in the extra column, mysql is using index overrides. It scans only the data of the index, not every row in index order, which is much less expensive than a full table scan in index order.

Range:

A range scan is a limited index scan that starts with a point in the index and returns rows that match that range, which is better than a full index scan, because it does not have to traverse all indexes, the obvious scan is a query with between or > in the where clause.

When mysql uses an index to find a series of values, such as in () and or lists, it also appears as a range scan, however, the two are actually quite different access types, with important differences in performance.

The cost of this type of scan is comparable to that of the index type.

Ref:

This is an index access (sometimes called index lookup) that returns all rows that match a single value. However, it may find multiple rows that match the criteria, so it is a mixture of lookup and scanning. This type of index access occurs only when a non-unique index or a non-unique index prefix is used. It is called ref because the index is compared to a reference value. This reference value is either a constant or a result value from the previous table of the multi-table query.

Ref_or_null is a variant of ref, which means that mysql must do a second search in the results of the initial search to find the null entry.

Eq_ref:

With this index lookup, mysql knows that at most one qualified record is returned, and this access method can be seen when mysql uses a primary key or unique index lookup, which compares them to a reference value. Mysql optimizes this type of access very well because it knows that there is no need to estimate the range of matching rows or to continue searching after finding matching rows.

Const,system:

When mysql can optimize a part of a query and convert it to a constant, he uses these access types. For example, if you select the primary key of a row by putting the primary key of that row into the where clause, mysql can convert the query to a constant and then efficiently remove the table from the join execution.

Null:

This access means that mysql can decompose query statements during the optimization phase, without even having to access tables or indexes during the execution phase. For example, selecting a minimum value from an index column can be done by looking up the index separately, without having to access the table at execution time.

[possible_keys column]

This column shows which indexes the query can use, based on the columns accessed by the query and the comparison operators used. This list is created early in the optimization process, so some of the listed indexes may not be useful for subsequent optimization processes.

[key column]

This column shows which index mysql decides to use to optimize access to the table. If the index does not appear in the possible_keys column, then mysql chooses it for another reason-- for example, it may choose an override index, even if there is no where clause.

In other words, possible_keys reveals which indexes contribute to efficient lookups, while key shows which indexes are optimized to minimize query costs. Here is an example:

[key_len column]

This column shows the number of bytes used by mysql in the index. If mysql is using only certain columns in the index, you can use this value to calculate which columns. Remember, mysql 5.5 and previous versions can only use the leftmost prefix of the index. For example, if the primary key of film_actor is two smallint columns and each smallint column is two bytes, then each item in the index is 4 bytes. The following is an example of a query:

Based on the key_len column in the result, it can be inferred that the query uses the unique first column, the actor_id column, to perform an index lookup. When we calculate the column usage, we must take the character set page in the character column into account.

View the execution plan:

The average length of this query is 13 bytes, which is the total length of columns an and b, column an is 3 characters, each column under utf8 is at most 3 bytes, and column b is a 4-byte integer.

Mysql does not always show how much an index actually uses; for example, if a like query is performed on a prefix pattern match, it shows that the full width of the column is being used.

The key_len column shows the maximum possible length in the index field, rather than the actual number of bytes used by the data in the table. In the previous example, mysql always displays 13 bytes, even if column a happens to contain only one character length. In other words, the key_len is calculated by looking up the definition of the table, not the data in the table.

[ref column]

This column shows the columns or constants that the previous table used to find values in the index of the key column record. Here is an example of a combination of association conditions and aliases. Note that the ref column reflects how the film table aliases f in the query text:

[rows column]

This column is mysql's estimate of the number of rows to be read in order to find the required rows. This number is the number of loops in the embedded loop association plan, that is, it is not the number of rows that mysql thinks it will eventually read from the table, but the average number of rows that mysql must read in order to find those rows that meet the criteria at each point of the query. This standard includes the conditions given in sql and the current columns from the previous table in the join order. )

Depending on the statistics of the table and the selection of the index, this estimate may be inaccurate, and it does not reflect the limit clause in mysql5.0 and earlier versions. For example, the following query does not actually check 1057 rows.

By multiplying the values of all the rows columns, you can roughly estimate the number of rows that the entire query will check; for example, the following query will check about 2600 rows.

Keep in mind that this number is the number of rows that mysql thinks it wants to check, not the number of rows in the result set, and recognize that there are many optimizations, such as associative buffers and caches, that cannot affect the display of the number of rows, mysql may not have to actually read all the rows it estimates, and it doesn't know anything about the operating system or hardware cache.

[extra column]

This column contains additional information that is not suitable for display in other columns. Most of the values that can appear here are recorded in the mysql user manual.

The most common and important values are as follows.

"Using index"

This value indicates that mysql will use an override index to avoid accessing the table. Don't confuse the override index with the index access type.

"Using where"

This means that the mysql server will filter after the storage engine retrieves the rows, and many where conditions involve columns in the index, and when (and if) it reads the index, it can be verified by the storage engine, so not all queries with where clauses display "Using where". Sometimes the emergence of "Using where" is a hint that queries can benefit from different indexes.

"Using temporary"

This means that mysql uses a temporary table when sorting query results.

"Using filesort"

This means that mysql uses an external index to sort the results instead of reading rows from the table in indexed order. Mysql has two file sorting algorithms, both of which can be done in memory or on disk. Explain won't tell you which file sorting mysql will use, nor will it tell you whether the sorting will be done in memory or on disk.

"Range checked for each record (index map: n)"

This means that there is no useful index, the new index will be re-evaluated on each row of the join, N is the bitmap of the index displayed in the possible_keys column, and it is redundant.

Output in tree format

Mysql users often prefer to format the output of explain into a tree to show the execution plan more accurately. In fact, the way explain looks at the execution plan is a bit clumsy, and the tree structure is not suitable for tabulated output, and the disadvantage is more obvious when there are a large number of values in the extra column, as well as using union. Union is not quite the same as other types of joins that mysql can do, and it is not suitable for explain.

An execution plan using a tree structure is also feasible if you have a good understanding of the rules and features of explain. But it's a little boring, and it's best to leave it to automated tools, and Percona Toolkit includes pt-visual-explain, which is one such tool.

Improvements in MySQL 5.6

Mysql5.6 will include an important improvement to explain: the ability to interpret queries like update, insert, and so on, and although you can convert dml statements into equivalent "select" queries and explain, the results do not fully reflect how the statements are executed, so this is still very helpful. We have tried to use that technique when developing pt-upgrade similar to that in Percona Toolkit, and we have found more than once that when converting a query to select, the optimizer does not follow the code path we expected. Therefore, it is very helpful for us to understand what happens during execution by explain a query without the need to convert it to select.

Mysql5.6 will also include a series of changes to the query optimization and execution engine to allow anonymous temporary tables to be materialized as late as possible, rather than always creating and populating them when optimizing and executing partial queries that use this temporary table, which will allow mysql to interpret query statements for subqueries directly without actually executing subqueries first.

Finally, mysql5.6 will improve the relevant parts of the optimizer by adding optimization tracking to the server. Zhejiang allows users to view the choices that the optimizer sits out, as well as the input (for example, the cardinality of the index) and the reasons for the choice. This is very helpful, not only for understanding the execution plan selected by the server, but also for why you chose it.

Summary

The above is the whole content of this article, I hope that the content of this article has a certain reference and learning value for your study or work, if you have any questions, you can leave a message and exchange, thank you for your support.

Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.

Views: 0

*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.

Share To

Database

Wechat

© 2024 shulou.com SLNews company. All rights reserved.

12
Report