Network Security Internet Technology Development Database Servers Mobile Phone Android Software Apple Software Computer Software News IT Information

In addition to Weibo, there is also WeChat

Please pay attention

WeChat public account

Shulou

Analysis on the implementation of (transfer) MySQL ORDER BY

2025-01-21 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Database >

Share

Shulou(Shulou.com)06/01 Report--

Author: Sky.Jian | can be reprinted at will, but be sure to indicate the original origin of the article and the author's information and copyright notice in the form of hyperlink when reprinting.

Link: http://www.jianzhaoyang.com/database/mysql_order_by_implement

Generally speaking, there are two ways to sort ORDER BY in MySQL, one is to use ordered index to obtain ordered data, and the other is to sort the data in memory through the corresponding sorting algorithm.

The following will analyze the two sorting implementation methods and implementation diagrams through examples:

Suppose you have two tables, Table An and B, with the following structure:

[@ more@] sky@localhost: example01:48:21; showcreatetableAG

* * 1.row * * Table: ACreateTable: CREATETABLE`A` (`c1`int (11) NOTNULLdefault'0', `c2`char (2) defaultNULL, `c3`varchar (16) defaultNULL, `c4`datetimedefaultnull, PRIMARYKEY (`c1`)) ENGINE=MyISAMDEFAULTCHARSET=utf8

Sky@localhost: example01:48:32; showcreatetableBG

* * 1.row**Table: BCreateTable: CREATETABLE`B` (`c1`int (11) NOTNULLdefault'0', `c2`char (2) defaultNULL, `c3`varchar (16) defaultNULL,PRIMARYKEY (`c1`), key `B _ c2ind` (`c2`) ENGINE=MyISAMDEFAULTCHARSET=utf8

1, the use of ordered index for sorting, in fact, when the ORDER BY conditions of our Query and the Index index keys (or the previous index keys) used in the Query execution plan are exactly the same, and the index access mode is rang, ref or index, MySQL can directly obtain the sorted data by using the index order. ORDER BY in this way is basically the best way to sort, because MySQL does not need to do the actual sorting operation.

Suppose we perform the following SQL on Table An and B:

Sky@localhost: example01:44:28; EXPLAINSELECTA.* FROMA,BWHEREA.c12ANDA.c25ANDA.c2 = B.c2ORDERBYA.c1G

* * 1.row**id: 1select_type: SIMPLEtable: Atype: rangepossible_keys: PRIMARYkey: PRIMARYkey_len: 4ref: NULLrows: 3Extra: Usingwhere

* * 2.row**id: 1select_type: SIMPLEtable: Btype: refpossible_keys: B_c2_indkey: B_c2_indkey_len: 7ref: example.A.c2rows: 2Extra: Usingwhere; Usingindex

We can see from the execution plan that MySQL does not actually perform the actual sorting operation. In fact, the whole execution process is shown in the following figure:

2. Through the corresponding sorting algorithm, the data will be sorted in memory. MySQL ratio needs to sort the data in memory, and the memory area used is the sorting area that we set through the sort_buffer_size system variable. This sort area is unique to each Thread, so there may be multiple sort buffer memory regions in the MySQL at the same time.

The second method is called filesort in the execution plan given by MySQL Query Optimizer (viewed through the EXPLAIN command). In this way, mainly because there is no ordered index available to obtain ordered data, MySQL can only sort the obtained data in memory and then return the data to the client. There are actually two kinds of algorithms to implement filesort in MySQL. One is to take out the corresponding sorting fields and row pointer information that can directly locate row data according to the corresponding conditions, and then sort them in sort buffer. The other is to fetch all the fields that meet the criteria at once and sort them in sort buffer.

Before the MySQL4.1 version, there is only the first sorting algorithm, and the second algorithm is an improved algorithm starting from MySQL4.1, the main purpose of which is to reduce the IO operations that require two visits to table data in the first algorithm, changing two to one, but correspondingly consuming more sort buffer space. Of course, all subsequent versions of MySQL4.1 also support the first algorithm. MySQL mainly determines which sorting algorithm needs to be used by comparing the size of the system parameter max_length_for_sort_data set by us and the sum of the field types taken out by the Query statement. If the max_length_for_sort_data is larger, the second optimized algorithm is used, while the first algorithm is used. So if you want the ORDER BY operation to be as efficient as possible, be sure to set the max_length_for_sort_data parameter. There have been a large number of sorting waits in colleagues' database, resulting in high system load and long response time. Finally, it is found out that it is precisely because MySQL uses the traditional first sorting algorithm. After increasing the max_length_for_sort_data parameter value, the system load is greatly alleviated and the response is much faster.

Let's take a look at an example where MySQL needs to use filesort for sorting.

Suppose we change our Query to sort through A.c2, and then see what happens:

Sky@localhost: example01:54:23; EXPLAINSELECTA.* FROMA,BWHEREA.c12ANDA.c25ANDA.c2 = B.c2ORDERBYA.c2G

* * 1.row**id: 1select_type: SIMPLEtable: Atype: rangepossible_keys: PRIMARYkey: PRIMARYkey_len: 4ref: NULLrows: 3Extra: Usingwhere; Usingfilesort

* * 2.row**id: 1select_type: SIMPLEtable: Btype: refpossible_keys: B_c2_indkey: B_c2_indkey_len: 7ref: example.A.c2rows: 2Extra: Usingwhere; Usingindex

MySQL fetches the qualified data from Table A. Because the data obtained does not meet the ORDER BY condition, MySQL performs the filesort operation. The whole execution process is as follows:

Another strange limitation of filesort operation in MySQL is that its data source must come from one Table, so if our sorted data is obtained by two (or more) Table through Join, then MySQL must first create a temporary table (Temporary Table), and then sort the data of this temporary table, as shown in the following example:

Sky@localhost: example02:46:15; explainselectA.* fromA,BwhereA.c12andA.c25andA.c2 = B.c2orderbyB.c3G

* * 1.row**id: 1select_type: SIMPLEtable: Atype: rangepossible_keys: PRIMARYkey: PRIMARYkey_len: 4ref: NULLrows: 3Extra: Usingwhere; Usingtemporary; Usingfilesort

* * 2.row**id: 1select_type: SIMPLEtable: Btype: refpossible_keys: B_c2_indkey: B_c2_indkey_len: 7ref: example.A.c2rows: 2Extra: Usingwhere

The output of this execution plan is a little strange. For some reason, does MySQL Query Optimizer display the "Using temporary" process in the first line of the operation on Table A just to reduce the output of the execution plan by one line?

The actual execution process should be shown in the following figure:

Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.

Views: 0

*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.

Share To

Database

Wechat

© 2024 shulou.com SLNews company. All rights reserved.

12
Report