Why is SELECT * inefficient in MySQL?

Updated 2025-01-18. From: SLTechnology News&Howtos

Shulou (Shulou.com), 06/01

Why is SELECT * inefficient in MySQL? Many developers know the rule against it but not the reasons behind it, so this article summarizes those reasons step by step.

I. Reasons for low efficiency

First, look at what the MySQL section of the latest Alibaba Java Development Manual (Taishan Edition) says:

4-1. [Mandatory] In a table query, never use * as the field list of the query; the required fields must be named explicitly.

Description:

It increases the parsing cost of the query parser.

Adding or removing fields easily becomes inconsistent with the resultMap configuration.

Unneeded fields increase network consumption, especially fields of type text.

The manual states these reasons only briefly, so let's take a closer look:

1. Unwanted columns increase data transfer time and network overhead

With SELECT *, the database has to resolve more objects, fields, permissions, attributes, and related metadata. For complex SQL statements with many hard parses, this places a heavy burden on the database.

It also increases network overhead: * sometimes drags in unneeded large text fields such as log and IconMD5 by mistake, and the amount of data transferred grows sharply. If the database and the application are not on the same machine, this overhead is obvious.

Even when the MySQL server and the client are on the same machine, if they still communicate over TCP, the transfer takes extra time.
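A minimal sketch of the difference, assuming a hypothetical table app_info with one large text column (the table and all column names are illustrative and do not come from the manual):

```sql
-- Hypothetical table: a few small columns plus one large text column.
CREATE TABLE app_info (
    id        BIGINT      NOT NULL PRIMARY KEY,
    name      VARCHAR(64) NOT NULL,
    version   VARCHAR(16) NOT NULL,
    log_text  TEXT
);

-- Sends every column to the client, including the large log_text value.
SELECT * FROM app_info WHERE id = 42;

-- Sends only the columns the application actually uses.
SELECT id, name, version FROM app_info WHERE id = 42;
```

The second query also keeps working unchanged if columns are later added to the table, which is the resultMap concern raised in the manual.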

2. Unneeded large fields such as varchar, blob, and text add I/O operations

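More precisely, in InnoDB's COMPACT row format only the first 768 bytes of a long value are kept in the row itself; the rest is stored in separate overflow pages, so reading such a record adds I/O operations. In the DYNAMIC row format a long value may be stored entirely off-page. (MySQL InnoDB)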
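How InnoDB lays out long values depends on the table's row format, which can be checked in information_schema. A sketch, where app_info is a hypothetical table name:

```sql
-- Show the row format (for example Compact or Dynamic) of one table
-- in the current schema. 'app_info' is a placeholder table name.
SELECT table_name, row_format
FROM information_schema.tables
WHERE table_schema = DATABASE()
  AND table_name = 'app_info';
```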

3. It rules out the MySQL optimizer's covering index strategy

SELECT * usually rules out a covering index, yet the covering index strategy of the MySQL optimizer is an extremely fast and efficient query optimization that is highly recommended in the industry.

For example, take a table t (a, b, c, d, e), where a is the primary key and column b has an index.

There are then two B+ trees on disk: the clustered index and the secondary index (secondary indexes include single-column and composite indexes), which store (a, b, c, d, e) and (b, a) respectively. If the WHERE condition can filter out part of the records through the index on column b, the query goes to the secondary index first. If the user only needs columns a and b, the requested data can be read directly from the secondary index.

If the user runs SELECT * and fetches columns that are not needed, the rows are first filtered through the secondary index and then every column is fetched through the clustered index. That is one more B+ tree lookup per row, so it is bound to be slower.

Because a secondary index holds much less data than the clustered index, in many cases a covering index on the secondary index (one from which all the columns the user needs can be obtained) can be served from memory without reading the disk, whereas the clustered index data is more likely to be on disk (depending on the size and hit ratio of the buffer pool). In that case one access is a memory read and the other is a disk read, and the difference in speed is very significant, often orders of magnitude.
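The difference can be observed with EXPLAIN. The following is a sketch for the table t described above; the column types and the index name idx_b are assumptions made for illustration:

```sql
-- Table from the example: a is the primary key, b has a secondary index.
-- Column types and the index name are assumed.
CREATE TABLE t (
    a INT NOT NULL PRIMARY KEY,
    b INT NOT NULL,
    c INT,
    d INT,
    e INT,
    KEY idx_b (b)
);

-- Only a and b are requested. Both are stored in idx_b (InnoDB secondary
-- indexes carry the primary key), so the Extra column of the EXPLAIN
-- output is expected to show "Using index".
EXPLAIN SELECT a, b FROM t WHERE b = 10;

-- All columns are requested. c, d and e are not in idx_b, so each
-- matching entry needs an extra lookup in the clustered index.
EXPLAIN SELECT * FROM t WHERE b = 10;
```

On a small or empty table the optimizer may pick a different plan, so the comparison is most meaningful on realistic data.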

II. Extension of indexing knowledge

Secondary indexes were mentioned above. In MySQL, secondary indexes include single-column indexes and composite (multi-column) indexes. Single-column indexes need no further discussion, so here we look at what composite indexes do.

Composite index (a, b, c)

A composite index (a, b, c) in effect provides three indexes: (a), (a, b), and (a, b, c).

We can think of a composite index as the first-, second-, and third-level table of contents of a book. For the index (a, b, c), a is the first-level entry, b is a second-level entry under a, and c is a third-level entry under b. To use an entry you must first go through its parent entry, except at the first level.
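The table-of-contents analogy can be sketched as follows. The table t_abc, its column types, and the index name idx_a_b_c are hypothetical:

```sql
-- Hypothetical table with one composite index on (a, b, c).
CREATE TABLE t_abc (
    id INT NOT NULL PRIMARY KEY,
    a  INT NOT NULL,
    b  INT NOT NULL,
    c  INT NOT NULL,
    d  INT,
    KEY idx_a_b_c (a, b, c)
);

-- These conditions start from the leftmost column, so idx_a_b_c can be
-- used to locate the matching entries.
SELECT id FROM t_abc WHERE a = 1;
SELECT id FROM t_abc WHERE a = 1 AND b = 2;
SELECT id FROM t_abc WHERE a = 1 AND b = 2 AND c = 3;

-- These conditions skip column a, so the index cannot be used to seek
-- directly to the matching entries.
SELECT id FROM t_abc WHERE b = 2;
SELECT id FROM t_abc WHERE b = 2 AND c = 3;
```

Running EXPLAIN on each statement shows which plan the optimizer actually chooses; depending on the MySQL version and the data, it may still scan the index in the last two cases rather than seek into it.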

Advantages of composite indexes

1) Lower overhead

Building one composite index (a, b, c) is in effect equivalent to building three indexes: (a), (a, b), and (a, b, c). Every additional index adds write overhead and disk space, so for tables with a large amount of data a composite index greatly reduces this overhead.

2) Covering index

For the composite index (a, b, c), suppose you have the following SQL:

SELECT a, b, c FROM t WHERE a = 'xx' AND b = 'xx'

MySQL can then get the data directly by traversing the index, without going back to the table, which saves many random I/O operations. Reducing I/O, especially random I/O, is the main optimization strategy of a DBA. In practice, therefore, covering indexes are one of the main means of improving performance.

3) High efficiency

The more columns the index has, the fewer rows remain after filtering through the composite index. For example, take a table with 10 million rows and the following SQL:

SELECT col1, col2, col3 FROM tbl WHERE col1 = 1 AND col2 = 2 AND col3 = 3

Assumption: each condition narrows the data down to 10% of the rows.

a. With only a single-column index, that index narrows the data to 10 million × 10% = 1 million rows. MySQL then goes back to the table to find the rows matching col2 = 2 and col3 = 3 among those 1 million rows, and only then sorts, paginates, and so on.

b. With a composite index on (col1, col2, col3), the three index columns narrow the data to 10 million × 10% × 10% × 10% = 10,000 rows, so the gain in efficiency is easy to imagine.
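The 10% figure is only an assumption. On a real table, the selectivity of each column and of the combination can be measured before deciding on the index. A sketch, where tbl stands in for the table in the example and the index name is hypothetical:

```sql
-- Fraction of distinct values per column: the closer to 1, the more
-- selective the column. Rows containing NULL in a counted column are
-- ignored by COUNT(DISTINCT ...).
SELECT
    COUNT(DISTINCT col1) / COUNT(*)             AS col1_selectivity,
    COUNT(DISTINCT col2) / COUNT(*)             AS col2_selectivity,
    COUNT(DISTINCT col3) / COUNT(*)             AS col3_selectivity,
    COUNT(DISTINCT col1, col2, col3) / COUNT(*) AS combined_selectivity
FROM tbl;

-- Composite index for case b; the name idx_col1_col2_col3 is illustrative.
ALTER TABLE tbl ADD INDEX idx_col1_col2_col3 (col1, col2, col3);
```

Note that this query scans the whole table, so on a very large table it is better run against a replica or a sample.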

Are more indexes always better?

Of course, the answer is no.

Tables with a small amount of data do not need indexes; building them only adds overhead.

Do not index columns that are rarely referenced; since they are seldom used, an index on them makes little sense.

Do not index columns that are updated frequently, because this will certainly slow down inserts and updates.

Columns whose values are heavily duplicated and evenly distributed gain little from an index (for example, a gender column with only two values is not suitable for indexing).

Data changes require index maintenance, so the more indexes there are, the higher the maintenance cost.

More indexes also need more storage space.
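Two of these points, low-cardinality columns and storage cost, can be checked on an existing table. A sketch, where user_profile is a hypothetical table name; reading mysql.innodb_index_stats requires sufficient privileges and assumes InnoDB persistent statistics are enabled:

```sql
-- List the indexes of a table. The Cardinality column is an estimate of
-- the number of distinct values; a very low value (such as 2 for a gender
-- column) suggests the index filters poorly.
SHOW INDEX FROM user_profile;

-- Approximate on-disk size of each index. The 'size' statistic is
-- counted in pages, so it is multiplied by the page size.
SELECT index_name,
       stat_value * @@innodb_page_size AS size_bytes
FROM mysql.innodb_index_stats
WHERE database_name = DATABASE()
  AND table_name = 'user_profile'
  AND stat_name = 'size';
```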

That covers the reasons why SELECT * is inefficient in MySQL, along with the related background on indexes.
