
Why is SELECT * inefficient?

2025-01-29 Update From: SLTechnology News&Howtos

Shulou(Shulou.com) 06/03

This article explains the reasons why SELECT * is inefficient. The explanation is short and practical.

First, the reasons for the low efficiency:

It increases the parsing cost for the query parser.

Adding or removing fields can easily leave the query inconsistent with the resultMap configuration.

Useless fields increase network consumption, especially fields of type text.

The development manual mentions these reasons only in general terms, so let's take a closer look.

1. Unwanted columns increase data transfer time and network overhead

With SELECT *, the database has to resolve more objects, fields, permissions, attributes, and related metadata. For complex SQL statements that require many hard parses, this puts a heavy burden on the database.

It also increases network overhead. SELECT * sometimes pulls in useless, large text fields such as log and IconMD5 by mistake, and the amount of data transferred can grow many times over. If the database and the application are not on the same machine, this overhead is obvious.

Even if the MySQL server and the client are on the same machine, the protocol is still TCP, so communication takes extra time.
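The size difference described above can be sketched with a small experiment. This is a minimal illustration, not a MySQL benchmark: it uses Python's built-in sqlite3 module as a stand-in database, and the table app, its columns, and the helper payload_bytes are hypothetical names chosen for the example.

```python
import sqlite3

# In-memory SQLite database as a stand-in; the article itself is about MySQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE app (id INTEGER PRIMARY KEY, name TEXT, log TEXT)")

# 100 rows, each carrying a 10,000-character text column the caller does not need.
conn.executemany(
    "INSERT INTO app (name, log) VALUES (?, ?)",
    [(f"app{i}", "x" * 10_000) for i in range(100)],
)

def payload_bytes(sql):
    """Rough result-set size: total text length of every returned value."""
    return sum(len(str(value)) for row in conn.execute(sql) for value in row)

star_bytes = payload_bytes("SELECT * FROM app")
narrow_bytes = payload_bytes("SELECT id, name FROM app")

print("SELECT *        :", star_bytes)
print("SELECT id, name :", narrow_bytes)
```

The gap between the two numbers is exactly the unused log column, which is the data that would otherwise travel over the network on every query.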

2. Useless large fields such as varchar, blob, and text add I/O operations

To be exact, when a value is too long to be kept in the row (in the older COMPACT row format, InnoDB keeps only the first 768 bytes of such a column in the row), the excess data is stored in separate overflow pages, so reading the record adds an I/O operation. (MySQL InnoDB)

3. SELECT * loses the chance to use the MySQL optimizer's "covering index" strategy

SELECT * rules out a covering index, yet the covering index strategy of the MySQL optimizer is extremely fast and efficient, and it is a query optimization method highly recommended by the industry.

For example, take a table t(a, b, c, d, e), where a is the primary key and b has an index. There are then two B+ trees on disk, the clustered index and the secondary index (secondary indexes include single-column and composite indexes), which store (a, b, c, d, e) and (b, a) respectively. If the WHERE condition can filter out part of the records through the index on column b, the query goes to the secondary index first. If the user needs only columns a and b, the result can be read directly from the secondary index. If the user runs SELECT * and fetches columns that are not needed, the rows are first filtered through the secondary index and then all columns are fetched through the clustered index. That is one more B+ tree lookup, which is bound to be slower.

Because a secondary index holds much less data than the clustered index, a query covered by the secondary index (all the columns the user needs can be obtained from the index) often does not need to read the disk at all and is served directly from memory. The clustered index, by contrast, is much more likely to have the needed data on disk (depending on the size and hit ratio of the buffer pool). In that case one query is a memory read and the other is a disk read, and the difference in speed is very significant, almost an order of magnitude.
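The difference between the two access paths can be observed in a query plan. The sketch below is an illustration under assumptions: it uses Python's built-in sqlite3 module rather than MySQL, reuses the example table t(a, b, c, d, e) with a hypothetical index name idx_b, and relies on SQLite's EXPLAIN QUERY PLAN. In MySQL, the comparable signal is "Using index" in the Extra column of EXPLAIN.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# a is the primary key, b has a secondary index, as in the example above.
conn.execute(
    "CREATE TABLE t (a INTEGER PRIMARY KEY, b INTEGER, c TEXT, d TEXT, e TEXT)"
)
conn.execute("CREATE INDEX idx_b ON t (b)")

def plan(sql):
    """Return the planner's description of how it would run the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)  # column 3 holds the detail text

# Only a and b are requested: the index alone can answer the query.
covered_plan = plan("SELECT a, b FROM t WHERE b = 1")
# All columns are requested: the index finds the rows, then the table is read.
star_plan = plan("SELECT * FROM t WHERE b = 1")

print(covered_plan)
print(star_plan)
```

The first plan reports a covering index, while the second uses the same index only to locate rows and must then visit the table for the remaining columns.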

Second, more on indexes

Secondary indexes were mentioned above. In MySQL, secondary indexes include single-column indexes and composite (multi-column) indexes. Single-column indexes are not discussed here; this section covers what a composite index does.

A composite index (a, b, c) in effect provides three indexes: (a), (a, b), and (a, b, c). You can think of it as the table of contents of a book: a is the first-level directory, b is the second-level directory under a, and c is the third-level directory under b. To use a directory, you must first use its parent directory, except for the first-level directory.

The advantages of a composite index are as follows:

1) Lower overhead. Building one composite index (a, b, c) is equivalent to building the three indexes (a), (a, b), and (a, b, c). Each additional index increases the overhead of write operations and disk space, so for tables with a large amount of data, a composite index greatly reduces that overhead.

2) Covering index. For a composite index (a, b, c), if a query selects only columns contained in the index, MySQL can get the data directly by traversing the index without going back to the table, which removes a lot of random I/O. Reducing I/O, especially random I/O, is in fact a DBA's main optimization strategy, so in practice a covering index is one of the main means of improving performance.

3) Higher efficiency. The more columns in the index, the fewer rows are left after filtering through it. For example, a table with 10 million rows has the following SQL: select col1,col2,col3 from table where col1=1 and col2=2 and col3=3; Assume that each condition keeps 10% of the rows.

a. With only a single-column index, that index narrows the data to 10,000,000 × 10% = 1,000,000 rows. MySQL then goes back to the table to find the rows matching col2=2 and col3=3 among those 1,000,000 rows, and then sorts, paginates, and so on.

b. With a composite index on (col1, col2, col3), the three index columns narrow the data to 10,000,000 × 10% × 10% × 10% = 10,000 rows, so the gain in efficiency is easy to imagine.
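The arithmetic in cases a and b can be checked with a few lines of Python. This is only a sketch of the article's simplified model, in which every condition independently keeps one row in ten; the helper name rows_after is hypothetical.

```python
TOTAL_ROWS = 10_000_000  # the table size used in the example

def rows_after(total, conditions, keep_one_in=10):
    """Rows left after applying `conditions` filters that each keep 10%."""
    remaining = total
    for _ in range(conditions):
        remaining //= keep_one_in  # integer division avoids float rounding
    return remaining

# Case a: a single-column index applies only the col1 condition.
single_column = rows_after(TOTAL_ROWS, 1)
# Case b: a composite index applies all three conditions inside the index.
composite = rows_after(TOTAL_ROWS, 3)

print(single_column, composite)
```

Under this model the composite index leaves one hundredth as many candidate rows as the single-column index, and in case a the remaining rows still have to be checked by going back to the table.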

Is it better to build as many indexes as possible? The answer is naturally no.

Tables with a small amount of data do not need indexes; building them only adds overhead.

Do not index columns that are rarely referenced. Because they are seldom used, an index on them brings little benefit.

Do not index columns that are updated frequently, because the index slows down inserts and updates.

Do not index fields whose values are heavily duplicated and evenly distributed, because such an index is not very effective (for example, a gender field with only two values, male and female, is not suitable for indexing).

Data changes require index maintenance, which means that the more indexes there are, the higher the maintenance cost.

More indexes also need more storage space.

At this point, you should have a deeper understanding of why SELECT * is inefficient. You might as well try it out in practice.
