How to deeply understand select (*) 07/06 Update SLTechnology News&Howtos

How to deeply understand select (*)

2025-07-06 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Database >

Shulou(Shulou.com)05/31 Report--

In this issue, the editor will bring you about how to deeply understand select count (*). The article is rich in content and analyzes and narrates it from a professional point of view. I hope you can get something after reading this article.

SELECT COUNT (*) FROM t is a common SQL requirement. In the usage specification of MySQL, we generally use transaction engine InnoDB as the storage engine of (general business) tables. Under this premise, the time complexity of COUNT (*) operation is O (N), where N is the number of rows of the table.

On the other hand, the number of rows in the MyISAM table can be fetched quickly. What is the mechanism behind these practical experiences, and why it is necessary / can be so, is what this article wants to discuss.

Let's take a look at the overview: some of the problems of MySQL COUNT (*) in the two storage engines:

With these issues, the InnoDB storage engine is mainly discussed below.

1. InnoDB full table COUNT (*)

Main questions:

What is the implementation process?

How to calculate count? What are the factors that affect the results of count?

Where is the count value stored? What is the data structure involved?

Why can InnoDB implement count (*) only by scanning the table? (see the questions in this article)

What is the risk of full table COUNT (*) as a case for table scan type operations?

Is it possible for the COUNT (*) operation to read the overflow pages involved in large fields like SELECT *?

1. Execution frame-Loop: read + count

1.1 basic conclusions

Full table scan, a loop to solve the problem.

Inside the loop: read a row before deciding whether the row is counted as count.

Within the loop, there is a line for counting.

1.2 description

Simple SELELCT-SQL implementation framework, analogy INSERT INTO. SELECT is the same process.

The following steps refine how to read and count (count++).

two。 Execution process

Quote: the implementation process is divided into four parts:

COUNT (*) pre-flow: to pave the way for some of the following statements before issuing SQL statements from the Client side to the MySQL-Server side before executing the SELECT.

COUNT (*) process: the process framework at the code level and the key call stack parts of the two core steps are briefly given.

Read one line: visibility and row_search_mvcc functions that describe how visibility affects COUNT (*) results.

Counting a row: whether the Evaluate_join_record and column are empty, describes how the counting process affects the COUNT (*) result.

If the reader wants to see directly how to do COUNT (*), you can also ignore (1) and skip to (2) to start reading.

2.1 COUNT (*) pre-process recall-send SQL from Client to sub_select function

To make the call less obtrusive, let's recall how it was executed to the sub_select function:

The 1.MySQL-Client sends SQL statements, which are sent in packets according to the MySQL communication protocol.

The 2.Mysql-Server side receives the data packet, and the protocol parses the command type (QUERY) and the SQL statement (string).

The 3.SQL statement parses the objects output to the JOIN class by the parser and is used to express the SQL statement structurally.

PS: the JOIN structure here is not only a pure grammatical structure, but also has been semantically processed. Roughly speaking, it summarizes the grammatical structure of table (table_list), target column (target_list), WHERE condition, subquery and so on. In the full table COUNT (*)-case, table_list = [table "t" (alias is also "t"), target_list = [target column object (column name is "COUNT (*)"), of course, there are no WHERE conditions, subqueries, and other structures.

The 4.JOIN object has two important methods: JOIN::optimize () and JOIN::exec (), which are used for query optimization and query execution, respectively.

Join- > optimize (), optimization phase (this point will be covered by the full table count (*) operation under myisam later). Join- > exec (), execution phase (key), including the execution flow of the full table count (*) operation under InnoDB.

5. Join-> exec () after several calls, the sub_select function is called to execute a simple SQL, including COUNT (*).

6.END of sub_select .

2.2 COUNT (*) process (in the sub_select function)

The upper layer of the process and code is relatively simple, concentrated in the sub_select function, in which two types of functions correspond to the two steps described in the previous "execution Framework" section-reading and counting. The conclusions are as follows:

1. Read one line: after a call from the relative top-level sub_select function, all branches will eventually be called into the row_search_mvcc function, which is used to read a line from the B+-tree structure stored by the InnoDB storage engine to a buf (uchar *) in memory for subsequent processing.

two。 This will involve the acquisition of row locks, MVCC, and row visibility. Of course, for snapshot reads such as SELECT COUNT (*), only MVCC and its visibility are involved, not row locks. For more information, skip to the "visibility and row_search_mvcc functions" section.

3. Count one line: at the code level, the lines read will be evaluated in the evaluate_join_record function to see if they should be counted in count (that is, whether or not to count++).

To put it simply, COUNT (arg) itself is a function operation of MySQL. For a row, if the value of the parenthesized parameter arg (a column or the whole row) is not NULL, count++, will not count the row otherwise. For more information, skip to the section "whether Evaluate_join_record and column are empty".

The effects of these two stages on the results of COUNT (*) are as follows: (two-layer filtering)

The code related to the SQL layer process framework is summarized as follows:

Q: at the code level, the * * step (reading a line) has 2 branches. Why?

A: from the InnoDB interface level, it is divided into "read * line" and "read next line", which are two different execution processes. To read * line, you need to find a (cursor) location and do some initialization work to make the subsequent process recursive.

Just as if we use a script / program to scan the table line by line, the following two SQL will be involved in the implementation:

/ / SELECT id FROM t LIMIT 1; OR SELECT MIN (id)-1 FROM t;-> $last_id// SELECT id FROM t WHERE id > $last_id LIMIT 1

Specifically related to the code of this example, the call relationship between the SQL layer and the storage engine layer. The call stack in the read phase is as follows: (for reference)

We can see that no matter which branch is read, it ends up in the same way as the row_search_mvcc function.

The above is a brief description of the code in LOOP. Let's see how row_search_mvcc and evaluate_join_record output the final count result.

2.3 Line visibility and row_search_mvcc function

Here we mainly look at the impact of row visibility on COUNT (*) through a set of case and several questions.

Q: for SELECT COUNT (*) FROM t or SELECT MIN (id) FROM t operations, do the * row reads read the smallest record in table t (in the leftmost leaf node page of the B+ tree)? (why does ha_index_first also call row_search_mvcc to get the minimum key value?)

A: not necessarily. Even MIN (id) does not necessarily read the smallest row of id, because there is also a problem with row visibility. In fact, index_read fetches the smallest index record visible by the statement within the current transaction. This also reflects that join_read_first and join_read_next "go the same way" to row_search_mvcc as mentioned earlier.

Q: for the question in the figure *, if transaction X is an RU (Read-Uncommitted) isolation level, and C-Insert (100) is completed during the execution of X-count (*) (only 5 or 10 records are scanned), can X-count (*) see 100 records in the subsequent reading process after the transaction C-Insert (100) is completed?

A:MySQL adopts the strategy of "what you read is what you read", that is, X-count (*) can read 100 records later.

2.4 whether the evaluate_join_record and column are empty

Q: how does a certain line count into count?

A: lines read are counted in count in two cases:

1. If the parameter in the COUNT function is a column, it will determine whether the column definition in the read row is Nullable and whether the value of the column is NULL;. If both are yes, it will not be counted as count, otherwise it will be counted as count.

E.g. SELECT COUNT (col_name) FROM t

Col_name can be primary key, unique key, non-unique key, non-indexed field

2. If there is a * in the COUNT, it will determine whether the whole line of this part is NULL. If the judgment parameter is NULL, the line will be ignored, otherwise count++.

E.g-1. SELECT COUNT (*) FROM t

E.g-2. SELECT COUNT (B.*) FROM A LEFT JOIN B ON A.id = B.id

Q: in particular, what about SELECT COUNT (id) FROM t, where the id field is the primary key of table t?

A: it is equivalent to COUNT (*) in effect. Because both COUNT (*) and COUNT (pk_col) have primary keys that fully determine that the request data is not NULL, such COUNT expressions can be used to get the currently visible number of table rows.

Q: optimized operation of InnoDB COUNT (*) at the user level

A: this problem is familiar to the industry. Scanning a non-empty unique key can get the number of rows of the table, but the number of bytes involved may be much less (when the length of the table is much different from the length of the primary key and the unique key), the relative IO cost is much lower.

The relevant call stack references are as follows:

2. Data structure:

Which memory variable is the Q:count value stored in?

After A:SQL parsing, it is stored in the item expressing COUNT (*), ((Item_sum_count*) item_sum)-> count

Review the JOIN structure we mentioned earlier in the "COUNT (*) pre-process" section as shown in the figure below.

That is, the SQL parser structures each SQL statement and expresses it in a JOIN object (join). A list result_field_list is created and populated in the object to hold the result column, and each element in the list is an Item_result_field* object (pointer) of the result column.

In COUNT (*)-case, the result column list contains only one element, (Item_sum_count: public Item_result_field) type object (name = "COUNT (*)"), where the unique member variable count of this class is the request.

3. MyISAM full table COUNT (*)

As the MyISAM engine is not often used in real business, it is briefly described as follows:

Hongmeng official Strategic Cooperation to build HarmonyOS Technology Community

The MyISAM-COUNT (*) operation is an operation with O (1) time complexity.

Each MyISAM table stores a meta information-count value, one in memory and one in the file. The value of the count variable in memory is initialized by reading the count value in the file.

SELECT COUNT (*) FROM t directly reads the value of the count variable corresponding to table t in memory.

The count value in memory and the count value in the file are updated by the write operation, and their consistency is guaranteed by the table-level lock.

The write serialization guaranteed by table-level locks makes the read operations of all user threads either locked or see only one data state at the same time.

Fourth, a few questions

Where did Q:MyISAM and InnoDB start to go their separate ways in the execution of the COUNT (*) operation?

Commonness: the commonness exists in the SQL layer, that is, the data structure after SQL parsing is consistent, and the count variable exists in the Item_sum_count type object as the result column; the process returned to the client is similar-the count variable is assigned and returned to the client through the MySQL communication protocol.

Difference: the count value of InnoDB is calculated during the SQL execution phase, while the MyISAM table itself contains the meta information of the table row_count value in memory. In the SQL optimization phase, a hint is given to the optimizer through the tag of the storage engine, indicating that the storage engine used for the table saves the exact number of rows, which can be obtained directly without entering the executor.

Why can't a row_count variable be maintained like MyISAM in Q:InnoDB?

A: you can see the reason from the MVCC mechanism and row visibility problems. Each transaction may see different rows and its count (*) results may be different; conversely, the MySQL-Server side cannot provide a unified reading view to all user threads at the same time, so it cannot provide a unified count value.

PS: for multiple user threads accessing MySQL (COUNT (*)), there are several factors that determine their respective results:

The data state (initial data state) before the execution of a set of transactions.

The execution sequence of transactions with overlapping time (operation timing, transaction theory shows that serializability of concurrent transaction operations is a necessary condition for correctness).

The isolation level of each transaction (input for each operation).

Among them, 1 and 2 are global or controllable for Server, and only 3 is a unique attribute of the transaction in each user thread, which is uncontrollable on the Server side, so the Server side is also uncontrollable for each COUNT (*) result.

Q:InnoDB-COUNT (*) is a table scan operation, will it crowd out the hot pages needed by other user threads in the existing Buffer Pool from the LRU-list, so that other user threads need to load from disk once more, suddenly increasing IO consumption, which may block existing requests?

A:MySQL has such an optimization strategy that the page of the load of the table scan operation is placed at the junction of the oung/old of the LRU-list (the tail of the LRU is about 3 and 8). In this way, the hot pages needed by the user thread are still in the LRU-list-young area, while the pages with continuous load operations will continue to wash out the pages in the old area, which are considered to be non-hot pages, so it is relatively logical.

PS: personally, I think there is a similar optimization idea, which is to limit the size of the Buffer Pool used in scanning operations to O (1), but doing so requires additional memory management costs.

Does Q:InnoDB-COUNT (*) read overflow pages that store large fields, if any, as SELECT * FROM t does?

A: no. Because InnoDB-COUNT (*) only needs to count rows, and the primary key of each row is definitely not NULL, you only need to read the row data in the primary key index page without reading additional overflow pages.

Blog.didiyun.com/index.php/2019/01/08/mysql-count/ above is the editor for you to share how to understand select count (*), if you happen to have similar doubts, you might as well refer to the above analysis to understand. If you want to know more about it, you are welcome to follow the industry information channel.

Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.

*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.