In addition to Weibo, there is also WeChat
Please pay attention
WeChat public account
Shulou
2025-01-17 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Internet Technology >
Share
Shulou(Shulou.com)06/01 Report--
This article mainly introduces "how to realize count (*) of MySQL". In daily operation, I believe that many people have doubts about how to realize count (*) of MySQL. The editor consulted all kinds of materials and sorted out simple and easy-to-use operation methods. I hope it will be helpful to answer the doubts of "how to achieve count (*) of MySQL". Next, please follow the editor to study!
1. Background
Since we have a higher pursuit of the stability of the MySQL database, there are often questions about whether there is a correct posture for the operation of count (*), or is there anything that can be optimized?
But the answer is cruel, if you have used the correct index, then basically there is no place to optimize. Once a slow query appears, it is a slow query. If you want to change it, you can only count it yourself or do it through other search platforms.
Today, let's take a look at why this is the case, and answer some of the puzzles you will encounter every day.
2. The implementation of count (*) is said to be that the MyISAM engine stores the total number of rows of a table on disk, so it is very efficient to return this number directly when count (*) is executed. But our mysql generally uses the Innodb engine, how does Innodb realize the count operation? The InnoDB engine is more troublesome. When it executes count (*), it needs to read the data out of the engine line by line and accumulate the count. So, as we have more and more records in our table, count (*) becomes slower and slower.
Of course, what we are talking about here is without where conditions, and if we take where conditions, MyISAM is also very slow.
3. The correct way to open it well, first of all, it is not recommended to use count (*) to do statistics-related business on mysql, especially when the table is very large. Well, if the business is relatively small and needs to be launched quickly, then at least make sure that count (*) is equipped with scientific where conditions, and then the table has been scientifically indexed. 1) if count (*) carries the where condition and can override the index, it can be done occasionally. 2) if count (*) takes the where condition, it can go to the index, but needs to return to the table, then this will be more dangerous, especially as the table size expands, it will be a mine after all. 3) if it is pure count (*), or if the where condition does not have any index, it is never recommended! For statistical services, several recommended practices: 1) with self-increasing id, you can use the maximum id to approximately obtain 2) self-counting.
3) aggregation on other data analysis platforms
4. Can you use table statistics instead of count (*) some students ask if they can use the statistics of system tables instead of count in their daily use.
The answer is no. The tableRows here is just a reference value.
The table statistics here are actually obtained using show table status. How did you get this value? We need to understand the sampling and statistical methods of mysql. Why do we need sampling and statistics? Because taking out the whole table row by row statistics, although you can get accurate results, but the cost is too high, so you can only choose "sampling statistics" (so in fact, mysql itself does not have a good method of count (*)). When sampling and counting, InnoDB will select N data pages by default, count the different values on these pages, get an average, and then multiply by the number of pages of the index to get the cardinality of the index. The data table is constantly updated, and the index statistics are not fixed. Therefore, when the number of rows changed exceeds 1max M, it will automatically trigger a redo of index statistics.
Therefore, the estimated value of this sample is very accurate. How accurate is it? official documents say the error could reach 40% to 50%.
4. About those strange count (?)
When looking at some old code queries, we often see count (1), count (id), count (field) and other ways, so they struggle with which is better or worse, is there any difference in performance?
Here, we first need to figure out the semantics of count (). Count () is an aggregate function, and the returned result set is judged row by row. If the parameter of the count function is not NULL, the cumulative value is added by 1, otherwise it is not added. Finally, the cumulative value is returned. 1) the count (primary key id) InnoDB engine traverses the entire table, takes out the id value of each row, and returns it to the server layer. After the server layer gets the id, it determines that it is impossible to be empty, and accumulates by line. 2) the count (1) InnoDB engine traverses the entire table, but does not take values. Server layer for each row returned, put a number "1" in, the judgment is impossible to empty, accumulate by line. 3) count (field) if the "field" is defined as not null, read out the field line by line from the record, determine that it cannot be null, and accumulate by line; if the definition of "field" is allowed to be null, then when executing, it is determined that it may be null, and it is necessary to take out the value and judge that it is not null. 4) count (*) does not take out all the fields, but specially optimizes them without taking values. Count (*) is definitely not null, accumulating by line. So the conclusion is: sorted by efficiency, count (field)
Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.
Views: 0
*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.
Continue with the installation of the previous hadoop.First, install zookooper1. Decompress zookoope
"Every 5-10 years, there's a rare product, a really special, very unusual product that's the most un
© 2024 shulou.com SLNews company. All rights reserved.