Why is the performance of INSERT so poor? This article walks through the analysis of that question and some possible answers, in the hope of giving readers who face the same problem a simple, workable way to approach it.
Recently I noticed that insert performance on some systems is not very good. Admittedly, the underlying physical storage is not fast: while critical systems elsewhere moved to SSDs long ago, we have not yet made that jump. But there is another side to it: even in places that do pay for SSD devices, insert performance may still be unimpressive, and switching to SSDs may show no visible difference at all. Setting aside the cases where the data volume is simply small, the database itself also needs tuning for inserts.
So let's analyze what problems an insert can run into. The discussion here is not limited to any one database but kept fairly general; where a problem is specific to a particular database, that will be pointed out.
Question 1: do we use an auto-increment key or a hashed key for inserted data?
This is actually a good question. Some people say that auto-increment insertion matches the physical storage layout of certain databases, so lookups are fast; others say hashed insertion is fast, that scattering the KEY across the index and then inserting must surely beat inserting in order.
Personally, I do not like the word "surely" very much; after all these years, it sits firmly on my list of unreliable words.
Let's analyze whether that is really the case. With an auto-increment key, hotspot issues appear when a large amount of data is inserted. Take MySQL as an example.
Thread 1:
INSERT INTO table (...) VALUES (...);
Thread 2:
INSERT INTO table SELECT ... FROM table2;
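To make the example concrete, here is a hedged sketch of the two sessions; the table and column names (orders, orders_staging, customer_id, amount) are hypothetical and not from the original text:
-- Session 1: a simple single-row insert.
INSERT INTO orders (customer_id, amount) VALUES (42, 99.90);
-- Session 2: a "bulk insert" whose row count is not known in advance.
INSERT INTO orders (customer_id, amount)
SELECT customer_id, amount FROM orders_staging;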
Take a look at the statements above: what happens if they run at the same time while the table still uses MySQL's auto-increment primary key?
Yes, the auto-increment primary-key hotspot, the problem MySQL was criticized for under heavy insert load in earlier versions (5.5 and before). So how did MySQL address it later? This is where the three settings of MySQL's auto-increment lock mode come in; the one most of us choose now is:
innodb_autoinc_lock_mode = 2
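As a minimal sketch, here is how to inspect the setting and what the three modes mean (the variable itself is a real MySQL system variable; it is read-only at runtime and has to be changed in the configuration file):
-- Check the current auto-increment lock mode:
SHOW VARIABLES LIKE 'innodb_autoinc_lock_mode';
-- Set it in my.cnf and restart the server:
-- [mysqld]
-- innodb_autoinc_lock_mode = 2
-- 0 = traditional: every insert takes a table-level AUTO-INC lock held to the end of the statement
-- 1 = consecutive: only bulk inserts (INSERT ... SELECT, LOAD DATA) take the table-level lock
-- 2 = interleaved: no table-level AUTO-INC lock at all, the most concurrent option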
What is the downside of this choice? Obviously, the generated IDs can end up non-consecutive (with gaps) under heavy insert load.
What else can we see from the statements above? The INSERT ... SELECT statement shows "Using where; Using temporary" in its plan. Why? Think about it: if the SELECT part touches a large amount of data, is that a good thing for a high-frequency, write-heavy system where execution speed comes first? A brief explanation follows at the end of the article.
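As a hedged sketch of how to see this for yourself (the table names orders and orders_archive are hypothetical), MySQL 5.6 and later can EXPLAIN an INSERT, which exposes the plan of the SELECT part:
-- Inspect the plan of the SELECT part of an INSERT ... SELECT:
EXPLAIN
INSERT INTO orders_archive
SELECT * FROM orders WHERE created_at < '2020-01-01';
-- If the source and the target are the same table, MySQL generally has to
-- materialize the selected rows into a temporary table first, which shows up
-- as "Using temporary" in the Extra column and means the statement pays for
-- an extra full copy of the selected rows.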
Beyond that, consider the case where we do not use auto-increment at all and instead generate primary keys the way MongoDB does (things like UUIDs are effectively hashes too); for this discussion we can treat MongoDB's ObjectId as a hash-like, unordered key.
MongoDB's default primary key is built from several parts: a Unix timestamp, a machine identifier, a random value, and so on (in the classic layout, a 4-byte timestamp, a 3-byte machine id, a 2-byte process id and a 3-byte counter). To piggyback on a popular topic: if you are thinking about something like the snowflake algorithm, MongoDB's ObjectId generation scheme is worth studying. ObjectId takes a lot into account, so even across hundreds of millions of documents the keys will not collide.
This also reflects how MongoDB stores its data, which is similar to how HEAP tables are stored; that is partly a consequence of its largely non-transactional nature (its transactions are at best a useful supplement for certain data operations and cannot be compared with those of a traditional database).
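On the MySQL side, the hash/random-key alternative discussed above looks roughly like the following minimal sketch (MySQL 8.0+; the table name events is hypothetical). A random key removes the single auto-increment hotspot, but it scatters writes across the clustered index, which has its own insert cost:
-- A table keyed by a UUID stored compactly as BINARY(16):
CREATE TABLE events (
    id   BINARY(16)   NOT NULL PRIMARY KEY,
    body VARCHAR(255) NOT NULL
);
-- Each insert lands at a (pseudo-)random position in the clustered index:
INSERT INTO events (id, body)
VALUES (UUID_TO_BIN(UUID()), 'example row');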
So, to sum up, today we touched on a few issues:
1 How data is inserted is related to the way the primary key is generated.
2 The speed of data insertion is related to how the INSERT statement is written.
3 Data insertion is related to the extra structures involved (indexes, foreign keys, per-row metadata, the design and storage of pages) (not covered this time).
4 Data insertion is related to extra function calls or extra per-row information computed during the insert (not covered this time).
5 How data is inserted is related to the database's logging (not covered this time).
The points not covered here will be discussed in a later installment.
Finally, in a system with high-frequency inserts, the HOT tables should be held to strict requirements whatever the database: table design, primary-key design, the layout of the inserted rows, and index design all deserve deliberate consideration. If a statement like INSERT INTO ... SELECT shows up in such a high-frequency system, I am generally not optimistic about it, because during the insert some database systems take locks such as next-key locks on the tables involved. A sketch of one safer alternative follows.
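As a hedged sketch of one common alternative (table and column names are hypothetical): copy rows in small, bounded primary-key ranges rather than in one giant INSERT ... SELECT, so that each statement holds its locks only briefly:
-- Copy in bounded primary-key ranges; repeat with the next range until done.
INSERT INTO orders_archive (id, customer_id, amount)
SELECT id, customer_id, amount
FROM   orders
WHERE  id > 0 AND id <= 10000;
-- next batch: WHERE id > 10000 AND id <= 20000; and so on.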
That is the analysis of why INSERT performance can be so poor; I hope it helps you track down similar problems in your own systems.