2025-04-06 Update From: SLTechnology News&Howtos
Introduction
Xiao A was busy writing code when DBA Xiao B suddenly sent a message: "Look at your user-information table T. Its primary key, the auto-increment id, has already reached 1.6 billion. How long has this table been around? Before long the primary key will go out of range and inserts will start to fail."
Xiao A remembered the table holding at most a bit over 10 million rows; a COUNT confirmed it was about 11 million. It turned out the DBA had been reading the table's auto_increment value, which implied a large volume of deletes and inserts, yet most of Xiao A's operations were updates. How could this happen?
Without further ado, let's look at the details.
Problem troubleshooting
This table backs a simple interface service. Every day the big-data team computes a large batch of information and pushes it to Xiao A, who writes it to the database: new records are inserted, existing ones are updated. The external interface only does queries.
Xiao A quickly reviewed his own code: nowhere did it delete rows or explicitly insert or update the id. So how could this happen? Could it be Xiao B's fault? Unlikely. The DBA manages many tables, and a problem on that side would have surfaced long ago. Xiao A had no clue where the problem was.
Xiao A carefully examined the existing ten-million-plus rows, using the insertion time and the id as the main observation fields. Soon he spotted something: the first row inserted each day always had an id more than 10 million higher than the previous day's, sometimes more, sometimes less. Xiao A pointed the finger at DBA Xiao B and described the problem to him.
Xiao B asked Xiao A, "Did you use a REPLACE INTO ... statement?" What does that have to do with it? It turns out that REPLACE INTO ... affects the primary key.
The effect of REPLACE INTO ... on the primary key
Suppose there is a table t1:
CREATE TABLE `t1` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT COMMENT 'ID, auto-increment',
  `uid` bigint(20) unsigned NOT NULL DEFAULT '0' COMMENT 'user uid',
  `name` varchar(20) NOT NULL DEFAULT '' COMMENT 'user nickname',
  PRIMARY KEY (`id`),
  UNIQUE KEY `uididxuid` (`uid`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COMMENT='test replace into';
If you create this table and execute the following statements, what will the final records look like?
INSERT INTO t1 VALUES (NULL, 100, 'test1'), (NULL, 101, 'test2');
REPLACE INTO t1 VALUES (NULL, 100, 'test3');
It turns out that with REPLACE INTO ..., if a row with the same unique-key value already exists, that row is deleted and the new row is inserted in its place, so the auto-increment id grows even though the real intent may have been to update the existing row.
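Tracing the two statements above: the REPLACE finds that uid 100 already exists, deletes the old row, and inserts a fresh one that consumes a new id. A sketch of the expected result (the exact ids assume a freshly created table):

```sql
SELECT * FROM t1;
-- id | uid | name
--  2 | 101 | test2
--  3 | 100 | test3   -- the old (1, 100, 'test1') row was deleted and re-inserted as id 3
-- the table's AUTO_INCREMENT is now 4
```

This is why heavy use of REPLACE on a mostly-update workload makes the auto-increment counter race ahead of the actual row count.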
"I know replace works like that, which is exactly why I didn't use it," Xiao A said, and checked again. Sure enough, it wasn't his problem: he had not used REPLACE INTO anywhere.
Xiao A double-checked carefully but still found nothing, so he asked Xiao B to inspect the binlog for anything strange. Nothing substantive turned up there either: the id did indeed jump, but no offending statement could be seen.
(In the binlog rows, the value of @1 corresponds to the auto-increment primary key id, and (@2, @3) correspond to the unique index.)
After a while, Xiao B pointed Xiao A in a direction: his upsert statement, INSERT ... ON DUPLICATE KEY UPDATE ... . Xiao A dug in for a long time and, sure enough, the problem was right there.
The effect of INSERT ... ON DUPLICATE KEY UPDATE ... on the primary key
This statement is similar to REPLACE INTO ..., but it does not change the primary key of the existing record. For the table t1 above, what is the result of executing the following statement?
INSERT INTO t1 VALUES (NULL, 100, 'test4') ON DUPLICATE KEY UPDATE name = VALUES(name);
Yes, just as Xiao A expected, the primary key did not change and the name field was updated as intended. But there was a hint in the execution result that caught Xiao A's attention:
No errors; 2 rows affected, taking 10.7ms
Clearly only one row was updated, so why is the affected-row count 2? Xiao A then looked at the table's auto_increment again:
CREATE TABLE `t1` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT COMMENT 'ID, auto-increment',
  `uid` bigint(20) unsigned NOT NULL DEFAULT '0' COMMENT 'user uid',
  `name` varchar(20) NOT NULL DEFAULT '' COMMENT 'user nickname',
  PRIMARY KEY (`id`),
  UNIQUE KEY `uididxuid` (`uid`)
) ENGINE=InnoDB AUTO_INCREMENT=5 DEFAULT CHARSET=utf8 COMMENT='test replace into';
It is 5, when it ought to be 4 here.
In other words, this statement, like REPLACE INTO ..., bumps the auto-increment id by 1 even though no new row is actually added. Why?
After some research, Xiao A learned about the MySQL parameter innodb_autoinc_lock_mode, which governs auto-increment allocation for the primary key. It has three possible values: 0, 1, and 2. It was introduced in MySQL 5.1 with a default of 1; earlier versions all behave like mode 0.
You can check which mode is currently in effect with the following statement:
SELECT @@innodb_autoinc_lock_mode;
The database Xiao A uses also has the default value of 1. For a simple insert (where the number of rows to insert can be determined up front), auto_increment is simply incremented without locking the table, which improves performance. For a statement like INSERT INTO ... SELECT ..., where the number of rows to insert is not known in advance, the table must be locked (with a special table-level lock called the AUTO-INC lock) so that the auto_increment values come out right, and the lock is not released until the statement ends. There is also a category called mixed-mode inserts, for example INSERT INTO t1 (c1, c2) VALUES (1,'a'), (NULL,'b'), (5,'c'), (NULL,'d'), where some rows explicitly specify the auto-increment primary key value and some do not. The INSERT ... ON DUPLICATE KEY UPDATE ... we are discussing also belongs to this category. In this case InnoDB analyzes the statement and then allocates auto-increment ids for as many rows as the statement could possibly insert. The following example shows what that means:
TRUNCATE TABLE t1;
INSERT INTO t1 VALUES (NULL, 100, 'test1'), (NULL, 101, 'test2'), (NULL, 102, 'test2'),
                      (NULL, 103, 'test2'), (NULL, 104, 'test2'), (NULL, 105, 'test2');
-- at this point the table's next auto-increment id is 7
DELETE FROM t1 WHERE id IN (2, 3, 4);
-- now only rows 1, 5 and 6 remain; the next auto-increment id is still 7
INSERT INTO t1 VALUES (2, 106, 'test1'), (NULL, 107, 'test2'), (3, 108, 'test2');
-- what is the next auto-increment id now?
After the above runs, the table's next auto-increment id is 10. Did you guess correctly? The last statement is a mixed-mode insert, so InnoDB analyzes it and reserves three ids (7, 8, 9), making the next id 10, even though the three reserved ids are not necessarily all used. (Thanks to reader @always-late for pointing out that an earlier version of this article misread the official documentation here.)
Mode 0 takes the AUTO-INC table lock for every insert, no matter what kind, and releases it when the statement finishes; auto_increment is incremented only if a row is actually added.
Mode 2 never takes the AUTO-INC lock in any situation, which brings a safety problem: with the binlog format set to statement mode, a replica may produce results inconsistent with the source when it replays the log. A complex insert may still be executing when another insert arrives and interleaves its id allocations, but the replica replays statements one at a time, so the concurrent interleaving cannot be reproduced and the record ids end up mismatched.
At this point the id-jump problem is explained. Because innodb_autoinc_lock_mode is 1, INSERT ... ON DUPLICATE KEY UPDATE ... counts as a statement whose affected row count can be computed in advance, so auto_increment is incremented by 1 (or by the row count, if more than one) whether the rows end up inserted or merely updated.
If you change innodb_autoinc_lock_mode to 0 and run the INSERT ... ON DUPLICATE KEY UPDATE ... again, you will find that auto_increment does not grow: in this mode an AUTO-INC lock is taken and held until the statement finishes, and if the statement does not actually add a row, the auto-increment id is not advanced.
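One practical detail worth noting: innodb_autoinc_lock_mode is a read-only system variable, so it cannot be changed with SET at runtime. A sketch of the configuration change (assuming you can restart the instance):

```sql
-- this fails at runtime: the variable is read-only
-- SET GLOBAL innodb_autoinc_lock_mode = 0;

-- instead, set it in my.cnf and restart mysqld:
-- [mysqld]
-- innodb_autoinc_lock_mode = 0

-- after the restart, verify the mode:
SELECT @@innodb_autoinc_lock_mode;
```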
Why does INSERT ... ON DUPLICATE KEY UPDATE ... return 2 when it affects only one row?
Why is this? In theory the number of affected rows should be 1. Let's look at the description in the official documentation:
With ON DUPLICATE KEY UPDATE, the affected-rows value per row is 1 if the row is inserted as a new row, 2 if an existing row is updated, and 0 if an existing row is set to its current values.
The documentation is explicit: an insert counts as 1 affected row, an update counts as 2, and 0 means the row already existed with identical values before and after. Is that hard to understand?
In fact, think of it as a way to tell the two cases apart: a return value of 1 means the row was inserted, and 2 means it was updated.
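You can observe this distinction from SQL itself via ROW_COUNT(). A sketch, assuming the t1 table above already contains uid 100:

```sql
INSERT INTO t1 VALUES (NULL, 100, 'test5')
  ON DUPLICATE KEY UPDATE name = VALUES(name);
SELECT ROW_COUNT();
-- 2 if uid 100 existed and its name changed,
-- 1 if the row was freshly inserted,
-- 0 if uid 100 existed and name was already 'test5'
```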
Solution
Setting innodb_autoinc_lock_mode to 0 would certainly solve the problem, but it could significantly hurt insert concurrency, so the DBA would never agree, and Xiao A knew it. After some thought, he prepared two other candidate solutions:
Modify business logic
Modify the business logic to split the INSERT ... ON DUPLICATE KEY UPDATE ... statement apart: query first, then update. This guarantees the primary key will not grow out of control, but it adds complexity: what used to be one request may become two, first checking whether the row exists, then updating or inserting.
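A minimal sketch of the split logic, using the t1 table above (note that without a transaction or unique-key upsert there is a race window between the check and the write, so two concurrent writers for the same uid could both attempt the insert):

```sql
-- step 1: check whether the row already exists
SELECT id FROM t1 WHERE uid = 100;

-- step 2a: if a row came back, update it; auto_increment is untouched
UPDATE t1 SET name = 'test6' WHERE uid = 100;

-- step 2b: otherwise insert, consuming exactly one auto-increment id
INSERT INTO t1 (uid, name) VALUES (100, 'test6');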
Drop the table's auto-increment primary key
Drop the auto-increment primary key and let the unique index serve as the primary key. This requires almost no code changes, as long as the current auto-increment key has no practical use. Insert and delete performance may suffer somewhat, but since the workload is dominated by queries, Xiao A preferred this option after comparing the two.
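A sketch of the schema change, assuming the id column is truly unused by any consumer (test on a copy first; changing the clustered index rewrites the whole table, which takes time on tens of millions of rows):

```sql
-- remove AUTO_INCREMENT first so the primary-key column can be dropped
ALTER TABLE t1 MODIFY `id` int(10) unsigned NOT NULL;

-- drop the old key and column, promote the unique key to primary key
ALTER TABLE t1
  DROP PRIMARY KEY,
  DROP COLUMN `id`,
  DROP KEY `uididxuid`,
  ADD PRIMARY KEY (`uid`);
```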
Conclusion
Actually, Xiao A had noticed long ago that INSERT ... ON DUPLICATE KEY UPDATE ... reports 2 affected rows, but he didn't stay curious; he shrugged it off and never dug into it. This deep investigation surfaced a lot of new knowledge, which is a good thing. It seems Xiao A still needs to stay curious and alert to small anomalies in order to keep improving.
Summary
That is the whole content of this article. I hope it is of some reference and learning value for your study or work. If you have any questions, feel free to leave a message, and thank you for your support.