This article explains the data loss that can occur when both the master and the slave of a MySQL replication pair accept writes. The analysis is quite practical, so it is shared here; hopefully you will get something out of it. Let's take a look.
I. The origin of the problem
Not long ago, a department reported lost data updates in its MySQL database. To track the problem down, we analyzed the scenarios in which the database was used, and found that the loss was most likely caused by users writing to both the master and the slave.

The figure shows examples of writes to the database under normal and abnormal circumstances. During a deeper investigation, the DBA then reconstructed how the failure caused the data loss:

The reconstructed fault sequence is shown in figure 1-2. As the figure shows, once synchronization from DP to DA is restored, the step-3 write on DP overwrites the step-4 write on DA. The two machines end up with inconsistent data, and some updates are lost.
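The overwrite sequence above can be condensed into a toy simulation (node names and data structures are invented for illustration; this is not MySQL code). The key assumption, justified later in this article, is that row-based replay on a table with a primary key locates the row by that key and overwrites it without checking the before image:

```python
# Hypothetical sketch of the fault timeline described above.
# Replication replays the peer's binlog row events by primary-key lookup,
# overwriting whatever the target node currently holds.

def replay_by_primary_key(table, after_image):
    """Row-based replay with a PK lookup: the before image is NOT compared."""
    table[after_image["id"]] = dict(after_image)

# Steps 1-2: both nodes start from the same row.
da = {1: {"id": 1, "name": "name2"}}   # DA (one node)
dp = {1: {"id": 1, "name": "name2"}}   # DP (the other node)

# Step 3: replication breaks; a write lands on DP...
dp[1]["name"] = "name3"
dp_binlog = [{"id": 1, "name": "name3"}]

# Step 4: ...while a later write lands on DA.
da[1]["name"] = "name4"

# Step 5: replication from DP to DA resumes; DP's event replays on DA
# by primary key, silently overwriting DA's newer value.
for event in dp_binlog:
    replay_by_primary_key(da, event)

print(da[1]["name"])  # name3 -> the "name4" update is lost
```

The "name4" update vanishes without any replication error, which is exactly the symptom the users reported.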
Here, readers will likely have a question. After step 4 the row is (id: 1, name: 'name4'), and the statement written in step 3 is update t20200709 set name='name3' where id = 1 and name='name2'. When synchronization resumes in step 5, that statement should not execute successfully on DA: the WHERE condition no longer matches. And the binlog generated by DP does record the WHERE condition of the SQL statement, so from any point of view the step-3 statement should not replay successfully.
### UPDATE `test`.`t20200709`
### WHERE
###   @1=1 /* INT meta=0 nullable=0 is_null=0 */
###   @2='name2' /* VARSTRING(255) meta=255 nullable=1 is_null=0 */
### SET
###   @1=1 /* INT meta=0 nullable=0 is_null=0 */
###   @2='name3' /* VARSTRING meta=255 nullable=1 is_null=0 */
# at 684315240
So is this a bug in MySQL itself, or is it MySQL's normal behavior under certain parameters or conditions? This article gives a detailed explanation and analysis of the problem.
2. Replaying the relay log in Row format

2.1 BEFORE IMAGE, AFTER IMAGE, and the binlog_row_image parameter
Before answering the question raised at the start of this article, we need to look at how the relay log is replayed. In general, when a DML statement changes data in the database, the binlog records the event description, the BEFORE IMAGE, the AFTER IMAGE, and other information. Two concepts need to be introduced first:

1. BEFORE IMAGE: the before image, i.e. what the data looked like before the modification.

2. AFTER IMAGE: the after image, i.e. what the data looks like after the modification.
For ease of understanding, here is a binlog example. Suppose there is a table t20200709 with the following data:

mysql> select * from t20200709;
+----+-------+
| id | name  |
+----+-------+
|  1 | name4 |
+----+-------+
1 row in set (0.00 sec)

Then execute the SQL statement update t20200709 set name = 1 where id = 1:

mysql> update t20200709 set name = 1 where id = 1;
Query OK, 1 row affected (0.00 sec)
Rows matched: 1  Changed: 1  Warnings: 0
Then take a look at the records in Binlog:
# 200715 17:28:28 server id 15218  end_log_pos 4034114356 CRC32 0xe4dedec0  Update_rows: table id 4034114356 flags: STMT_END_F
### UPDATE `test`.`t20200709`
### WHERE
###   @1=1 /* INT meta=0 nullable=0 is_null=0 */
###   @2='name4' /* VARSTRING(255) meta=255 nullable=1 is_null=0 */
### SET
###   @1=1 /* INT meta=0 nullable=0 is_null=0 */
###   @2='1' /* VARSTRING(255) meta=255 nullable=1 is_null=0 */
# at 400
As you can see, before the modification the value of the name field was name4, shown in the binlog's WHERE section as @2='name4', while the modified value '1' is shown in the SET section. So the BEFORE IMAGE is the part of the binlog between WHERE and SET, and the AFTER IMAGE is the part after SET.

So when DELETE, UPDATE, and INSERT statements are recorded in the binlog, do they all carry both a BEFORE IMAGE and an AFTER IMAGE? In fact, not every DML event type has both IMAGEs. As figure 2-2 shows, only UPDATE events carry both: a DELETE event needs only a BEFORE IMAGE, and an INSERT only an AFTER IMAGE.

The BEFORE IMAGE and AFTER IMAGE record every column by default, which can make the binlog very large. So is there a parameter to control the IMAGE behavior (BEFORE IMAGE and AFTER IMAGE are hereafter referred to collectively as IMAGE)? For this, MySQL introduced the binlog_row_image parameter (available since MySQL 5.6).
Parameter description:

binlog_row_image: https://dev.mysql.com/doc/refman/5.7/en/replication-options-binary-log.html#sysvar_binlog_row_image

It controls the IMAGE behavior and has three values:
1. full: "Log all columns in both the before image and the after image." Changes to all columns are recorded in the IMAGE. This is the server default.

2. minimal: "Log only those columns in the before image that are required to identify the row to be changed; log only those columns in the after image where a value was specified by the SQL statement, or generated by auto-increment." The BEFORE IMAGE records only the columns that uniquely identify the row, such as the primary key or a unique key, and the AFTER IMAGE records only the changed columns. As you can see, minimal can effectively reduce the size of the binlog.

3. noblob: "Log all columns (same as full), except for BLOB and TEXT columns that are not required to identify rows, or that have not changed." All other columns behave as under full, but BLOB and TEXT columns are omitted unless they are needed to identify the row or have changed.
Parameter description:
BLOB: https://dev.mysql.com/doc/refman/5.7/en/blob.html
TEXT: https://dev.mysql.com/doc/refman/5.7/en/blob.html
As you can see, binlog_row_image can effectively control the size of the binlog, but to guarantee data consistency the safest value is full.
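As a rough illustration of the difference between full and minimal, here is a hedged sketch (plain Python, not MySQL internals) of which columns each mode would record for an UPDATE, following the documentation quoted above:

```python
# Sketch of binlog_row_image behavior for an UPDATE event.
# All function and variable names are invented for illustration.

def build_images(row_before, row_after, changed, identifying, mode="full"):
    """Return (before_image, after_image) column dicts for an UPDATE."""
    if mode == "full":
        before = dict(row_before)                          # all columns
        after = dict(row_after)                            # all columns
    elif mode == "minimal":
        before = {c: row_before[c] for c in identifying}   # PK/UK columns only
        after = {c: row_after[c] for c in changed}         # changed columns only
    else:
        raise ValueError(mode)
    return before, after

before_row = {"id": 1, "name": "name4", "c1": 0}
after_row  = {"id": 1, "name": "1",     "c1": 0}

bi, ai = build_images(before_row, after_row, changed={"name"},
                      identifying={"id"}, mode="minimal")
print(bi)  # {'id': 1}
print(ai)  # {'name': '1'}
```

Under full, both dicts would contain all three columns, which is why full binlogs are larger but carry enough information for consistency checks.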
2.2 The slave_rows_search_algorithms parameter

The previous section covered the IMAGE and binlog_row_image. This section introduces the lookup logic used to find the affected records when the relay log is replayed. For DELETE and UPDATE operations, the existing rows must first be retrieved from the database before the binlog can be replayed against them. If the table on the slave has no primary key or unique key, a full table scan is needed for each row's BEFORE IMAGE. In most cases this overhead is huge and causes significant lag between the slave and the master. Since MySQL 5.6, the parameter slave_rows_search_algorithms has been provided to address this.
Parameter description: slave_rows_search_algorithms: https://dev.mysql.com/doc/refman/5.7/en/replication-options-replica.html#sysvar_slave_rows_search_algorithms
It controls how records are retrieved when the relay log is replayed. The basic idea is to collect the BEFORE IMAGE of each record and then look up the corresponding row in the target table based on that BEFORE IMAGE. According to the MySQL documentation, there are three ways to retrieve the data:
1. INDEX_SCAN
2. TABLE_SCAN
3. HASH_SCAN
These three methods can be combined in pairs as values for the slave_rows_search_algorithms parameter. The MySQL documentation gives the following guidance:
The default value is INDEX_SCAN,TABLE_SCAN, which means that all searches that can use indexes do use them, and searches without any indexes use table scans.
To use hashing for any searches that do not use a primary or unique key, set INDEX_SCAN,HASH_SCAN. Specifying INDEX_SCAN,HASH_SCAN has the same effect as specifying INDEX_SCAN,TABLE_SCAN,HASH_SCAN, which is allowed.
Do not use the combination TABLE_SCAN,HASH_SCAN. This setting forces hashing for all searches. It has no advantage over INDEX_SCAN,HASH_SCAN, and it can lead to "record not found" errors or duplicate key errors in the case of a single event containing multiple updates to the same row, or updates that are order-dependent.
1. INDEX_SCAN,TABLE_SCAN: by default, if the table has a primary key or unique key, the row is located through that key and the AFTER IMAGE is replayed. If there is no primary key or unique key, a secondary index is used; if there is no index at all, a full table scan is used.

2. INDEX_SCAN,HASH_SCAN: when the table has a primary key or unique key, this configuration also locates rows through that key. When the table has only a secondary index, or no index at all, the HASH_SCAN method is used.

As you can see, when the slave retrieves the rows to be replayed, the priority of the three methods is Index Scan > Hash Scan > Table Scan.
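The priority above can be summarized in a small sketch. The decision function below is a loose simplification modeled on the behavior described in the documentation, not the actual server logic:

```python
# Simplified sketch of the lookup-algorithm decision, modeled loosely on
# Rows_log_event::decide_row_lookup_algorithm_and_key (not the real source).

def decide_lookup(has_pk_or_uk, has_secondary_index, setting):
    if has_pk_or_uk:
        return "INDEX_SCAN"                    # a PK/UK always wins
    if "HASH_SCAN" in setting:
        return "HASH_SCAN"                     # hash over index or over table
    if has_secondary_index:
        return "INDEX_SCAN"                    # secondary-index scan
    return "TABLE_SCAN"                        # full-table-scan fallback

default = ("INDEX_SCAN", "TABLE_SCAN")
print(decide_lookup(True,  False, default))   # INDEX_SCAN
print(decide_lookup(False, False, default))   # TABLE_SCAN
print(decide_lookup(False, False, ("INDEX_SCAN", "HASH_SCAN")))  # HASH_SCAN
```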
I believe readers will have two questions here:
1. How does Hash Scan work? How does it differ from Table Scan and Index Scan? The documentation also mentions hash scan over index; what does that have to do with an index?

2. As mentioned earlier, when a table has no primary key or unique key, rows are located through a secondary index. If the table has N secondary indexes (single-column or composite), which one will be chosen?
2.3 The implementation of Hash Scan, Table Scan, and Index Scan

Analysis of the MySQL source shows that the retrieval method is ultimately decided in the function Rows_log_event::decide_row_lookup_algorithm_and_key:
9745 void
9746 Rows_log_event::decide_row_lookup_algorithm_and_key()
9747 {
9748 ...
9781   /* PK or UK => use LOOKUP_INDEX_SCAN */
9782   this->m_key_index= search_key_in_table(table, cols, (PRI_KEY_FLAG | UNIQUE_KEY_FLAG));
9783   if (this->m_key_index != MAX_KEY)
9784   {
9785     DBUG_PRINT("info", ("decide_row_lookup_algorithm_and_key: decided - INDEX_SCAN"));
9786     this->m_rows_lookup_algorithm= ROW_LOOKUP_INDEX_SCAN;
9787     goto end;
9788   }
       ...
9790 TABLE_OR_INDEX_HASH_SCAN:
       ...
9808 TABLE_OR_INDEX_FULL_SCAN:
       ...
9827 end:
       ...
Line 9782 first checks whether the table has a primary key or unique key; the TABLE_OR_INDEX_HASH_SCAN and TABLE_OR_INDEX_FULL_SCAN branches then decide which method is used otherwise. In do_apply_event, the handler is chosen based on the result of decide_row_lookup_algorithm_and_key:
11286   switch (m_rows_lookup_algorithm)
11287   {
11288     case ROW_LOOKUP_HASH_SCAN:
11289       do_apply_row_ptr= &Rows_log_event::do_hash_scan_and_update;
11290       break;
11291
11292     case ROW_LOOKUP_INDEX_SCAN:
11293       do_apply_row_ptr= &Rows_log_event::do_index_scan_and_update;
11294       break;
11295
11296     case ROW_LOOKUP_TABLE_SCAN:
11297       do_apply_row_ptr= &Rows_log_event::do_table_scan_and_update;
11298       break;
11299
11300     case ROW_LOOKUP_NOT_NEEDED:
11301       DBUG_ASSERT(get_general_type_code() == binary_log::WRITE_ROWS_EVENT);
11302
11303       /* No need to scan for rows, just apply it */
11304       do_apply_row_ptr= &Rows_log_event::do_apply_row;
11305       break;
11306
11307     default:
11308       DBUG_ASSERT(0);
11309       error= 1;
11310       goto AFTER_MAIN_EXEC_ROW_LOOP;
11311       break;
11312   }
You can see:
1. do_hash_scan_and_update: corresponds to the hash_scan method.

2. do_index_scan_and_update: corresponds to the index_scan method.

3. do_table_scan_and_update: corresponds to the table_scan method.

Next, let's look at what these three functions do.
2.3.1 do_hash_scan_and_update
The do_hash_scan_and_update function implements row retrieval for Hash Scan. The implementation splits into Ht (hash scan over table) and Hi (hash scan over index). First, let's look at hash scan over table; figure 2-5 shows its logic.

As you can see, the BI entries from the binlog are built into a hash table on the slave. Since no suitable index is available, each row fetched by the full table scan is hashed and matched against the BI hash table. On a match, the corresponding entry is replayed and then deleted from the hash table.

If the id column of the test06 table has an index, the slave will instead replay using Hi (hash scan over index). The logic of the Hash over Index approach (hereafter Hi) is shown in figure 2-6.

As you can see, when replaying via Hi, an m_distinct_keys structure is generated for the secondary index in use; it stores the deduplicated index values from the BI. The slave then traverses the secondary index of test06 for each value in m_distinct_keys, compares the visited rows with the entries in m_hash, and executes the replay logic.
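The Ht flow just described can be sketched as follows; the hash-key layout and function names are invented for illustration:

```python
# Minimal sketch of Ht (hash scan over table): build a hash of BI rows,
# full-scan the table once, replay on a match.

def hash_scan_apply(table_rows, before_images, after_images):
    m_hash = {}
    for bi, ai in zip(before_images, after_images):
        m_hash[tuple(sorted(bi.items()))] = ai     # key: the whole BI row

    result = []
    for row in table_rows:                         # one full table scan
        key = tuple(sorted(row.items()))
        if key in m_hash:
            result.append(m_hash.pop(key))         # replay and delete entry
        else:
            result.append(row)
    return result, m_hash                          # leftover = rows not found

rows = [{"id": 1, "name": "name2"}, {"id": 2, "name": "x"}]
new_rows, missing = hash_scan_apply(
    rows,
    before_images=[{"id": 1, "name": "name2"}],
    after_images=[{"id": 1, "name": "name3"}])
print(new_rows[0]["name"])  # name3
print(missing)              # {} -> every BI entry found a matching row
```

Note that the table is scanned only once regardless of how many BI rows the event carries, which is the whole point of the hash.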
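A minimal sketch of the Hi flow, borrowing the m_distinct_keys and m_hash names from the description above (the in-memory "index" here is just a dict, not a real B-tree):

```python
# Sketch of Hi (hash scan over index): deduplicate the BI's index values,
# probe the secondary index once per distinct value, compare against m_hash.

def hash_over_index_apply(index, before_images, after_images, key_col="id"):
    m_hash = {tuple(sorted(bi.items())): ai
              for bi, ai in zip(before_images, after_images)}
    m_distinct_keys = {bi[key_col] for bi in before_images}  # deduped BI keys

    applied = []
    for key_val in m_distinct_keys:
        for row in index.get(key_val, []):       # traverse the index entries
            probe = tuple(sorted(row.items()))
            if probe in m_hash:                  # compare with m_hash
                applied.append(m_hash.pop(probe))
    return applied

# A non-unique secondary index on id maps a key value to several rows.
secondary_index = {1: [{"id": 1, "name": "name1"}, {"id": 1, "name": "name2"}]}
applied = hash_over_index_apply(
    secondary_index,
    before_images=[{"id": 1, "name": "name2"}],
    after_images=[{"id": 1, "name": "name3"}])
print(applied)  # [{'id': 1, 'name': 'name3'}]
```

Only rows reachable through the distinct index values are visited, so Hi avoids the full table scan that Ht needs.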
PS: for another confusing behavior of Hash Scan, readers can refer to the article "Technology sharing | the confusing behavior of a HASH_SCAN bug".
2.3.2 do_index_scan_and_update
The Index Scan method retrieves the rows to be replayed on the slave through an index. This can be done in two ways:

1. Through the primary key / unique key index.

2. Through a secondary index.

When rows are retrieved through a primary key or unique key, do_index_scan_and_update simply returns the record located through that key and replays the binlog against it.

When rows are retrieved through a secondary index, each row returned by the index is compared with the corresponding BI entry, and the binlog is replayed only if they match.

At this point a difference in the Index Scan logic emerges between primary/unique keys and secondary indexes: for a primary or unique key, the indexed record is not compared with the columns in the BI, whereas a row fetched through a secondary index is compared with every BI column, and on a mismatch it is not replayed and an error is reported.
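This asymmetry is the crux of the data-loss scenario, so here is a hedged sketch of it (helper names invented): a primary-key lookup replays without checking the BI columns, while a secondary-index hit is compared column by column and fails on a mismatch:

```python
# Sketch of the Index Scan asymmetry described above (not MySQL source).

def index_scan_apply(table, bi, ai, unique=True):
    row = table.get(bi["id"])
    if row is None:
        raise LookupError("record not found")
    if not unique:
        for col, val in bi.items():           # secondary index: compare the BI
            if row[col] != val:
                raise LookupError("record not found")
    table[bi["id"]] = dict(ai)                # replay the AFTER IMAGE

bi = {"id": 1, "name": "name2"}               # master's before image
ai = {"id": 1, "name": "name3"}

table = {1: {"id": 1, "name": "name4"}}       # slave already diverged
index_scan_apply(table, bi, ai, unique=True)  # PK path: silently overwrites
print(table[1]["name"])  # name3

table2 = {1: {"id": 1, "name": "name4"}}
try:
    index_scan_apply(table2, bi, ai, unique=False)  # secondary path: errors
except LookupError as e:
    print(e)  # record not found
```

With a primary key the divergence is papered over; with a secondary index the replay stops with an error, making the inconsistency visible.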
2.3.3 do_table_scan_and_update
The Table Scan implementation is comparatively simple: when no index is available, each row can only be fetched by a full table scan and compared with each row in the BI. Therefore, if the slave's data is inconsistent with the master's, an error is also reported, as shown in figure 2-9. For the details, readers can refer to the do_table_scan_and_update function in the sql/log_event.cc file of the MySQL source; it is not expanded here.
2.3.4 Summary
At this point, we can answer the question raised earlier in this article:
How does Hash Scan work? How does it differ from Table Scan and Index Scan? What does hash scan over index, also mentioned in the documentation, have to do with an index?

As shown above, Hash Scan puts each BI row into a hash table. If a secondary index can be used (i.e. hash scan over index), an additional hash structure is generated holding the deduplicated values of the secondary-index column from the BI. Rows are then fetched, whether by full table scan or through the index, and located in the BI via the hash structure. Table Scan and Index Scan, by contrast, must compare each fetched row against the BI rows one by one (no comparison is made when there is a primary key or unique key); since the BI side has no hash structure, the algorithm is O(n^2). Hash Scan also has to look up each fetched row, but a hash lookup is O(1) and its cost can be ignored. Hence Hi and Ht are more efficient than Table Scan and Index Scan when there is no primary key.

This also answers the question from the beginning of this article: why is an operation in the binlog replayed even when a column value in the current table no longer matches the BI? The reason is that under the default INDEX_SCAN,TABLE_SCAN setting, tables with a primary key or unique key never compare the BI against the retrieved row.
2.4 Secondary index selection in Hash Scan over Index and Index Scan

As mentioned earlier, when a secondary index exists, both Hash Scan and Index Scan will use it for scanning. If the table has more than one secondary index, which one does MySQL choose? Source analysis shows, perhaps surprisingly, that when binlog_row_image is full and the table has multiple secondary indexes, MySQL by default replays with the first suitable index. The decide_row_lookup_algorithm_and_key function decides not only the retrieval method (Hash Scan, Table Scan, and so on) but also which index will be used.

Figure 2-10 shows the logic of secondary index selection: during the traversal, the first key whose columns are all present in the BI is the one used. As an example, the structure and data of table test06 are as follows:
*************************** 1. row ***************************
       Table: test06
Create Table: CREATE TABLE `test06` (
  `id` int(11) NOT NULL,
  `name` varchar(255) DEFAULT NULL,
  `c1` int(11) DEFAULT NULL,
  KEY `k1` (`id`),
  KEY `k2` (`id`,`name`),
  KEY `k3` (`c1`,`name`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8
1 row in set (0.13 sec)

mysql> select * from test06;
+------+-------+------+
| id   | name  | c1   |
+------+-------+------+
| 2582 | name3 |    1 |
| 2582 | name4 |    1 |
|    1 | name1 |    0 |
|    1 | name2 |    0 |
|    1 | name3 |    0 |
+------+-------+------+
5 rows in set (0.00 sec)
Execute the following SQL on the master; its execution plan is:

mysql> delete from test06 where id = 1 and name = 'name3' and c1 = 0;

mysql> explain delete from test06 where id = 1 and name = 'name3' and c1 = 0;
+----+-------------+--------+------------+-------+---------------+------+---------+-------------+------+----------+-------------+
| id | select_type | table  | partitions | type  | possible_keys | key  | key_len | ref         | rows | filtered | Extra       |
+----+-------------+--------+------------+-------+---------------+------+---------+-------------+------+----------+-------------+
|  1 | DELETE      | test06 | NULL       | range | k1,k2,k3      | k2   | 772     | const,const |    1 |   100.00 | Using where |
+----+-------------+--------+------------+-------+---------------+------+---------+-------------+------+----------+-------------+
1 row in set (0.00 sec)

As you can see, the optimizer on the master chose the composite index k2. Now trace the slave with GDB, with a breakpoint at line 9733 of log_event.cc:
9714   if (key_type & MULTIPLE_KEY_FLAG && table->s->keys)
9715   {
9716     DBUG_PRINT("debug", ("Searching for K."));
9717     for (key= 0, keyinfo= table->key_info;
9718          (key < table->s->keys) && (res == MAX_KEY);
9719          key++, keyinfo++)
9720     {
9721       /*
9722         - Skip inactive keys
9723         - Skip unique keys without nullable parts
9724         - Skip indices that do not support ha_index_next() e.g. full-text
9725         - Skip primary keys
9726       */
9727       if (!(table->s->keys_in_use.is_set(key)) ||
9728           ((keyinfo->flags & (HA_NOSAME | HA_NULL_PART_KEY)) == HA_NOSAME) ||
9729           !(table->file->index_flags(key, 0, true) & HA_READ_NEXT) ||
9730           (key == table->s->primary_key))
9731         continue;
9732
9733       res= are_all_columns_signaled_for_key(keyinfo, bi_cols) ?
9734            key : MAX_KEY;
9735
9736       if (res < MAX_KEY)
9737         DBUG_RETURN(res);
9738     }
9739     DBUG_PRINT("debug", ("Not all columns signaled for K."));
9740   }
You can observe that the value of m_key_index is 0, and the keyinfo variable is:

(gdb) print *keyinfo
$4 = {key_length = 4, flags = 0, actual_flags = 0, user_defined_key_parts = 1,
  actual_key_parts = 1, unused_key_parts = 0, usable_key_parts = 1, block_size = 0,
  algorithm = HA_KEY_ALG_UNDEF, {parser = 0x0, parser_name = 0x0},
  key_part = 0x7f2f4c015a00, name = 0x7f2f4c012bb1 "k1",
  rec_per_key = 0x7f2f4c012bc0, m_in_memory_estimate = -1,
  rec_per_key_float = 0x7f2f4c012bf8, handler = {bdb_return_if_eq = 0},
  table = 0x7f2f4c92d1a0, comment = {str = 0x0, length = 0}}

Next, drop the index k1 and look at m_key_index and keyinfo again:

(gdb) print *keyinfo
$7 = {key_length = 772, flags = 64, actual_flags = 64, user_defined_key_parts = 2,
  actual_key_parts = 2, unused_key_parts = 0, usable_key_parts = 2, block_size = 0,
  algorithm = HA_KEY_ALG_UNDEF, {parser = 0x0, parser_name = 0x0},
  key_part = 0x7f2f4c92b680, name = 0x7f2f4c92e7d1 "k2",
  rec_per_key = 0x7f2f4c92e7d8, m_in_memory_estimate = -1,
  rec_per_key_float = 0x7f2f4c92e808, handler = {bdb_return_if_eq = 0},
  table = 0x7f2f4ca9fd90, comment = {str = 0x0, length = 0}}
As you can see, after k1 is dropped, the slave selects index k2, which happens to match the index chosen by the master's execution plan. From the source analysis and the debug trace, when MySQL replays data on the slave (with no primary or unique key), the selected index is simply the first index all of whose columns are present in the BI. It is therefore possible that the index selected on the slave is not optimal, causing significant lag between slave and master.
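The selection rule can be condensed into a few lines. This is a sketch of the loop's effect (cf. are_all_columns_signaled_for_key), not the real implementation:

```python
# Sketch of the slave's secondary-index choice: pick the FIRST index
# whose columns are all present in the before image.

def pick_slave_index(indexes, bi_cols):
    """indexes: ordered list of (name, columns); returns first usable one."""
    for name, cols in indexes:
        if set(cols).issubset(bi_cols):       # all key columns are in the BI
            return name
    return None                               # fall back to a table scan

test06_indexes = [("k1", ["id"]), ("k2", ["id", "name"]), ("k3", ["c1", "name"])]
bi_columns = {"id", "name", "c1"}             # binlog_row_image=full logs all

print(pick_slave_index(test06_indexes, bi_columns))      # k1 (first match)
print(pick_slave_index(test06_indexes[1:], bi_columns))  # k2 once k1 is gone
```

With binlog_row_image=full, every index qualifies, so index order (not selectivity) decides, which reproduces the k1-then-k2 behavior observed in GDB above.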
III. Summary
At this point the questions raised earlier are basically clear, and the following points can be summarized:

1. When there is a primary key or unique key, the slave does not compare each column of the retrieved record with the BI when replaying the binlog. So if the slave's data has diverged from the master's, the slave's data is silently overwritten without any error.

2. When there is no primary key or unique key, Hash Scan / Hash Scan over Index is theoretically more efficient than Table Scan and Index Scan.

3. When there is no primary key or unique key, the secondary index the slave selects is the first index all of whose columns are present in the BI, which is not necessarily the index chosen by the master's execution plan.

That is how the data loss caused by master-slave double writing arises in MySQL. Some of these points may come up in daily work; hopefully this article helps you learn more about them.