In addition to Weibo, there is also WeChat
Please pay attention
WeChat public account
Shulou
2025-02-27 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Database >
Share
Shulou(Shulou.com)06/01 Report--
Write at the front:
Recently, I have been doing tests related to traditional master-slave replication, thinking about a lot of problems that may occur after the online master-slave replication architecture, and then pertinently setting these failures, and then thinking about how to ensure the availability of the business. Or under the premise of having as little impact on the business as possible, it can be regarded as a small gain. Now, when the master-slave replication failure occurs, the author The sql_slave_skip_counte variable used when the point of failure needs to be skipped is recorded and described.
Sql_slave_skip_counter introduction:
Taken from the official explanation of MySQL (it is strongly recommended to read the original English text. The Chinese version, is the author's own understanding, can only say that benevolent people have different opinions)
SET GLOBAL sql_slave_skip_counter Syntax:
SET GLOBAL sql_slave_skip_counter = N
This statement skips the next N events from the master. This is useful for recovering from replication stops caused by a statement.
Skip N events. Note: event, not transactions, are equivalent only in transactions that consist of a single statement.
For example, a transaction consists of multiple EVENT, in the case of BEGIN;INSERT;UPDATE;DELETE;COMMOIT;, the two are by no means equal
This statement is valid only when the slave threads are not running. Otherwise, it produces an error.
When using this statement, it is important to understand that the binary log is actually organized as a sequence of groups known as event groups. Each event group consists of a sequence of events.
For transactional tables, an event group corresponds to a transaction.
For transaction tables, one event group corresponds to one transaction
Or nontransactional tables, an event group corresponds to a single SQL statement.
For non-transactional tables, one event group corresponds to one SQL
When you use SET GLOBAL sql_slave_skip_counter to skip events and the result is in the middle of a group, the slave continues to skip events until it reaches the end of the group. Execution then starts with the next event group
When you skip event, if the value of N is in event group, then slave will continue to skip event until you skip this event group, starting with the next event group.
For cases where sql_slave_skip_counter is used for transaction tables:
1. Skip 1032 replication error (update/delete error)
Skip transactions consisting of a single SQL:
Delete two pieces of data artificially on the Slave host:
DELETE FROM `edusoho_ e`.`t1` WHERE `id` ='9'
DELETE FROM `edusoho_ e`.`t1` WHERE `id` = '11'
On the other hand, Master will report an error when changing the above two records, resulting in replication interruption:
INSERT INTO `edusoho_ e`.`t1` (`xname`, `address`, `hobby`) VALUES ('Sun Quan','Wu State', 'Sister')
UPDATE `edusoho_ e`.`t1`SET xname=' Game 'WHERE id=7
UPDATE `edusoho_ e`.`t1` SET age=40 WHERE id=11; # reported an error
DELETE FROM `edusoho_ e`.`t1` WHERE age=40; # reported an error
INSERT INTO `edusoho_ e`.`t1` (`xname`, `address`, `hobby`) VALUES ('Cao Pi', 'State of Wei', 'Zhen Ji')
DELETE FROM `edusoho_ e`.`t1` WHERE id=1
UPDATE `edusoho_ e`.`t1` SET hobby='Games' WHERE id=3
When Slave checks the status of master-slave replication, it will find the error message:
Mysql > show slave status\ G
* * 1. Row *
Read_Master_Log_Pos: 2176
Exec_Master_Log_Pos: 874
Last_Errno: 1032
Last_Error: Could not execute Update_rows event on table edusoho_e.t1; Can't find record in't 1 million, Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND; the event's master log mysql-bin.000002, end_log_pos 1127
Slave_IO_Running: Yes
Slave_SQL_Running: No
See what position has done on the Master host:
Mysql > show binlog events in 'mysql-bin.000002' from 874
+-- +
| | Log_name | Pos | Event_type | Server_id | End_log_pos | Info | |
+-- +
| | mysql-bin.000002 | 874 | Query | 2 | 956 | BEGIN |
| | mysql-bin.000002 | 956 | Table_map | 2 | 1017 | table_id: 213 (edusoho_e.t1) | |
| | mysql-bin.000002 | 1017 | Update_rows | 2 | 1127 | table_id: 213 flags: STMT_END_F | |
| | mysql-bin.000002 | 1127 | Xid | 2 | 1158 | COMMIT / * xid=437 * / | |
| | mysql-bin.000002 | 1158 | Query | 2 | 1240 | BEGIN |
| | mysql-bin.000002 | 1240 | Table_map | 2 | 1301 | table_id: 213 (edusoho_e.t1) | |
| | mysql-bin.000002 | 1301 | Delete_rows | 2 | 1407 | table_id: 213 flags: STMT_END_F | |
| | mysql-bin.000002 | 1407 | Xid | 2 | 1438 | COMMIT / * xid=446 * / | |
| | mysql-bin.000002 | 1438 | Query | 2 | 1520 | BEGIN |
| | mysql-bin.000002 | 1520 | Table_map | 2 | 1581 | table_id: 213 (edusoho_e.t1) | |
| | mysql-bin.000002 | 1581 | Write_rows | 2 | 1644 | table_id: 213 flags: STMT_END_F | |
| | mysql-bin.000002 | 1644 | Xid | 2 | 1675 | COMMIT / * xid=455 * / | |
| | mysql-bin.000002 | 1675 | Query | 2 | 1757 | BEGIN |
| | mysql-bin.000002 | 1757 | Table_map | 2 | 1818 | table_id: 213 (edusoho_e.t1) | |
| | mysql-bin.000002 | 1818 | Delete_rows | 2 | 1880 | table_id: 213 flags: STMT_END_F | |
| | mysql-bin.000002 | 1880 | Xid | 2 | 1911 | COMMIT / * xid=464 * / | |
| | mysql-bin.000002 | 1911 | Query | 2 | 1993 | BEGIN |
| | mysql-bin.000002 | 1993 | Table_map | 2 | 2054 | table_id: 213 (edusoho_e.t1) | |
| | mysql-bin.000002 | 2054 | Update_rows | 2 | 2145 | table_id: 213 flags: STMT_END_F | |
| | mysql-bin.000002 | 2145 | Xid | 2 | 2176 | COMMIT / * xid=473 * / | |
+-- +
Skip the first Update_rows event replication error in Slave:
Mysql > set global sql_slave_skip_counter=1
Mysql > start slave sql_thread
Mysql > show slave status\ G
Slave_IO_Running: Yes
Slave_SQL_Running: No
Exec_Master_Log_Pos: 1158
Last_SQL_Errno: 1032
Last_SQL_Error: Could not execute Delete_rows event on table edusoho_e.t1; Can't find record in't 1 million, Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND; the event's master log mysql-bin.000002, end_log_pos 1407
Successfully skipped the first events group
Continue to skip the second Delete_rows event replication error in Slave:
Mysql > set global sql_slave_skip_counter=1
Mysql > start slave sql_thread
Mysql > show slave status\ G
* * 1. Row *
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Last_SQL_Errno: 0
Last_SQL_Error:
Successfully skipped the second events group
Note:
Although the failure of master-slave replication has been successfully skipped, it is only temporarily restored to the normal state of master-slave replication, and the missing data of Slave needs to be made up as soon as possible, otherwise the changes made by Master to data that does not exist in Slave will still repeatedly lead to master-slave replication failure. The author thinks that if the difference in the amount of data is not too great, you can consider using pt-table-checksum and pt-table-sync tools to restore. If you have a large amount of data and there are many differences in data, it is recommended to redo Slave, because the tool will lock the table, which will have a certain impact on the online business. Please consider the specific situation.
Skip transactions consisting of multiple SQL (event):
Delete a piece of data artificially on the Slave host:
DELETE FROM `edusoho_ e`.`t1` WHERE `id` ='7'
Generate a transaction consisting of multiple SQL on the Master host:
BEGIN
DELETE FROM `edusoho_ e`.`t1` WHERE `id` ='7'
INSERT INTO `edusoho_ e`.`t1` (`xname`, `address`, `hobby`) VALUES ('lazy','I don't know', 'eat after sleep')
COMMIT
Because id=7 data has been deleted on the Slave host, when Slave checks the master-slave replication status, it will find an error message:
Mysql > show slave status\ G
* * 1. Row *
Read_Master_Log_Pos: 7219
Exec_Master_Log_Pos: 6840
Slave_IO_Running: Yes
Slave_SQL_Running: No
Last_Errno: 1032
Last_Error: Could not execute Delete_rows event on table edusoho_e.t1; Can't find record in't 1 million, Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND; the event's master log mysql-bin.000002, end_log_pos 7049
See what position has done on the Master host:
Mysql > show binlog events in 'mysql-bin.000002' from 6840
+-- +
| | Log_name | Pos | Event_type | Server_id | End_log_pos | Info | |
+-- +
| | mysql-bin.000002 | 6840 | Query | 2 | 6922 | BEGIN |
| | mysql-bin.000002 | 6922 | Table_map | 2 | 6983 | table_id: 213 (edusoho_e.t1) | |
| | mysql-bin.000002 | 6983 | Delete_rows | 2 | 7049 | table_id: 213 flags: STMT_END_F | |
| | mysql-bin.000002 | 7049 | Table_map | 2 | 7110 | table_id: 213 (edusoho_e.t1) | |
| | mysql-bin.000002 | 7110 | Write_rows | 2 | 7188 | table_id: 213 flags: STMT_END_F | |
| | mysql-bin.000002 | 7188 | Xid | 2 | 7219 | COMMIT / * xid=825 * / | |
+-- +
As you can see, this transaction is made up of two SQL (event)
What if you use sql_slave_skip_counter=N to skip transactions consisting of multiple SQL?
Mysql > set global sql_slave_skip_counter=1
Mysql > start slave sql_thread
Mysql > show slave status\ G
* * 1. Row *
Read_Master_Log_Pos: 7219
Exec_Master_Log_Pos: 7219
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Last_Errno: 0
Last_Error:
Found no problem, when using sql_slave_skip_counter to skip transactions composed of multiple SQL (event), you can see from the show binlog events executed on Master that if you just skip the error SQL statement, then the Exec_Master_Log_Pos value should be 7110, but now it is 7219, which means that the entire event group has been skipped, but 7110 of the SQL data is what we need, so it is the same as a transaction composed of a single SQL. Although the master-slave replication state is restored, the data is still in an inconsistent state. We should seize the time to make up the data or redo the Slave.
2. When a transaction consists of multiple SQL (event), only one event is skipped, not an event group:
Delete a piece of data artificially on the Slave host:
DELETE FROM `edusoho_ e`.`t1` WHERE `id` = '17'
Generate a transaction consisting of multiple SQL on the Master host:
BEGIN
DELETE FROM `edusoho_ e`.`t1` WHERE `id` = '17'
INSERT INTO `edusoho_ e`.`t1` (`xname`, `address`, `hobby`) VALUES ('who am I','I don't know', 'eat after sleep')
COMMIT
Because id=17 data has been deleted on the Slave host, when Slave checks the master-slave replication status, it will find an error message:
Exec_Master_Log_Pos: 120
Slave_IO_Running: Yes
Slave_SQL_Running: No
Last_Errno: 1032
Last_Error: Could not execute Delete_rows event on table edusoho_e.t1; Can't find record in't 1 million, Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND; the event's master log mysql-bin.000004, end_log_pos 341
See what position has done on the Master host:
Mysqlbinlog-v-base64-output=decode-start-position=120 mysql-bin.000004
/ *! 50530 SET @ @ SESSION.PSEUDO_SLAVE_MODE=1*/
/ *! 40019 SET @ @ session.max_insert_delayed_threads=0*/
/ *! 50003 SET @ OLD_COMPLETION_TYPE=@@COMPLETION_TYPE,COMPLETION_TYPE=0*/
DELIMITER / *! * /
# at 120
# 190507 13:52:05 server id 2 end_log_pos 202 CRC32 0x0ca0c280 Query thread_id=3 exec_time=0 error_code=0
SET timestamp 1557208325
SET @ @ session.pseudoclinic thread readership idler 3max session /
SET @ @ session.foreign_key_checks=1, @ @ session.sql_auto_is_null=0, @ @ session.unique_checks=1, @ @ session.
SET @ @ session.sqlcards modewords 1073741824Universe
SET @ @ session.auto_increment_increment=2, @ @ session.
/ *!\ C utf8 * /! * /
SET @ @ session. Session. Session setting setting clientmakers 33 minutes. Session. Collationalization connections 33 minutes.
SET @ @ session. Session. LCC timetables namespace.
SET @ @ session.collationalization databases
BEGIN
/ *! * /
# at 202
# 190507 13:52:05 server id 2 end_log_pos 263CRC32 0x20d2e89d Table_map: `edusoho_ e`.`t1` mapped to number
# at 263
# 190507 13:52:05 server id 2 end_log_pos 341 CRC32 0xbec6fd45 Delete_rows: table id 216 flags: STMT_END_F
# DELETE FROM `edusoho_ e`.`t1`
# WHERE
# @ 1: 17
# @ 2lazy
# @ 3Qing'I don't know'
# @ 4room1
# @ 5 eat 'eat after sleep'
# @ 6x18
# at 341
# 190507 13:52:05 server id 2 end_log_pos 402 CRC32 0xa37bc5c9 Table_map: `edusoho_ e`.`t1` mapped to number
# at 402
# 190507 13:52:05 server id 2 end_log_pos 483 CRC32 0x0d774707 Write_rows: table id 216 flags: STMT_END_F
# INSERT INTO `edusoho_ e`.`t1`
# SET
# @ 1o21
# @ 2 who am I?
# @ 3Qing'I don't know'
# @ 4room1
# @ 5 eat 'eat after sleep'
# @ 6x18
# at 483
# 190507 13:52:05 server id 2 end_log_pos 514 CRC32 0x8c333b30 Xid = 411
Com _ MIT _ blank /
DELIMITER
# End of log file
ROLLBACK / * added by mysqlbinlog * /
/ *! 50003 SET COMPLETION_TYPE=@OLD_COMPLETION_TYPE*/
/ *! 50530 SET @ @ SESSION.PSEUDO_SLAVE_MODE=0*/
As you can see, the green part is what we need to skip, and the second event is what we need to keep.
At this time, you need to use the variable slave_exec_mode. As for the detailed introduction of slave_exec_mode, please refer to the official website of MySQL.
Mysql > set global slave_exec_mode='IDEMPOTENT'
Mysql > start slave sql_thread
Mysql > show slave status\ G
* * 1. Row *
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Exec_Master_Log_Pos: 514
Check the edusoho_ e.t1 table on Slave. The data of id=21 has passed. At this time, the data is in a consistent state.
3. Skip primary key conflict 1062 error (Duplicate entry):
Insert an id value on the Slave primary key first:
INSERT INTO `edusoho_ e`.`t1` (`id`, `xname`, `address`, `hobby`, `age`) VALUES (19pr. 'gadget', 'Ming Dynasty', 'Queen', '25')
Because Slave has occupied the primary key value id=19 to be automatically generated by Master, the Slave host reports an error:
INSERT INTO `edusoho_ e`.`t1` (`id`, `xname`, `address`, `hobby`, `age`) VALUES ('Zhu Di', 'Ming Dynasty', 'Emperor','36')
Check the status of Slave master-slave replication and find that a master-slave replication error has occurred:
Mysql > show slave status\ G
* * 1. Row *
Slave_IO_Running: Yes
Slave_SQL_Running: No
Last_Errno: 1062
Last_Error: Could not execute Write_rows event on table edusoho_e.t1; Duplicate entry '19' for key' PRIMARY', Error_code: 1062; handler error HA_ERR_FOUND_DUPP_KEY; the event's master log mysql-bin.000002, end_log_pos 7425
Exec_Master_Log_Pos: 7219
View Master binlog:
Mysql > show binlog events in 'mysql-bin.000002' from 7219
+-- +
| | Log_name | Pos | Event_type | Server_id | End_log_pos | Info | |
+-- +
| | mysql-bin.000002 | 7219 | Query | 2 | 7301 | BEGIN |
| | mysql-bin.000002 | 7301 | Table_map | 2 | 7362 | table_id: 213 (edusoho_e.t1) | |
| | mysql-bin.000002 | 7362 | Write_rows | 2 | 7425 | table_id: 213 flags: STMT_END_F | |
| | mysql-bin.000002 | 7425 | Xid | 2 | 7456 | COMMIT / * xid=893 * / | |
+-- +
Think about:
Because Slave already exists, if this data is deleted on the Slave host, will Slave synchronize it directly? (the answer is: no. Need to restart Slave thread):
DELETE FROM `edusoho_ e`.`t1` WHERE `id` = '19'
Mysql > stop slave
Mysql > start slave user='repliter' password='123456'
When verifying, it is found that the data has been synchronized.
Out of the question:
The above is the author for a single SQL transactions, multiple SQL transactions, and in these single / multiple SQL transactions, artificially set 1032 and 1062 replication errors and solutions, as well as the respective usage and skip scope of sql_slave_skip_counter and slave_exec_mode, of course, the author only did the deployment test before the online application, and did not pass any actual test. On the one hand, only for the majority of colleagues to do a reference; on the other hand, to record the author's own experience and ideas for problem-solving to make a summary, when the problem really occurs, there is a direction for reference, and will not be in a hurry, do not know what to do, so, there are mistakes and understanding is not in place, please leave a message below to correct, grateful!
Also, what the author does is to do sql_slave_skip_counter and slave_exec_mode tests for transaction tables. For non-transaction tables, the use of sql_slave_skip_counter and slave_exec_mode will be slightly different. Please do your own Baidu.
Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.
Views: 0
*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.
Continue with the installation of the previous hadoop.First, install zookooper1. Decompress zookoope
"Every 5-10 years, there's a rare product, a really special, very unusual product that's the most un
© 2024 shulou.com SLNews company. All rights reserved.