
MySQL 8.0.20 MGR data migration process and considerations


1. Background

2. The migration source files are too large

Part of the data comes from text files, with a single file as large as 40 GB. The original tables use the MyISAM storage engine, but MGR only supports the transactional InnoDB engine.

Therefore, the text file headers had to be rewritten into CREATE TABLE statements, and the files had to be split so they could be imported in parallel. The following two tools were used to edit and split the large files (a sketch of the resulting DDL follows the list):

EmEditor, which can open very large files.

A Unix command-line tool kit for Windows, used to split the files.
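
As a hedged illustration of the header-to-DDL step, the table and column names below are hypothetical; what matters is that MGR requires InnoDB and a primary (or unique) key:

# Hypothetical DDL built from a text file's header row; names and types are illustrative.
CREATE TABLE big_table (
  id     BIGINT      NOT NULL,
  name   VARCHAR(64),
  amount DECIMAL(12, 2),
  PRIMARY KEY (id)       # MGR requires a primary or unique key for DML
) ENGINE = InnoDB;       # MGR only replicates the InnoDB engine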

3. Problems encountered during parallel import

The second stage: we tried enabling autocommit mode, but found that the constant writes to the binlog and data files made efficiency even worse.

The third stage: we dissolved the MGR group, transferred the files to two servers, turned off the binlog, and imported on each server separately. This was very efficient: the 170 million rows (40 GB) of data were split into 20 files, 20 parallel import sessions were opened, the two machines imported in parallel, and all MySQL files were moved onto the servers' SSD disks. The entire data import completed in 40 minutes.
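
A minimal sketch of one such import session, assuming the binlog is skipped per session via sql_log_bin rather than disabled server-wide (the file path and table name are hypothetical):

SET SESSION sql_log_bin = 0;   # skip the binlog for this session only
SET SESSION autocommit = 0;    # load the whole file as one transaction
LOAD DATA LOCAL INFILE '/data/split_part_01.txt'
  INTO TABLE big_table
  FIELDS TERMINATED BY '\t'
  LINES TERMINATED BY '\n';
COMMIT;

One such session is opened per split file; local_infile must be enabled on the server for LOAD DATA LOCAL to work.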

4. Parameter limits of MGR and MySQL encountered during import

group_replication_transaction_size_limit
# Its maximum value is 2147483647, roughly 2 GB. When an MGR group imports or updates a large amount of data in a single transaction, this parameter must be taken into account: the transaction can fail in its final phase if the value is set too small. On the other hand, very large transactions are genuinely unfriendly to MGR, because the certification messages the nodes exchange with each other consume a lot of network bandwidth.

max_binlog_cache_size
# If a transaction is very large, this parameter must be increased accordingly. In actual testing, roughly 3 GB to 4 GB was needed for 10 million rows of data. The official documentation advises against setting this parameter too large; 4 GB is the recommended maximum.
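
A hedged sketch of raising both limits before the bulk import; both are dynamic global variables in MySQL 8.0, and the values shown are illustrative:

SET GLOBAL group_replication_transaction_size_limit = 2147483647;  # the maximum, ~2 GB
SET GLOBAL max_binlog_cache_size = 4294967296;                     # 4 GB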

5. Resolving MGR data inconsistency caused by importing MyISAM tables

During the final data migration, the disabled_storage_engines setting was temporarily removed so that MyISAM engine tables could be imported, which led to data inconsistency after MGR was started.

Operations that MGR does not support reported an error:

ERROR 3098 (HY000): The table does not comply with the requirements by an external plugin.

Every operation that violates MGR's restrictions reports this same error; the detailed cause is not spelled out. One example is performing operations on a MyISAM table.

Another example: a table without a primary key or unique key can be created, but the same error is reported as soon as data is inserted into it.
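
A hedged illustration of both triggers (the table names are hypothetical):

CREATE TABLE t_myisam (id INT) ENGINE = MyISAM;  # creating the table succeeds
INSERT INTO t_myisam VALUES (1);                 # fails under MGR: ERROR 3098

CREATE TABLE t_no_pk (id INT) ENGINE = InnoDB;   # no primary or unique key
INSERT INTO t_no_pk VALUES (1);                  # also fails: ERROR 3098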

The process for resolving MGR inconsistencies is as follows:

1. Check the status of the cluster and identify the failed node.
SELECT * FROM PERFORMANCE_SCHEMA.REPLICATION_GROUP_MEMBERS; # shows the status of every node in the cluster; look for the specific node in ERROR or RECOVERING state.

2. Check the error log of the failed node.
# The error log identifies the failing GTID and position.

3. Analyze the binlog of the current read-write node.
# Use the mysqlbinlog command to locate the failing statement and determine the cause of the fault.

4. Check the size and status of the specific failed table.

(1) Determine how large the table is and whether it is modified frequently. If a large table is modified frequently, consider rebuilding the failed node from a backup.

(2) If the table is not large, or changes rarely and the affected time window can be clearly bounded, consider running RESET MASTER on the failed node and then setting gtid_purged, or using SET gtid_next to skip the failing GTIDs. If the node can then replicate normally up to the read-write node's current GTID, re-export the data for the affected window. If it keeps reporting errors, keep checking whether they involve the failed table and keep skipping. Once it is known that the node can catch up normally to the read-write node's current GTID: record the failed node's SHOW MASTER STATUS replication position, temporarily set read_only and super_read_only to OFF on it, import the missing data into the failed node, and then RESET MASTER or SET gtid_next to the GTID recorded by SHOW MASTER STATUS, so that replication can continue and the node is repaired.
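
A minimal sketch of the GTID-skip step in (2); the UUID and sequence number are placeholders for the failing GTID found in the error log:

STOP GROUP_REPLICATION;
SET GLOBAL super_read_only = OFF;                             # allow the empty transaction to commit
SET GTID_NEXT = 'aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee:1234';  # the failing GTID
BEGIN; COMMIT;                                                # commit an empty transaction under that GTID
SET GTID_NEXT = 'AUTOMATIC';
START GROUP_REPLICATION;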
