Copy the information record table | have a comprehensive understanding of mysql system library 07/19 Update SLTechnology News&Howtos

Copy the information record table | have a comprehensive understanding of mysql system library

2025-07-19 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Database >

Shulou(Shulou.com)06/01 Report--

In the last issue of "time zone information record table | omni-directional understanding of the mysql system library", we introduced in detail the time zone information record table in the mysql system library. This issue we will bring you the seventh article in the series "copy the information record table | know the mysql system library in an all-round way". Please follow us to start the systematic learning journey of the mysql system library.

1. Overview of replication information table

The replication information table is used to save binary log events forwarded from the master library to the slave library and to record information about the current status and location of the relay log during the slave library copying data from the master library. There are three types of logs, as follows:

Master.info file or mysql.slave_master_ info table: it is used to save the connection status, account number, IP, port, password and information such as file and position of the master library binlog currently read by the IO thread of the slave library (called IO thread information log). By default, the connection information and state of the IO thread are saved in the master.info file (the default location is under datadir, and you can use the master_info_file option to execute the master.info file path). If you need to save it in the mysql.slave_master_ information table, you need to set master-info-repository = TABLE before server starts).

Relay-log.info file or mysql.slave_relay_log_ info table: the latest binlog event information obtained from the master database by the IO thread of the slave library is first written to the relay log local to the slave library, and then the SQL thread reads the relay log parsing and replays it The relay_log.info file or mysql.slave_relay_log_info table is used to record the file and position of the latest relay log and the events currently replayed by the SQL thread corresponding to the file and position of the main library binlog (relay log is called relay log, and SQL thread location is called SQL thread information log. By default, the relay log location information and the SQL thread location information are saved in the relay-log.info file (the default location is under datadir, and you can use the relay_log_info_file option to execute the relay-log.info file path). If you need to save it in the mysql.slave_relay_log_ information table, you need to set relay-log-info-repository = TABLE before server starts.

Setting relay_log_info_repository and master_info_repository to TABLE can improve the ability of the database itself or crash recovery after the host terminates unexpectedly (these two tables are innodb tables that ensure that the location information in the table after crash is not lost) and ensure data consistency.

When crash from the library, the SQL thread may have a part of the relay log playback delay, and in addition, the location of the IO thread may also be in the middle of a transaction and is not complete, so the parameter relay-log-recovery=ON must be enabled on the slave library. After this parameter is enabled, the unfinished relay log of the SQL thread will be cleared when the slave crash recovery is enabled, and the IO thread will be reset according to the position of the SQL thread to rerequest from the main library.

If these two tables cannot be initialized by mysqld when the database instance starts, mysqld allows you to continue to start, but writes a warning in the error log, which is often encountered when MySQL upgrades from a version that does not support the table to a version that supports the table.

PS:

Do not try to update the slave_master_info or slave_relay_log_info table manually, or you will suffer the consequences.

While the copy thread from the library continues to work, it does not allow the execution of any statements that may write locks on these two tables, but allows read-only statements to be executed on both tables.

2. Detailed explanation of the copy information table

Since the replication information stored in the tables introduced in this issue is particularly important in our daily database maintenance, some appropriate extensions will be made during the introduction of each table below.

2.1. Slave_master_info this table provides information about the location of the master database read by the query IO thread, as well as the IP, account number, port, password and other information that the slave database connects to the master database. The following is the content of the information stored in the table. Root@localhost: mysql 01:08:29 > select * from slave_master_info\ G * * 1. Row * * Number_of_lines: 25 Master_log_name: mysql-bin.000292 Master_log_pos: 194 Host: 192.168.2.148 User_name: qfsys User_password: letsg0 Port: 3306 Connect_retry: 60 Enabled_ssl: 0 Ssl_ca: Ssl_capath: Ssl_cert: Ssl_cipher: Ssl_key:Ssl_verify_server_cert: 0 Heartbeat: 5 Bind: Ignored_server_ids: 0 Uuid: ec123678-5e26-11e7-9d38-000c295e08a0 Retry_count: 86400 Ssl_crl: Ssl_crlpath: Enabled_auto_position: 0 Channel_name: Tls_version:1 row in set (0.00 sec) table field and show slave status output field, The correspondence of the line information in the master.info file and the meaning of the table field are as follows: the number of rows in the master.info file mysql.slave_master_info table field show slave status command output field description 1Numbernumbers of _ lines [none] indicates the number of lines of information in the master.info or the number of information fields in the slave_master_ info table 2Master_log_nameMaster_Log_File indicates that the latest binlog file name of the main library is currently read from the IO thread of the library It is said that the 3Master_log_posRead_Master_Log_Pos indicates that the IO thread of the slave library is currently reading the latest binlog position4HostMaster_Host of the master library, that the slave library IO thread is currently connecting to the master library IO or the host name 5User_nameMaster_User, that the slave library IO thread is used to connect to the master library user name 6User_ password [none] indicates that the user password 7PortMasterPort used by the slave library IO thread to connect to the master library shows the network port 8Connect _ of the master library connected by the IO thread of the slave library. RetryConnect_Retry indicates the interval between disconnecting from the IO thread of the library and reconnecting to the main library In seconds The default value is 609Enabled_sslMaster_SSL_Allowed, indicating whether the connection between the master and slave is supported. SSL10Ssl_caMaster_SSL_CA_File indicates that CA (Certificate Authority) authentication file name 11Ssl_capathMaster_SSL_CA_Path represents CA (Certificate Authority) authentication file path 12Ssl_certMaster_SSL_Cert represents SSL authentication certificate file name 13Ssl_cipherMaster_SSL_Cipher indicates that the list of passwords that may be used in the SSL connection handshake 14Ssl_keyMaster_SSL_Key The key file name of SSL authentication 15Ssl_verify_server_certMaster_SSL_Verify_Server_Cert indicates whether the certificate of server needs to be verified 16 Heartbeats [none] indicates the interval between master and slave copying heartbeats. 17BindMaster_Bind in seconds indicates that the slave library can be used to connect to the network interface of the master library. A default of empty 18Ignored_server_idsReplicate_Ignore_Server_Ids indicates which server-id needs to be ignored for copying from the library. Note: this is a list. The first number represents the total number of instances server-id that need to be ignored 19UuidMaster_UUID represents the maximum number of times the master library is allowed to reconnect to the slave library 21Sslcrl [none] SSL certificate revocation list file path 22SslslCrlpath [none] directory path to the ssl certificate revocation list file 23Enabled_auto_positionAuto_position indicates whether the slave library enables automatic location finding in the master library ( Use 1 to start automatic location finding If you use auto_position=0, it will not self-couple east to find the location.) 24Channel_nameChannel_name means to copy the channel name from the library, and a channel represents a replication source 25Tls_VersionMaster_TLS_Version that represents the TLS version number 2.2 on Master. Slave_relay_log_info this table provides the location of the main library corresponding to the binary file replayed by the SQL thread and the current latest location of the relay log. The following is the content of the information stored in the table. Root@localhost: mysql 10:39:31 > select * from slave_relay_log_info\ gateway * 1. Row * * Number_of_lines: 7 Relay_log_name: / home/mysql/data/mysqldata1/relaylog/mysql-relay-bin.000205 Relay_log_pos: 14097976 Master_log_name: mysql-bin.000060 Master_log_pos: 21996812 Sql_delay: 0Number_of_workers: 16 Id: 1 Channel_name:1 row in set (0.00 sec) table field and show slave statu s output field, The correspondence of the line information in the relay-log.info file and the meaning of the table field are as follows: the number of rows in the relay-log.info file mysql.slave_relay_log_info table field show slave status command output field description 1Numbernumbers of _ lines [none] indicates the number of lines of information in relay-log.info or the number of information fields in the slave_relay_log_ info table Used for versioning table definition 2Relay_log_nameRelay_Log_File represents the current latest relay log file name 3Relay_log_posRelay_Log_Pos represents the location of the last fully received event corresponding to the current latest relay log file 4Master_log_nameRelay_Master_Log_File represents the main library binlog file name corresponding to the relay log currently being replayed by the SQL thread 5Master_log_posExec_Master_Log_Pos indicates that the SQL thread is currently The relay log being replayed corresponds to the location in the master library binlog file 6Sql_delaySQL_Delay indicates how many seconds the specified slave library must delay the master library 7Numberbatch of _ workers [none] indicates how many worker threads 8Id [None] are currently being replicated from the slave library for each row of records in the internal unique tag table At present, 19Channel_nameChannel_name always indicates that the channel name is copied from the library, which is used for multi-source replication. One channel corresponds to one primary database source.

What is a relay log:

The event data saved in the relay log (relay log) is the same as in the binary log (that is, binary log) (but more information is saved in the relay log) and consists of a set of files that contain event data describing changes to the database, with consecutive numbers followed by file names, as well as an index file containing the names of all relay log files in use.

The data in the relay log is stored in the same format as the binary log, and you can use the mysqlbinlog command to extract the data. By default, the relay log is saved under datadir and the file name format is: host_name-relay-bin.nnnnnn, where host_name is the hostname of the slave server and nnnnnn is the file suffix serial number. Consecutive relay log files are created with consecutive serial numbers starting at 000001. Use index files to track the relay log files that are currently in use. The default relay log index file name is saved under datadir and the file name format is: host_name-relay-bin.index.

* the name of the relay log file and the relay log index file can be overridden by the default value specified by-- relay-log and-- relay-log-index parameter options, respectively. If the file name uses the default value, note that the host name cannot be modified, otherwise an error will be reported that the relay log cannot be opened. It is recommended to use the parameter option to specify a fixed file name prefix. If this has already happened, you need to change the relay log file name in the index file and the relay log file name under datadir to prefix to the new hostname, and then restart the slave library.

Under what circumstances a new relay log file is generated.

When the Imap O thread starts.

When using the statement: FLUSH LOGS or mysqladmin flush-logs command.

When the current size of the relay log file becomes "too large", the log rolling rule is as follows:

* if the value of the max_relay_log_size system variable is greater than 0, the relay log scrolls at the size specified by this parameter.

* if the value of the max_relay_log_size system variable is 0, the relay log scrolls according to the size specified by the max_binlog_size system variable.

After executing the relay log, the SQL thread decides when to clean up the executed relay log files, but if you use the FLUSH LOGS statement or the mysqladmin flush-logs command to force scrolling the relay log, the SQL thread may also clean up the executed relay log files.

2.3. Slave_worker_info this table provides the worker thread status information when querying multithreaded replication, which is different from the performance_schema.replication_applier_status_by_ worker table: the slave_worker_info table records the relay log and main library binlog location information of the worker thread playback, while the performance_schema.replication_applier_status_by_worker table records the GTID location information of the worker thread playback. The following is the content of the information stored in the table. Root@localhost: mysql 01:09:39 > select * from slave_worker_info limit 1\ G * * 1. Row * * Id: 1 Relay_log_name: Relay_log_pos: 0 Master_log_name: Master_log_pos: 0 Checkpoint_relay_log_name: Checkpoint_relay_log_pos: 0Checkpoint_master_log_name: Checkpoint_master_log_pos: 0Checkpoint _ seqno: 0Checkpoint _ group_size: 64 Checkpoint_group_bitmap: Channel_name:1 row in set (0.00 sec) table field meaning. Id: the ID of the data in the table, which is also the ID of the worker thread, corresponds to the WORKER_ID field of the performance_schema.replication_applier_status_by_worker table (if replication stops, the field value still exists, unlike the THREAD_ID field value in the performance_schema.replication_applier_status_by_ worker table is cleared). Relay_log_name: the name of the relay log file to which each worker thread is currently executing. Relay_log_pos: the position in the relay log file to which each worker thread is currently executed. Master_log_name: the name of the main library binary log file to which each worker thread is currently executed. Master_log_pos: the position in the main library binary log file to which each worker thread is currently executed. Checkpoint_relay_log_name: the relay log file name of the latest checkpoint for each worker thread. Checkpoint_relay_log_pos: the position in the relay log file of the latest checkpoint of each worker thread. Checkpoint_master_log_name: the latest checkpoint of each worker thread corresponds to the binary log file name of the main library. Checkpoint_master_log_pos: the latest checkpoint of each worker thread corresponds to the position in the binary log file of the main library. Checkpoint_seqno: the transaction number currently completed by each worker thread. The size of this transaction number is relative to the latest checkpoint of each worker thread, not the real transaction number. Checkpoint_group_size: indicates that when the execution queue of each worker thread is greater than this field value, the current worker thread is triggered to perform a checkpoint. Checkpoint_group_bitmap: the key value for recovery after crash from the library, which is a bitmap value that represents the transaction that each worker thread has executed at its latest checkpoint. Channel_name: replication channel name. For multi-master replication, the specified replication channel name is displayed, and this field is empty for single-master replication. What is recorded in this table is critical to multithreading crash recovery from the library, so the following is a necessary explanation of how what is recorded in this table affects the crash recovery process. How to do replication distribution by multithreading from the library.

We know that transaction-based parallel replication (row-based) has been added to MySQL 5.7. the main library has added new last_commit and sequence_number tags to the GTID event of binlog to indicate the commit order in each group in each binlog (reset the two count marks in each binlog), in each given binlog The last_commit in each group is always the largest sequence_number in the previous group and always the smallest sequence_number-1 in the current group (in each binlog, last_commit is always counted from 0 and sequence_number is always counted from 1). The binlog of the master library recorded from the library relay log does not change the server id, timestamp information and last_commit and sequence_ numbers of the master library, so that when the slave SQL thread performs binlog playback, it can use this information to determine whether the slave library needs to commit strictly according to the commit order of the master library (the transactions replayed from the slave library are only distributed according to the commit order of the master library. However, whether the slave library commits these transactions in accordance with the commit order of the master library also needs to see the slave library's own slave_preserve_commit_order variable setting. If set to 1, it will be committed in strict accordance with the order in relay log, and if set to 0, the slave library will decide the commit order. The principle of SQL thread parallel distribution. * the SQL coordinator thread reads a new transaction and fetches the last_commit and sequence_ number values. * the SQL coordinator thread determines whether the current last_commit of the new transaction fetched is greater than the minimum value in the currently executed sequence_number (Low water mark, referred to as LWM, also known as the low watermark mark). * if the last_commit of the current transaction read by the SQL coordinator thread is greater than the sequence_ number value that has been completed, it means that all the transactions in the previous group have not been completed, and the SQL coordinator thread needs to wait for all worker threads to execute the transactions in the previous group and wait for the LWM to become larger. Until the last_commit of the currently read transaction is equal to the minimum sequence_ number of the currently executed transaction, you can continue to distribute new transactions to idle worker threads (parallel replication is for transactions within each group can be replicated in parallel, so, between group is serial, before the execution of a group is completed, the transaction of the next group needs to wait. Only transactions within the same group can be executed in parallel. According to the above description, the last_commit of a transaction in each group is always the smallest sequence_number-1 in the current group, that is, if the last_commit of the current transaction read by the SQL coordinator thread is less than the minimum sequence_number of the current completed transaction, it means that the transactions currently being executed by all worker threads are in the same group, that is, the SQL coordinator thread can continue to look for idle worker threads to distribute. Otherwise, the SQL coordinator thread needs to wait. * the SQL coordinator thread looks for an idle worker thread by counting the status information returned by the worker thread. If there is no idle thread, the SQL coordinator thread needs to wait until an idle worker thread is found (if there are multiple worker threads, the SQL coordinator thread randomly selects an idle worker thread for distribution). * distribute the binlog event of the currently read transaction to the selected idle worker thread, after which the worker thread applies the transaction, and then the SQL coordinator thread continues to read the new binlog event (note that the SQL coordinator thread distribution is in event units, not transaction units, so, if the first event of a transaction is distributed to a given worker thread, the new event subsequently read if it belongs to the same transaction All event before moving on to the next transaction is distributed to the same worker thread for processing. The SQL coordinator thread repeats the above determination process when all binlog event groups in a transaction are distributed and the next new transaction is read. Crash recovery copied from the library by multithreading. From the previous principle of multithreaded replication and distribution, we can see that transactions in the same group are applied in parallel, and transactions are randomly assigned. During the normal operation of the library, if you pinch at any moment, which of the transactions that are being executed by the worker thread have already been executed? It is not possible to use a single location to determine what is not done yet (because it may be out of order when copying in parallel from the library: you need to see how the slave_preserve_commit_order variable is set), which means that there may be a breakpoint between the maximum and minimum locations that are being executed in all worker threads. So how does MySQL solve the problem of continuous work from the breakpoint of the library crash recovery? In order to solve this problem, MySQL has done a lot of recording work on the execution status of the worker thread. First, it maintains a queue called GAQ (Group Assigned Queue). When the SQL coordinator thread allocates a transaction, it will first add the transaction to the queue, and then it will look for an idle worker thread to execute according to the rules. The following figure (solemnly declared: this figure is from the book "MySQL Operation and maintenance Internal reference"):

Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.

*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.