How to solve the problem of master-slave delay and read-write separation in MySQL

2025-04-24 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)06/01 Report--

This article explains how to deal with master-slave delay under read-write separation in MySQL. It is shared here for your reference.

Foreword:

Internet workloads are typically read-heavy. On services such as Weibo, WeChat, and Taobao, reads far outnumber writes; following the 80/20 rule, read traffic can even reach 90%.

To match this pattern, the underlying database architecture is adjusted accordingly: read-write separation is adopted.

Process:

The client integrates an SDK that, each time a SQL statement is executed, determines whether it is a write or a read operation.

If it is a write, the request is sent to the master database.

The master executes the SQL and, after the transaction commits, generates the binlog and synchronizes it to the slave database.

The slave replays the binlog through its SQL thread and applies the corresponding changes to its own tables.

If it is a read, a load-balancing strategy selects one slave database to handle the request.
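The routing steps above can be sketched as follows. This is a minimal illustration, not the SDK the article refers to: the class name `ReadWriteRouter`, the keyword list, and the round-robin load-balancing strategy are all assumptions made for the example.

```python
import itertools

# Assumption: a statement is a write if it starts with one of these keywords.
WRITE_KEYWORDS = ("insert", "update", "delete", "replace")


class ReadWriteRouter:
    """Hypothetical router: writes go to the master, reads to a slave."""

    def __init__(self, master, slaves):
        self.master = master
        # Round-robin stands in for the load-balancing strategy.
        self._slaves = itertools.cycle(slaves)

    def is_write(self, sql):
        # Classify the statement by its first keyword.
        parts = sql.split(None, 1)
        return bool(parts) and parts[0].lower() in WRITE_KEYWORDS

    def route(self, sql):
        if self.is_write(sql):
            return self.master
        return next(self._slaves)
```

A real SDK would also have to handle transactions, comments, and statements such as `select ... for update`, which this sketch ignores.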

This seems very reasonable, but on closer inspection it is not that simple.

The master and the slave replicate data asynchronously. What if the data on the two is out of sync?

Suppose the master has just finished writing the data and, before the slave has had time to pull the latest data, a read request arrives. The user then gets the impression that the data has been lost.

Today, let's discuss what solutions are available to solve this problem.

I. Force reads to the master database

Treat different business requirements differently.

Scenario 1:

If the real-time requirements for the data are not very high, reads can go to the slave. For example, when a "big V" account with tens of millions of followers posts on Weibo, it makes little difference if followers receive the post a few seconds later.

Scenario 2:

If the real-time requirements for the data are very high, as in financial business, we can mark the query in client code to force it to go to the master.
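One way to picture the "mark in client code" idea is a per-request flag that the routing layer consults. The sketch below is an assumption about how such a mark might look; the names `force_master` and `choose_target` are hypothetical and do not come from any particular SDK.

```python
import contextvars
from contextlib import contextmanager

# Hypothetical per-request flag; False means reads may go to a slave.
_force_master = contextvars.ContextVar("force_master", default=False)


@contextmanager
def force_master():
    """Mark every query issued inside the `with` block as master-only."""
    token = _force_master.set(True)
    try:
        yield
    finally:
        # Restore the previous value even if the block raises.
        _force_master.reset(token)


def choose_target(is_write):
    """Return which database a statement should be sent to."""
    if is_write or _force_master.get():
        return "master"
    return "slave"
```

A financial query would then be wrapped in `with force_master():`, while ordinary reads stay outside the block and keep going to a slave.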

II. Delay the query on the slave database

Since data synchronization between master and slave libraries requires a certain amount of time, one strategy is to delay querying data from slave libraries.

For example:

select sleep(1);
select * from `order` where order_id = 11111;

Before the actual business query, a sleep statement is executed first to give the slave a buffer period in which to synchronize.

Because this is a one-size-fits-all approach, performance degrades sharply in high-concurrency scenarios, so this solution is generally not recommended.

III. Check whether the slave is lagging, then choose the master or the slave

I previously wrote an article, "JD.com first-round interview: what are the pitfalls of MySQL master-standby delay? Master-standby switchover strategy", which covered what master-standby delay is and its common causes.

Option 1:

Execute the command show slave status on the slave.

Check the value of Seconds_Behind_Master (in seconds). If it is 0, there is no delay between master and slave.
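As a rough sketch of Option 1, the check can be written against one row of the show slave status result. The dict-shaped `status` argument and the function names are assumptions for illustration; how the row is fetched depends on the client library in use.

```python
def slave_is_caught_up(status):
    """Return True when Seconds_Behind_Master reports no delay.

    `status` is assumed to be one row of SHOW SLAVE STATUS as a dict.
    """
    lag = status.get("Seconds_Behind_Master")
    # The field is NULL (None here) when replication is not running,
    # so a missing value is treated as "lagging".
    if lag is None:
        return False
    return int(lag) == 0


def pick_by_seconds_behind(status):
    """Hypothetical helper: read from the slave only when it has no lag."""
    return "slave" if slave_is_caught_up(status) else "master"
```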

Option 2:

Compare the binlog file positions of the master and the slave.

Again execute show slave status; the result contains several key parameters.

Master_Log_File: the latest master binlog file that has been read.

Read_Master_Log_Pos: the position in the latest master binlog file that has been read.

Relay_Master_Log_File: the latest master binlog file that the slave has executed.

Exec_Master_Log_Pos: the position in that file up to which the slave has executed.

Compare them in pairs: Master_Log_File with Relay_Master_Log_File, and Read_Master_Log_Pos with Exec_Master_Log_Pos. If both pairs are equal, the slave has executed everything it has received.
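The pairwise comparison in Option 2 might look like the sketch below. It again assumes the show slave status row is available as a dict; the function name `positions_match` is hypothetical.

```python
def positions_match(status):
    """Compare what the slave has read with what it has executed.

    `status` is assumed to be one row of SHOW SLAVE STATUS as a dict.
    """
    same_file = status["Master_Log_File"] == status["Relay_Master_Log_File"]
    # Positions are converted to int in case the driver returns strings.
    same_pos = int(status["Read_Master_Log_Pos"]) == int(status["Exec_Master_Log_Pos"])
    return same_file and same_pos
```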

Option 3:

Compare GTID sets.

Auto_Position=1: the GTID protocol is used between master and slave.

Retrieved_Gtid_Set: the GTID set of all binlog events the slave has received.

Executed_Gtid_Set: the GTID set of all transactions the slave has finished executing.

Compare the values of Retrieved_Gtid_Set and Executed_Gtid_Set for equality.
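A sketch of the GTID comparison in Option 3 follows. Because the two fields can differ in letter case and line breaks, the example parses each set before comparing. The parser handles only the plain `uuid:start-end` form, and both function names are hypothetical. The strict equality check mirrors the description above and is an assumption about what counts as "in sync".

```python
def parse_gtid_set(text):
    """Parse 'uuid:1-5:7,uuid2:1-3' into {uuid: {(1, 5), (7, 7)}, ...}."""
    result = {}
    # SHOW SLAVE STATUS may put a line break after each comma.
    for item in text.replace("\n", "").split(","):
        item = item.strip()
        if not item:
            continue
        uuid, *intervals = item.split(":")
        ranges = result.setdefault(uuid.lower(), set())
        for interval in intervals:
            start, _, end = interval.partition("-")
            # A single transaction number 'n' is stored as (n, n).
            ranges.add((int(start), int(end or start)))
    return result


def gtid_sets_equal(status):
    """Return True when the received and executed GTID sets are equal."""
    if int(status.get("Auto_Position", 0)) != 1:
        raise ValueError("GTID auto-positioning is not enabled")
    received = parse_gtid_set(status["Retrieved_Gtid_Set"])
    executed = parse_gtid_set(status["Executed_Gtid_Set"])
    return received == executed
```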

Before executing a business SQL statement, first determine whether the slave has synchronized the latest data, and then decide whether to query the master or the slave.

Disadvantages:

Whichever of the above schemes is adopted, if the master is written to frequently, the slave can never fully catch up with the master, and read traffic will always land on the master.

1. Is there any solution to this problem?

This problem is the same as the one faced by MQ message queues, which need both high throughput and ordering. Globally there is no solution, but narrowing the scope makes it much easier: we can guarantee that the messages within a single partition are ordered.

Back to data synchronization between master and slave: whichever record is queried on the slave, we only need to make sure that the binlog of the corresponding earlier write has been synchronized. We do not need to care whether all transactions of the master have been synchronized to the slave.

Doesn't the problem immediately become much simpler?

IV. Wait on the slave for a master binlog position

Execute the following command on the slave. Normally it returns a non-negative integer M, indicating how many transactions the slave executed from the start of the command until it had applied the binlog up to the given position (0 if the slave had already passed that position). It returns -1 if the timeout is exceeded and NULL if the replication thread is not running.

select master_pos_wait(file, pos[, timeout]);

file and pos are the binlog file name and position on the master.

timeout is optional; when set to N, the function waits for at most N seconds.
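The call can be wrapped in a small helper, sketched below under two assumptions: `cursor` is a DB-API cursor on the slave that returns rows as tuples and uses `%s` placeholders, and the function name `wait_for_master_pos` is hypothetical. The file name and position passed in must come from the master after the write.

```python
def wait_for_master_pos(cursor, file, pos, timeout=1):
    """Return True if the slave reached (file, pos) within `timeout` seconds.

    `cursor` is assumed to be a DB-API cursor opened on the slave.
    """
    cursor.execute("SELECT MASTER_POS_WAIT(%s, %s, %s)", (file, pos, timeout))
    (result,) = cursor.fetchone()
    # NULL (None): replication is not running or the arguments are invalid.
    # -1: the timeout expired before the position was reached.
    return result is not None and result >= 0
```

If the helper returns False, the read would fall back to the master.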

Disadvantages:

The result returned by master_pos_wait cannot be associated with the specific row that a request operates on. For each read request, the slave still cannot confirm whether that particular data has been synchronized, so this solution is not practical.

V. Compare GTID

Execute the following query command on the slave:

select wait_for_executed_gtid_set(gtid_set, 1);

It blocks until the transactions executed on the slave contain gtid_set, then returns 0.

On timeout (1 second here), it returns 1.

Starting with MySQL 5.7.6, the GTID of a transaction can be returned to the client after an update transaction has executed. To enable this, set the parameter session_track_gtids to OWN_GTID, then call the API mysql_session_track_get_first and parse the GTID from its return value.

Processing flow:

The client initiates a write SQL operation; after the master executes it successfully, the GTID of the transaction is returned.

When initiating a read SQL operation, first execute select wait_for_executed_gtid_set(gtid_set, 1) on the slave.

If it returns 0, the slave has synchronized the data, and the query can be executed on the slave.

Otherwise, the query is executed on the master.
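The read side of this flow can be sketched as below. The function name `read_with_gtid` is hypothetical, and both cursors are assumed to be DB-API cursors that return tuples and accept `%s` placeholders. Obtaining the GTID after the write is left out, since it depends on session tracking support in the client library.

```python
def read_with_gtid(slave_cursor, master_cursor, gtid, sql, params=(), timeout=1):
    """Run `sql` on the slave if it has applied `gtid`, else on the master."""
    slave_cursor.execute(
        "SELECT WAIT_FOR_EXECUTED_GTID_SET(%s, %s)", (gtid, timeout)
    )
    (status,) = slave_cursor.fetchone()
    # 0 means the slave has executed the transaction; 1 means it timed out.
    cursor = slave_cursor if status == 0 else master_cursor
    cursor.execute(sql, params)
    return cursor.fetchall()
```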

Disadvantages:

As with master_pos_wait above, if the write operation and the read operation do not share a context, the GTID cannot be passed from one to the other, so this scheme is not practical.

VI. Introduce caching middleware

In high-concurrency systems, caching is widely used as a powerful tool for performance optimization. We can consider introducing a cache as a buffer.

Process:

The client executes a write SQL statement against the master.

The corresponding data in the cache is deleted synchronously.

When the client reads data, it loads from the cache first.

If the data is not in the cache, the query is forced to the master and the result is used to warm the cache.
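The four steps above follow the shape of a delete-on-write, load-on-miss cache. In the sketch below, plain dicts stand in for both the cache (for example Redis) and the master database, and the class name `CachedReader` is hypothetical.

```python
class CachedReader:
    """Hypothetical wrapper: delete on write, load from the master on a miss."""

    def __init__(self, cache, master):
        self.cache = cache    # stand-in for a K-V cache
        self.master = master  # stand-in for the master database

    def write(self, key, value):
        # Step 1: write to the master.
        self.master[key] = value
        # Step 2: synchronously delete the cached copy.
        self.cache.pop(key, None)

    def read(self, key):
        # Step 3: prefer the cache.
        if key in self.cache:
            return self.cache[key]
        # Step 4: on a miss, read the master and warm the cache.
        value = self.master.get(key)
        if value is not None:
            self.cache[key] = value
        return value
```

Because misses are served by the master rather than a slave, a read that follows a write never sees stale replicated data for that key.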

Disadvantages:

A cache is K-V storage, which suits scenarios with simple query conditions. Complex queries still have to go to the slave database.

VII. Data sharding

Take the Redis Cluster model as a reference: the cluster topology is usually 3 masters and 3 slaves, and the master nodes handle both writes and reads.

Horizontal sharding allows the data to scale horizontally. Because each node is an independent server, the throughput of the whole cluster improves.

1. Applying this to the database

The common solution is to split the data across databases and tables. Every read and write goes to a shard table on a master database, and the slave database is used only for data backup. When a master fails, a master-slave switchover keeps the cluster highly available.
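A very small sketch of shard routing is shown below. It assumes a numeric order_id, a fixed count of four shards, and shard tables named `order_0` to `order_3`; all three are illustrative choices, not details from the article.

```python
def shard_for(order_id, shard_count=4):
    """Map a numeric order_id to a shard index by simple modulo."""
    return order_id % shard_count


def table_for(order_id, shard_count=4):
    """Return the hypothetical shard table that holds this order."""
    return f"order_{shard_for(order_id, shard_count)}"
```

Both reads and writes for a given order_id resolve to the same shard table on a master, so replication delay no longer affects the read path.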
