Analysis of double write consistency Scheme for distributed Database and Cache

2025-01-16 Update From: SLTechnology News&Howtos shulou


Shulou(Shulou.com)06/01 Report--

First, let's be clear: in theory, setting an expiration time on the cache is the scheme that guarantees eventual consistency. Under this scheme, every entry stored in the cache carries an expiration time. All write operations go to the database, and cache updates are only best-effort. That is, if the database write succeeds but the cache update fails, then once the expiration time is reached, subsequent read requests will naturally read the new value from the database and backfill the cache. The strategies discussed below therefore do not depend on setting an expiration time for the cache.
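As a minimal sketch of this baseline scheme (in-memory maps stand in for Redis and the database; the class and method names are illustrative, not from the article), the write path touches only the database, and the read path backfills the cache with a TTL:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Sketch of TTL-based eventual consistency: writes are database-only,
// reads backfill the cache and entries die when their TTL expires.
public class TtlCache {
    static class Entry {
        final Object value;
        final long expiresAt;
        Entry(Object value, long ttlMillis) {
            this.value = value;
            this.expiresAt = System.currentTimeMillis() + ttlMillis;
        }
    }

    final Map<String, Entry> cache = new ConcurrentHashMap<>();
    final Map<String, Object> database = new ConcurrentHashMap<>();
    private final long ttlMillis;

    public TtlCache(long ttlMillis) { this.ttlMillis = ttlMillis; }

    // Writes go to the database only; a failed or skipped cache update
    // is tolerated because the stale entry expires on its own.
    public void write(String key, Object value) {
        database.put(key, value);
    }

    // Reads serve unexpired cache entries; otherwise reload from the
    // database and backfill the cache with a fresh TTL.
    public Object read(String key) {
        Entry e = cache.get(key);
        if (e != null && System.currentTimeMillis() < e.expiresAt) {
            return e.value;
        }
        Object fresh = database.get(key);
        cache.put(key, new Entry(fresh, ttlMillis));
        return fresh;
    }
}
```

Note the window this leaves open: between a write and the next expiry, reads can still see the old cached value, which is exactly why the article calls this only eventual consistency.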

Here, we discuss three update strategies:

Update the database before updating the cache

Delete the cache before updating the database

Update the database before deleting the cache

And please don't ask why the strategy of updating the cache before updating the database is not on the list.

(1) update the database first, and then update the cache

This proposal is generally opposed by everyone. Why? There are two reasons.

Reason 1 (thread safety perspective)

If request A and request B both perform update operations at the same time, the following interleaving can occur:

(1) Thread A updates the database

(2) Thread B updates the database

(3) Thread B updates the cache

(4) Thread A updates the cache

Request A's cache update should have landed before request B's, but because of network delays and similar factors, B updated the cache before A did. The cache is left holding A's stale value, i.e. dirty data, so this strategy is ruled out.
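The interleaving above can be replayed deterministically (plain maps stand in for the database and cache; all names are illustrative) to show the end state it leaves behind:

```java
import java.util.HashMap;
import java.util.Map;

// Deterministic replay of the race: threads A and B each update the
// database then the cache, but B's cache write lands before A's,
// so the cache ends up holding A's stale value while the database
// holds B's newer value.
public class UpdateRace {
    public static Map<String, String> run() {
        Map<String, String> db = new HashMap<>();
        Map<String, String> cache = new HashMap<>();
        db.put("k", "A");      // (1) thread A updates the database
        db.put("k", "B");      // (2) thread B updates the database
        cache.put("k", "B");   // (3) thread B updates the cache
        cache.put("k", "A");   // (4) thread A updates the cache, too late
        Map<String, String> state = new HashMap<>();
        state.put("db", db.get("k"));
        state.put("cache", cache.get("k"));
        return state;
    }
}
```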

Reason 2 (business scenario perspective)

There are two points as follows:

(1) In a business scenario with many database writes and few reads, the cache gets updated frequently before the data is ever read, which wastes performance.

(2) If the value written to the cache is not the raw database value but the result of a series of complex calculations, then recomputing the cached value after every single database write is likewise a waste of performance. In such cases, deleting the cache is clearly more appropriate.

Next comes the most controversial question: should we delete the cache first and then update the database, or update the database first and then delete the cache?

(2) delete the cache before updating the database

This scheme leads to inconsistency when a write and a read run concurrently. Suppose request A performs an update while request B performs a query. The following can happen:

(1) Request A deletes the cache

(2) Request B queries the cache and finds nothing

(3) Request B queries the database and gets the old value

(4) Request B writes the old value to the cache

(5) Request A writes the new value to the database

This leaves the cache and database inconsistent. Moreover, if no expiration policy is set on the cache, the data stays dirty forever.

So how do we solve this? Use the delayed double delete strategy.

The pseudo code is as follows

public void write(String key, Object data) throws InterruptedException {
    redis.delKey(key);
    db.updateData(data);
    Thread.sleep(1000);
    redis.delKey(key);
}

Described in words:

(1) Delete the cache first

(2) Then write the database (these two steps are the same as before)

(3) Sleep for 1 second, then delete the cache again

This way, any dirty cache data produced within that 1 second gets deleted.

So how is this 1 second determined? How long should the sleep actually be?

Readers should evaluate how long their own project's read-data business logic takes. The write request's sleep time should then be that read time plus a few hundred milliseconds. The goal is to guarantee that the read request has finished, so the write request can remove any dirty cache data the read request left behind.

What if MySQL's read-write separation (master-slave) architecture is used?

In that case, the cause of data inconsistency is as follows. Again there are two requests: A performs an update and B performs a query.

(1) Request A deletes the cache

(2) Request A writes the new value to the (master) database

(3) Request B queries the cache and finds no value

(4) Request B queries the slave database; master-slave synchronization has not yet completed, so it gets the old value

(5) Request B writes the old value to the cache

(6) Master-slave synchronization completes, and the slave now holds the new value

This is the cause of the inconsistency. The fix is still the delayed double delete strategy, but the sleep time becomes the master-slave synchronization delay plus a few hundred milliseconds.

But doesn't this synchronous deletion strategy reduce throughput?

Then make the second delete asynchronous: start a separate thread and perform the delete there. The write request no longer has to sleep before returning, which restores throughput.
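A hedged sketch of this asynchronous variant, using a `ScheduledExecutorService` to replace the inline sleep (in-memory maps stand in for Redis and the database; the delay is a constructor parameter here purely for illustration):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Delayed double delete with an asynchronous second delete: the write
// path returns immediately, and a scheduler performs the second delete
// after the configured delay.
public class AsyncDoubleDelete {
    final Map<String, Object> cache = new ConcurrentHashMap<>();
    final Map<String, Object> db = new ConcurrentHashMap<>();
    private final long delayMillis;
    // Daemon thread so the scheduler never keeps the JVM alive.
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor(r -> {
                Thread t = new Thread(r);
                t.setDaemon(true);
                return t;
            });

    public AsyncDoubleDelete(long delayMillis) {
        this.delayMillis = delayMillis;
    }

    // First delete, database update, then schedule the second delete
    // instead of sleeping inline.
    public void write(String key, Object data) {
        cache.remove(key);
        db.put(key, data);
        scheduler.schedule(() -> cache.remove(key),
                delayMillis, TimeUnit.MILLISECONDS);
    }
}
```

The delay should still be read-time plus a few hundred milliseconds, as discussed above; only the waiting has moved off the write request's thread.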

What if the second delete fails?

This is a very good question, because if the second delete fails, the following can happen. Again there are two requests: A performs an update and B performs a query. For simplicity, assume a single database:

(1) Request A deletes the cache

(2) Request B queries the cache and finds nothing

(3) Request B queries the database and gets the old value

(4) Request B writes the old value to the cache

(5) Request A writes the new value to the database

(6) Request A attempts to delete the cache value that B wrote, but the delete fails

In other words, if the second cache delete fails, the cache and database end up inconsistent again.

How to solve the problem?

For the concrete solution, let's first look at the blogger's analysis of update strategy (3).

(3) update the database before deleting the cache

First, some background. There is a well-known cache update routine called the "Cache-Aside pattern". It works as follows:

Miss: the application fetches data from the cache first; if it is not there, it fetches the data from the database and, on success, puts it into the cache.

Hit: the application fetches the data from the cache and returns it.

Update: save the data in the database first, and then invalidate the cache after success.
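The three cases above can be sketched in a few lines (in-memory maps stand in for the cache and database; the class and field names are illustrative):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Sketch of the Cache-Aside pattern: reads try the cache and backfill
// on a miss; updates write the database first, then invalidate the
// cache entry.
public class CacheAside {
    final Map<String, Object> cache = new ConcurrentHashMap<>();
    final Map<String, Object> db = new ConcurrentHashMap<>();

    // Hit and miss paths: cache first, database on a miss, backfill.
    public Object read(String key) {
        Object v = cache.get(key);
        if (v != null) return v;          // hit
        v = db.get(key);                   // miss: go to the database
        if (v != null) cache.put(key, v);  // backfill on success
        return v;
    }

    // Update path: database first, then invalidate the cache.
    public void update(String key, Object value) {
        db.put(key, value);
        cache.remove(key);
    }
}
```

Note that `update` deletes rather than rewrites the cache entry, which is precisely the "update the database before deleting the cache" strategy this section analyzes.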

In addition, Facebook's paper "Scaling Memcache at Facebook" describes the same strategy: update the database first, then delete the cache.

Is this situation free of concurrency problems?

Not entirely. Suppose there are two requests: A performs a query and B performs an update. The following can occur:

(1) The cache happens to expire

(2) Request A queries the database and gets the old value

(3) Request B writes the new value to the database

(4) Request B deletes the cache

(5) Request A writes the old value it read into the cache

Ok, if this happens, dirty data will indeed occur.

But what are the odds of this happening?

For the above to happen, a built-in precondition must hold: the database write in step (3) must take less time than the database read in step (2), so that step (4) can precede step (5). But think about it: database reads are much faster than database writes (that is part of why read-write separation exists in the first place; reads are faster and consume fewer resources). So step (3) taking less time than step (2) is very unlikely.

Suppose someone insists that even this unlikely case must be handled. What then?

How to solve the above concurrency problem?

First, setting an expiration time on the cache is one solution. Second, adopt the asynchronous delayed delete strategy given under strategy (2), which guarantees that the delete happens after the read request has completed.

Are there any other causes of inconsistency?

Yes. This is a problem shared by both update strategy (2) and update strategy (3): if the cache delete itself fails, inconsistency results. For example, a write request updates the database but then fails to delete the cache, leaving cache and database inconsistent. This is also the question left open at the end of strategy (2).

How to solve?

Provide a guaranteed retry mechanism. Here are two solutions.

Option 1:


The process is as follows

(1) Update the database

(2) The cache delete fails for some reason

(3) Send the key to be deleted to a message queue

(4) Consume the message and obtain the key to be deleted

(5) Keep retrying the delete operation until it succeeds
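The retry loop in steps (3)-(5) can be sketched as follows; an in-process queue stands in for the real message queue, `cacheDown` simulates a cache outage, and all names are illustrative rather than a specific client API:

```java
import java.util.Map;
import java.util.Queue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentLinkedQueue;

// Option 1 sketch: a failed cache delete pushes the key onto a queue,
// and a consumer retries the delete until it succeeds.
public class DeleteRetry {
    final Map<String, Object> cache = new ConcurrentHashMap<>();
    private final Queue<String> retryQueue = new ConcurrentLinkedQueue<>();
    volatile boolean cacheDown = false;  // simulates a cache outage

    // Placeholder for the real cache client's delete call; throws while
    // the simulated outage is active.
    void deleteFromCache(String key) {
        if (cacheDown) throw new RuntimeException("cache unavailable");
        cache.remove(key);
    }

    // Steps (2)-(3): try the delete; on failure, hand the key to the queue.
    public void deleteOrEnqueue(String key) {
        try {
            deleteFromCache(key);
        } catch (RuntimeException e) {
            retryQueue.offer(key);
        }
    }

    // Steps (4)-(5): take one queued key and retry until the delete succeeds.
    public void consumeOnce() {
        String key = retryQueue.poll();
        if (key == null) return;
        while (true) {
            try {
                deleteFromCache(key);
                return;
            } catch (RuntimeException e) {
                // in a real consumer: back off or re-enqueue before retrying
            }
        }
    }
}
```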

However, this scheme has a drawback: it intrudes heavily into the business-line code. Hence Option 2: start a subscriber that subscribes to the database's binlog and extracts the data that needs to be operated on. A separate program then takes this information from the subscriber and deletes the cache.

Option 2:

The process is as follows:

(1) Update the database

(2) The database writes the operation to its binlog

(3) The subscriber extracts the required data and key

(4) A separate piece of non-business code obtains this information

(5) It attempts to delete the cache and finds the delete failed

(6) It sends this information to a message queue

(7) The data is retrieved from the message queue and the delete is retried
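At a high level, the subscriber side of Option 2 can be sketched as below. Note that the `BinlogEvent` type and `onEvent` handler are illustrative stand-ins, not the real API of canal or any other binlog client:

```java
import java.util.Map;
import java.util.Queue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentLinkedQueue;

// Option 2 sketch: a subscriber receives change-log events, extracts
// the affected key, tries the cache delete, and on failure hands the
// key to a message queue for retry (steps (3)-(6)).
public class BinlogInvalidator {
    // Illustrative change-log event carrying the key that was written.
    static class BinlogEvent {
        final String key;
        BinlogEvent(String key) { this.key = key; }
    }

    final Map<String, Object> cache = new ConcurrentHashMap<>();
    final Queue<String> retryQueue = new ConcurrentLinkedQueue<>();
    volatile boolean cacheDown = false;  // simulates a failed delete

    public void onEvent(BinlogEvent event) {
        try {
            if (cacheDown) throw new RuntimeException("cache unavailable");
            cache.remove(event.key);     // step (5): delete the cache
        } catch (RuntimeException e) {
            retryQueue.offer(event.key); // step (6): hand off for retry
        }
    }
}
```

Compared with Option 1, the business code never touches the invalidation logic at all; it only writes the database, and the binlog drives everything else.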

Note: for the binlog subscription above, MySQL has a ready-made middleware called canal that can subscribe to binlog logs. As for Oracle, the blogger does not know of an off-the-shelf equivalent. For the retry mechanism, the blogger uses a message queue; if the consistency requirement is not very high, simply starting another thread in the program that retries periodically is also fine. These pieces can be combined flexibly; the point is to provide an idea.

Summary

This article is a summary of the consistency schemes currently circulating on the Internet. As for the proposal of deleting the cache before updating the database while also maintaining an in-memory queue, the blogger looked at it and found the implementation extremely complex and unnecessary, so it is not covered here. Finally, I hope you got something out of this.
