How to analyze the consistency of double writes between database and cache

2025-07-09 Update From: SLTechnology News&Howtos shulou

Shulou(Shulou.com)06/01 Report--

This article looks at how to analyze double-write consistency between a database and a cache, a topic that many people find hard to understand. The common approaches and their trade-offs are summarized below.

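Caches are widely used in projects because they offer high concurrency and high performance. A read normally checks the cache first: on a hit the cached value is returned, and on a miss the value is read from the database, written to the cache, and then returned.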
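The read path can be sketched in a few lines of Python. This is a minimal illustration, not a production implementation: the two dictionaries stand in for a real cache and database, and the names `cache`, `database`, and `read` are assumptions made for this example.

```python
from typing import Dict, Optional

# In-memory stand-ins for a real cache and database (hypothetical).
cache: Dict[str, str] = {}
database: Dict[str, str] = {"user:1": "alice"}


def read(key: str) -> Optional[str]:
    """Try the cache first and fall back to the database on a miss."""
    if key in cache:              # cache hit: return the cached value
        return cache[key]
    value = database.get(key)     # cache miss: read from the database
    if value is not None:
        cache[key] = value        # populate the cache for later readers
    return value
```

The first call to `read("user:1")` goes to the database and fills the cache; later calls are served from the cache until the entry is deleted or expires.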

There are three requirements for double write consistency:

The cache must not serve dirty data

The cache may serve stale data, but it must reach eventual consistency with the database within a tolerable time

That tolerable time should be as short as possible

To satisfy all three at once, read and write requests can be serialized through an in-memory queue, which guarantees that no inconsistency can occur. However, serialization greatly reduces system throughput, so several times more machines than usual are needed to support the same online traffic.
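A rough sketch of the serialization idea, assuming a single worker thread that drains one in-memory queue. The names `requests`, `worker`, and `submit` are hypothetical, and the dictionaries again stand in for a real cache and database.

```python
import queue
import threading

cache = {}
database = {}
requests = queue.Queue()   # every read and write goes through this one queue


def worker():
    """Handle requests strictly one at a time, so no interleaving is possible."""
    while True:
        op, key, value, reply = requests.get()
        if op == "stop":
            break
        if op == "write":
            database[key] = value
            cache[key] = value
            reply.put(None)
        else:  # "read"
            if key not in cache and key in database:
                cache[key] = database[key]
            reply.put(cache.get(key))


def submit(op, key, value=None):
    """Enqueue a request and block until the worker has processed it."""
    reply = queue.Queue(maxsize=1)
    requests.put((op, key, value, reply))
    return reply.get()


threading.Thread(target=worker, daemon=True).start()
submit("write", "stock", 10)
print(submit("read", "stock"))
```

Because every caller waits on the same worker, throughput is limited to what one thread can process, which is the cost described above.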

So, here, we discuss three common methods:

Update the database before updating the cache

Delete the cache before updating the database

Update the database before deleting the cache

1. Update the database before updating the cache

This approach is widely discouraged, for two main reasons:

Reason 1: thread safety perspective.

If request A and request B both perform an update at the same time, the following can happen:

Thread A updates the database

Thread B updates the database

Thread B updates the cache

Thread A updates the cache

Request A's cache update should land before request B's, but because of network delays or similar causes, B updates the cache before A does. The cache then holds A's value while the database holds B's. This is dirty data, so the approach is not considered.

By the same token, the scheme of "update the cache before updating the database" also causes dirty data, so it is not considered.
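The interleaving can be replayed deterministically. The sketch below is only an illustration: it uses plain dictionaries in place of a real database and cache, and the function name and the key `price` are made up for the example.

```python
def replay_concurrent_updates():
    """Replay the four steps in the problematic order described above."""
    database = {}
    cache = {}
    database["price"] = "A"   # thread A updates the database
    database["price"] = "B"   # thread B updates the database
    cache["price"] = "B"      # thread B updates the cache
    cache["price"] = "A"      # thread A updates the cache, late
    return database, cache


database, cache = replay_concurrent_updates()
print(database["price"], cache["price"])  # prints "B A": the cache disagrees with the database
```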

Reason 2: business scenario perspective.

There are two points as follows:

If the workload has many database writes and few reads, this approach updates the cache frequently with values that are never read, which wastes performance.

If the cached value is not the raw value written to the database but the result of a series of complex calculations, then that value has to be recomputed after every database write, which is also a waste of performance. In this case, deleting the cache is clearly the better fit.

If you must update the cache, consider adding a version number to the cached data so that an older write cannot overwrite a newer one.
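One possible way to use a version number is sketched below. It assumes each database write yields a monotonically increasing version (for example a row version column); the function `set_if_newer` and the tuple layout are hypothetical. In a real cache the compare-and-write would also have to be atomic, which this single-threaded sketch does not show.

```python
cache = {}   # key -> (version, value)


def set_if_newer(key, version, value):
    """Write to the cache only if this version is newer than the cached one."""
    current = cache.get(key)
    if current is not None and current[0] >= version:
        return False              # a newer (or equal) version is already cached
    cache[key] = (version, value)
    return True


# Same interleaving as in Reason 1: B's cache update arrives before A's.
set_if_newer("price", 2, "B")     # thread B (database version 2)
set_if_newer("price", 1, "A")     # thread A (database version 1) is rejected
print(cache["price"])             # prints (2, 'B')
```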

2. Delete the cache before updating the database

This approach can also lead to inconsistency. Suppose request A performs an update while request B performs a query at the same time. The following can happen:

Request A, the write, deletes the cache

Request B queries the cache and finds that the entry does not exist

Request B queries the database and gets the old value

Request B writes the old value to the cache

Request A writes the new value to the database

The result is an inconsistency: the cache holds the old value while the database holds the new one. Moreover, if no expiration policy is set for the cache, the data stays dirty indefinitely.
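The same sequence can be replayed step by step. This is a deterministic sketch with dictionaries standing in for the real stores; the function name and the key `k` are assumptions for the example.

```python
def replay_delete_then_update():
    """Replay the write (A) and the query (B) in the order described above."""
    database = {"k": "old"}
    cache = {"k": "old"}

    cache.pop("k", None)         # request A deletes the cache
    value = cache.get("k")       # request B finds no cache entry
    if value is None:
        value = database["k"]    # request B reads the old value from the database
        cache["k"] = value       # request B writes the old value to the cache
    database["k"] = "new"        # request A writes the new value to the database
    return database, cache


database, cache = replay_delete_then_update()
print(database["k"], cache["k"])  # prints "new old"
```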

Solution:

Delete the cache first

Then write to the database (these two steps are the same as before)

Sleep for a certain amount of time (for example, 1 second or 200 ms) and delete the cache again, so that any dirty data written to the cache in the meantime is removed.

However, this solution still has a large impact on throughput, because it puts the writing thread to sleep.
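A minimal sketch of the delete, write, sleep, delete-again sequence follows. The function name, the default delay, and the `after_db_write` hook are hypothetical; the hook exists only so the example can simulate a reader that repopulates the cache with an old value while the writer is in progress.

```python
import time

database = {"k": "old"}
cache = {"k": "old"}


def write_with_delayed_double_delete(key, value, delay_seconds=0.2, after_db_write=None):
    cache.pop(key, None)         # 1. delete the cache
    database[key] = value        # 2. write the database
    if after_db_write is not None:
        after_db_write()         # illustration hook: a concurrent reader acts here
    time.sleep(delay_seconds)    # 3. sleep; this blocks the writing thread
    cache.pop(key, None)         # 4. delete the cache again


def stale_reader():
    """Simulate a reader that fetched "old" earlier and writes it back late."""
    cache["k"] = "old"


write_with_delayed_double_delete("k", "new", delay_seconds=0.01, after_db_write=stale_reader)
print(database["k"], "k" in cache)  # prints "new False"
```

The sleep in step 3 is where the throughput cost comes from: every write holds its thread for the full delay.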

3. Update the database before deleting the cache

This kind of scheme is adopted by many projects. Let's see whether it is safe or not.

Suppose there are two requests: request A performs a query and request B performs an update. The following can happen:

The cache entry has just expired

Request A queries the database and gets the old value

Request B writes the new value to the database

Request B deletes the cache

Request A writes the old value it read to the cache

This produces dirty data, but only if request B's database write completes before request A, which read earlier, gets to write the cache. That assumes the database write is faster than the read. In practice, database reads in a project are much faster than writes, so this interleaving is unlikely.
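The write path for this approach, together with a replay of the unlikely interleaving, might look like the following. It is a sketch under the same assumptions as before: dictionaries replace the real stores and all function names are made up for the example.

```python
database = {"k": "v1"}
cache = {}


def read(key):
    """Read through the cache, filling it from the database on a miss."""
    if key not in cache and key in database:
        cache[key] = database[key]
    return cache.get(key)


def update(key, value):
    database[key] = value        # 1. update the database first
    cache.pop(key, None)         # 2. then delete the cached copy


def replay_rare_race():
    """Replay the interleaving above: A's slow read straddles B's whole update."""
    db = {"k": "old"}
    c = {}                       # the cache entry has just expired
    seen_by_a = db["k"]          # request A reads the old value
    db["k"] = "new"              # request B writes the new value
    c.pop("k", None)             # request B deletes the cache
    c["k"] = seen_by_a           # request A writes the old value to the cache
    return db, c


read("k")                        # fills the cache with "v1"
update("k", "v2")                # writes the database, then invalidates the cache
print(read("k"))                 # prints "v2"
```

In the normal flow the next read after `update` misses the cache and reloads the new value; only the replayed race leaves the old value behind.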

The choice is between guaranteeing consistency with a protocol such as 2PC or Paxos, and accepting the race while reducing the probability of dirty data under concurrency as far as possible. Probably because 2PC is too slow and Paxos is too complex, Facebook chose the latter, which is this third approach.

What if deleting the cache fails?

Start a subscriber that subscribes to the database binlog and extracts the data that needs to be acted on. In the application, start a separate program that takes this information from the subscriber and deletes the corresponding cache entries.
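The consumer side of this idea could be sketched as follows. Everything here is an assumption for illustration: the event format (`{"key": ...}`), the retry limit, and the use of `ConnectionError` to signal a failed delete are not from the original, and a real binlog subscriber would come from a separate tool rather than a Python list.

```python
from collections import deque


def process_binlog_events(events, delete_from_cache, max_attempts=3):
    """Delete the cache entry for each change event, retrying failed deletes."""
    pending = deque((event, 1) for event in events)
    failed = []
    while pending:
        event, attempt = pending.popleft()
        try:
            delete_from_cache(event["key"])
        except ConnectionError:
            if attempt < max_attempts:
                pending.append((event, attempt + 1))   # requeue for another try
            else:
                failed.append(event)                   # give up after max_attempts
    return failed


cache = {"user:1": "stale"}
calls = {"count": 0}


def flaky_delete(key):
    """Hypothetical cache delete that fails on its first call only."""
    calls["count"] += 1
    if calls["count"] == 1:
        raise ConnectionError("cache unavailable")
    cache.pop(key, None)


failed = process_binlog_events([{"key": "user:1"}], flaky_delete)
print(failed, cache)  # prints "[] {}"
```

Keeping the delete in a separate consumer means the business code does not have to retry on its own: a failed delete is simply attempted again when the event is reprocessed.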
