
How to ensure the consistency of double writes between cache and database


This article focuses on how to ensure double-write consistency between the cache and the database. Interested friends may wish to take a look: the approach introduced here is simple, fast, and practical. Let's walk through it together.

As long as you use a cache, you have dual storage: data lives in both the cache and the database, and both get written. And wherever there are double writes, there is a data consistency problem. So how do you solve it?

Analysis of interview questions

Generally speaking, if your system can tolerate the cache occasionally being slightly inconsistent with the database, that is, if it does not strictly require "cache + database" consistency, it is best not to use the serialization scheme, in which read and write requests are serialized into a memory queue.

Serialization guarantees that no inconsistency will occur, but it also sharply reduces the system's throughput: you may need several times more machines than normal to support the same volume of online requests.

Cache Aside Pattern

The most classic cache + database read-write pattern is the Cache Aside Pattern.

When reading, read the cache first; if the value is not in the cache, read the database, put the result into the cache, and return the response.

When updating, update the database first, then delete the cache.
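As a minimal sketch, the pattern might look like the following in Java. The Cache and Database interfaces here are hypothetical stand-ins for a real cache client (such as Redis) and a data access layer, not any particular library:

```java
public class CacheAside {
    private final Cache cache;  // hypothetical cache client
    private final Database db;  // hypothetical database access layer

    public CacheAside(Cache cache, Database db) {
        this.cache = cache;
        this.db = db;
    }

    // Read path: try the cache first; on a miss, load from the database,
    // populate the cache, and return the value.
    public Object read(String key) {
        Object value = cache.get(key);
        if (value == null) {
            value = db.query(key);
            cache.put(key, value);
        }
        return value;
    }

    // Update path: write the database first, then delete (not update) the cache.
    public void update(String key, Object newValue) {
        db.update(key, newValue);
        cache.delete(key);
    }

    interface Cache { Object get(String k); void put(String k, Object v); void delete(String k); }
    interface Database { Object query(String k); void update(String k, Object v); }
}
```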

Why delete the cache instead of updating it?

The reason is simple: in many complex caching scenarios, the cached value is not just a value fetched directly from the database.

For example, when a field of one table is updated, the corresponding cache may need to query data from two other tables and run some computation before the latest cached value can be produced.

In addition, the cost of updating the cache can be high. Does every database modification really have to update the corresponding cache? That may be fine in some scenarios, but not in more complex ones where the cached value is computed. If the tables involved in a cache are modified frequently, the cache gets updated frequently too. But the question is: will this cache actually be read frequently?

Here is an example: suppose a table field involved in a cache is modified 20 or 100 times in a minute, so the cache is updated 20 or 100 times, but the cache is read only once in that minute, leaving a lot of cold data. If you simply delete the cache instead, it is recalculated at most once a minute, when it is actually read, and the overhead drops dramatically.

In fact, deleting the cache rather than updating it is an instance of lazy computation: don't redo a complex calculation every time, whether or not the result will be used; recalculate only when the value actually needs to be used. Frameworks like MyBatis and Hibernate embody the same idea with lazy loading. When you query a department, there is no need to also fetch its list of 1,000 employees every time: in 80% of cases, querying the department only requires the department's own information. Only when you actually access the employees does the framework query the 1,000 employee records from the database.

The most basic cache inconsistency problem and its solution

Problem: if you modify the database first and then delete the cache, and the cache deletion fails, you end up with new data in the database and old data in the cache, which is an inconsistency.

Solution: delete the cache first, then modify the database. If the database modification fails, the database still holds the old data and the cache is simply empty, so no inconsistency arises: a subsequent read finds nothing in the cache, reads the old value from the database, and repopulates the cache with it.
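A sketch of this reordered update path, reusing the hypothetical Cache and Database interfaces from the sketch above:

```java
// Delete-first variant: if the database update fails, the cache is simply
// empty, and the next read repopulates it with the unchanged old DB value.
public class DeleteFirstUpdater {
    private final CacheAside.Cache cache;
    private final CacheAside.Database db;

    public DeleteFirstUpdater(CacheAside.Cache cache, CacheAside.Database db) {
        this.cache = cache;
        this.db = db;
    }

    public void update(String key, Object newValue) {
        cache.delete(key);        // step 1: invalidate the cache
        db.update(key, newValue); // step 2: modify the database
    }
}
```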

Analysis of complex data inconsistencies

Suppose the data changes: we delete the cache first and then go to modify the database, but the modification has not yet completed. A read request comes in, finds the cache empty, queries the database, gets the old pre-modification value, and puts it into the cache. Then the writer finishes modifying the database.

And just like that, the data in the database and the data in the cache are different again.

Why does this cache problem appear in scenarios with hundreds of millions of requests and high concurrency?

This problem can only occur when a piece of data is read and written concurrently. If your concurrency is very low, especially the read concurrency, say 10,000 visits per day, then the inconsistent scenario just described will rarely occur. But if there are hundreds of millions of requests per day and tens of thousands of concurrent reads per second, then as long as there are even a few data-update requests per second, the database + cache inconsistency described above can occur.

The solution is as follows:

When updating data, route the operation by the data's unique identifier and send it to an internal JVM queue. When reading data, if the value is found to be missing from the cache, route a "re-read the data + refresh the cache" operation by the same unique identifier and send it to the same internal JVM queue.

Each queue has one worker thread, which takes the queued operations and executes them one by one, in order. Consider a data-change operation: the worker first deletes the cache and then updates the database, but suppose the update has not yet completed. If a read request arrives and finds the cache empty, it can first send a cache-refresh request into the queue, where it backs up behind the change, and then wait synchronously for the refresh to complete.

There is an optimization point here: stringing multiple cache-refresh requests for the same key together in one queue is pointless, so they can be filtered. If a refresh request for that key is already in the queue, there is no need to enqueue another one; just wait for the earlier refresh to complete.

After the worker thread for that queue finishes the database modification of the previous operation, it performs the next operation, the cache refresh, which reads the latest value from the database and writes it into the cache.

Meanwhile the read request polls the cache. If it finds a value within its waiting window, it returns that value directly; if it waits longer than a certain timeout, it falls back to reading the current (old) value directly from the database, as sketched below.
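Here is a minimal single-JVM sketch of this scheme, assuming hypothetical cacheGet/cachePut/cacheDelete and dbRead/dbUpdate stubs in place of real cache and database clients; the queue count and poll interval are arbitrary:

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.LinkedBlockingQueue;

public class SerializedCacheUpdater {

    private static final int QUEUE_COUNT = 20;      // arbitrary; tune per machine
    private static final long POLL_INTERVAL_MS = 5; // how often readers re-check the cache

    private final BlockingQueue<Runnable>[] queues;
    // Tracks keys that already have a cache-refresh task queued, so duplicate
    // refresh requests can be filtered out (the optimization described above).
    private final ConcurrentHashMap<String, Boolean> refreshPending = new ConcurrentHashMap<>();

    @SuppressWarnings("unchecked")
    public SerializedCacheUpdater() {
        queues = new BlockingQueue[QUEUE_COUNT];
        for (int i = 0; i < QUEUE_COUNT; i++) {
            queues[i] = new LinkedBlockingQueue<>();
            BlockingQueue<Runnable> q = queues[i];
            // One worker thread per queue executes its operations strictly in order.
            Thread worker = new Thread(() -> {
                while (true) {
                    try {
                        q.take().run();
                    } catch (InterruptedException e) {
                        return;
                    }
                }
            });
            worker.setDaemon(true);
            worker.start();
        }
    }

    // Route every operation on the same key to the same queue.
    private BlockingQueue<Runnable> queueFor(String key) {
        return queues[Math.floorMod(key.hashCode(), QUEUE_COUNT)];
    }

    // Write path: delete the cache, then update the database, serialized per key.
    public void update(String key, Object newValue) {
        queueFor(key).offer(() -> {
            cacheDelete(key);
            dbUpdate(key, newValue);
        });
    }

    // Read path: on a cache miss, enqueue one refresh task (deduplicated),
    // then poll the cache until a value appears or the timeout expires.
    public Object read(String key, long timeoutMs) throws InterruptedException {
        Object value = cacheGet(key);
        if (value != null) return value;
        if (refreshPending.putIfAbsent(key, Boolean.TRUE) == null) {
            queueFor(key).offer(() -> {
                cachePut(key, dbRead(key)); // read latest value, write it to cache
                refreshPending.remove(key);
            });
        }
        long deadline = System.currentTimeMillis() + timeoutMs;
        while (System.currentTimeMillis() < deadline) {
            value = cacheGet(key);
            if (value != null) return value;
            Thread.sleep(POLL_INTERVAL_MS);
        }
        return dbRead(key); // timed out: fall back to reading the database directly
    }

    // --- hypothetical storage stubs; replace with real cache/DB clients ---
    private Object cacheGet(String key) { return null; }
    private void cachePut(String key, Object value) { }
    private void cacheDelete(String key) { }
    private Object dbRead(String key) { return new Object(); }
    private void dbUpdate(String key, Object value) { }
}
```

Routing with Math.floorMod(key.hashCode(), QUEUE_COUNT) is what guarantees that all operations on one key are executed in order by a single worker thread, and the refreshPending map implements the deduplication optimization.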

In high-concurrency scenarios, this solution has several issues to pay attention to:

1. Read requests are blocked for a long time

Because read requests become very slightly asynchronous, it is important to watch the read-timeout problem: every read request must return within the timeout window.

The biggest risk of this solution is that the data may be updated so frequently that update operations pile up in the queue, causing large numbers of read requests to time out and go straight to the database. Be sure to run realistic tests to see how often the data is actually updated.

In addition, because one queue may hold a backlog of update operations for multiple different data items, you need to test against your own business profile, and you may need to deploy multiple services, each handling a share of the data updates. If a single memory queue squeezes in inventory modifications for 100 products, and each modification takes 10 ms to complete, then the read request for the last product may wait 10 ms * 100 = 1000 ms = 1 s before it gets its data, which means a long-blocked read.

Be sure to run stress tests that simulate the online environment based on how the actual business system operates, to see how many update operations the memory queue may accumulate at the busiest time, and therefore how long the read request corresponding to the last update operation may hang. If read requests must return within 200 ms, then even at the busiest time a backlog of 10 update operations means waiting at most 200 ms; that is acceptable.

If a memory queue is likely to accumulate a heavy backlog of update operations, add machines: the more service instances you deploy, the less data each one handles, and the fewer update operations pile up in each memory queue.

In fact, based on previous project experience, data writes are generally very infrequent, so normally there should be very little update backlog in the queue. For a project like this, with high read concurrency and a read-through cache architecture, write requests are generally very few; a few hundred QPS would be a lot.

Let's do a rough calculation.

Suppose there are 500 write operations per second. Divided into five time slices, that is 100 write operations per 200 ms; spread over 20 memory queues, each queue may backlog about 5 write operations. Performance testing shows each write operation completes in around 20 ms, so the read request for each memory queue's data will hang only briefly, at most around 100 ms, and will certainly return within 200 ms.
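The arithmetic above can be double-checked with a few lines of Java; the figures are the same assumed ones from the text:

```java
public class BacklogEstimate {
    public static void main(String[] args) {
        int writesPerSecond = 500; // assumed write QPS
        int queueCount = 20;       // memory queues per instance
        int msPerWrite = 20;       // measured cost per write op (assumption)

        int writesPer200ms = writesPerSecond / 5;          // 100 writes per 200 ms slice
        int backlogPerQueue = writesPer200ms / queueCount; // 5 ops queued per queue
        int worstWaitMs = backlogPerQueue * msPerWrite;    // 100 ms worst-case read wait

        System.out.printf("backlog=%d ops/queue, worst wait=%d ms%n",
                backlogPerQueue, worstWaitMs);
    }
}
```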

From this quick calculation, we know a single machine can comfortably support a write QPS in the hundreds. If the write QPS grows 10x, scale out the machines 10x, each machine with 20 queues.

2. The concurrency of read requests is too high

Stress tests must also be done here. When the situation above occurs, there is another risk: a sudden burst of read requests, each delayed by tens of milliseconds, hits the service. Whether the service can handle that load, and how many machines are needed to withstand the peak, must be measured.

However, not all of the data is updated at the same moment, and the caches do not all expire at once. At any given time only the caches of a small number of keys may be invalid, so only the read requests for those keys flow in, and the extra concurrency should not be very large.

3. Request routing for multi-service instance deployment

This service may be deployed as multiple instances, so it is important to ensure that requests performing data updates, as well as cache refresh operations, are routed to the same service instance, for example through an Nginx layer.

For example, route all read and write requests for the same product to the same machine. You can do hash routing between services based on some request parameter, or use Nginx's hash routing capability, and so on.
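A sketch of parameter-based hash routing in Java, with a hypothetical instance list; Nginx's upstream hash directive can achieve a similar effect at the gateway layer:

```java
import java.util.List;

// Requests carrying the same productId always route to the same instance,
// so all reads and writes for one product are serialized on one machine.
public class HashRouter {
    private final List<String> instances; // e.g. ["10.0.0.1:8080", "10.0.0.2:8080"]

    public HashRouter(List<String> instances) {
        this.instances = instances;
    }

    public String route(String productId) {
        int idx = Math.floorMod(productId.hashCode(), instances.size());
        return instances.get(idx);
    }
}
```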

4. Hot-product routing leading to skewed request load

If the reads and writes of one product are extremely hot and all land in the same queue on the same machine, that machine may come under too much pressure. That said, the cache is cleared only when the product's data is actually updated, which is what triggers the read-write concurrency, so the impact depends on the business: if the update frequency is not too high, the effect of this skew is not particularly large, though some machines may run at a higher load than others.

At this point, I believe you have a deeper understanding of how to ensure the consistency of double writes between a cache and a database. You might as well try it out in practice.
