How to solve the problem of cache inconsistency with Redis

2025-07-06 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)05/31 Report--

This article explains in detail how to solve the problem of cache inconsistency with Redis: how data inconsistency between the cache and the database arises, and what can be done about it.

How does the data inconsistency between cache and database occur?

First of all, we need to know exactly what "data consistency" means. In fact, the "consistency" here includes two situations:

If there is data in the cache, then the cached data value needs to be the same as the value in the database

If there is no data in the cache itself, the value in the database must be the latest value.

If these two conditions are not met, the cache and the database are inconsistent. However, inconsistencies arise in different ways depending on the cache's read and write mode, and the way we respond differs too. So let's first look at cache inconsistencies mode by mode. Caches can be divided into read-write caches and read-only caches.

For a read-write cache, adding, deleting, or modifying data is done in the cache, and whether the change is also written to the database synchronously depends on the write-back strategy in use.

Synchronous write-through strategy: when you write to the cache, you also write to the database synchronously, so the cache stays consistent with the data in the database.

Asynchronous write-back strategy: when you write to the cache, the database is not written at the same time; the data is written back to the database only when it is evicted from the cache. With this strategy, if the cache fails before the data has been written back, the database does not have the latest data.
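The risk of the asynchronous write-back strategy can be sketched as follows. This is only a toy model under stated assumptions: `WriteBackCache` is a hypothetical class, a plain dict stands in for the database, and `crash` simulates a cache failure.

```python
class WriteBackCache:
    """Toy write-back cache: entries reach the database only on eviction."""

    def __init__(self, db):
        self.db = db          # a dict standing in for the database (assumption)
        self.entries = {}     # data currently held in the cache

    def write(self, key, value):
        # Only the cache is written; the database is not touched here.
        self.entries[key] = value

    def evict(self, key):
        # The value is written back to the database when it leaves the cache.
        self.db[key] = self.entries.pop(key)

    def crash(self):
        # Simulated cache failure: anything not yet written back is lost.
        self.entries.clear()
```

If `crash` happens between `write` and `evict`, the database keeps the older value, which is the inconsistency described above.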

Therefore, for a read-write cache, keeping the cache consistent with the database requires the synchronous write-through strategy. Note, however, that with this strategy both the cache and the database need to be updated. We therefore need to use a transaction mechanism in the business application to make the cache and database updates atomic: either both are updated, or neither is updated, in which case an error message is returned and the update is retried. Otherwise, synchronous write-through cannot be achieved.
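As a rough illustration of "both updated or neither updated", here is a minimal sketch in which the cache write is undone when the database write fails. `FlakyStore` and `write_through` are hypothetical names, and the dict-based stores are stand-ins; a real deployment would need its own transaction or compensation mechanism.

```python
class FlakyStore(dict):
    """A dict standing in for a cache or database; writes can be made to fail."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.fail_writes = False

    def put(self, key, value):
        if self.fail_writes:
            raise IOError("write failed")
        self[key] = value


_MISSING = object()  # marks "key was not in the cache before the write"


def write_through(cache, db, key, value):
    """Write cache and database together; undo the cache write if the DB write fails."""
    old = cache.get(key, _MISSING)
    try:
        cache.put(key, value)      # step 1: write the cache
    except IOError:
        return False               # nothing was changed
    try:
        db.put(key, value)         # step 2: write the database synchronously
    except IOError:
        # Roll back step 1 so that neither store holds the new value.
        if old is _MISSING:
            cache.pop(key, None)
        else:
            cache[key] = old
        return False               # caller reports the error and may retry
    return True
```

Returning `False` leaves it to the caller to report the error and try again, as the paragraph above describes.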

Of course, in some scenarios the requirements for data consistency are not so high, for example when caching non-critical attributes of e-commerce products or the creation or modification time of short videos. In such cases, the asynchronous write-back strategy can be used.

Now let's turn to the read-only cache. With a read-only cache, new data is written directly to the database, and when data is deleted or modified, the corresponding data in the read-only cache must also be marked invalid. When the application later accesses the added, deleted, or modified data, a cache miss occurs because the cache holds no corresponding data. The application then reads the data from the database into the cache, so that later accesses can be served directly from the cache.

Next, we take an application on Tomcat that writes and deletes data in MySQL as an example to explain how adding, deleting, and modifying data are carried out, as shown in the following figure:

You can see from the figure that applications running on Tomcat, whether they are adding (Insert operation), modifying (Update operation), or deleting (Delete operation) data X, will directly add, change or delete data in the database. Of course, if the application performs a modify or delete operation, the cached data X is also deleted.
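The read-only cache flow described above can be sketched as follows. This is a minimal illustration, assuming two in-memory dicts stand in for Redis and MySQL; the function names `read`, `insert`, `update`, and `delete` are hypothetical.

```python
cache = {}            # stands in for Redis (assumption)
db = {"X": 10}        # stands in for MySQL (assumption)


def read(key):
    """Read path: try the cache first; on a miss, load from the database and fill the cache."""
    if key in cache:
        return cache[key]
    value = db.get(key)
    if value is not None:
        cache[key] = value
    return value


def insert(key, value):
    """Insert: write to the database only; the cache is left untouched."""
    db[key] = value


def update(key, value):
    """Update: change the database, then delete the cached copy."""
    db[key] = value
    cache.pop(key, None)


def delete(key):
    """Delete: remove from the database, then delete the cached copy."""
    db.pop(key, None)
    cache.pop(key, None)
```

After `update("X", 3)`, the next `read("X")` misses the cache, loads 3 from the database, and stores it in the cache again.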

So, can data inconsistencies arise in this process? Because adding data is handled differently from deleting and modifying data, let's look at the two cases separately.

New data

When data is added, it is written directly to the database and nothing is done to the cache. At this point, the cache holds no copy of the new data, and the database holds the latest value. This matches the second case of consistency described above, so the cache and the database are consistent.

Delete and modify data

If a delete or modify operation occurs, the application must both update the database and delete the data in the cache. If these two operations cannot be made atomic, that is, if it cannot be guaranteed that either both complete or neither completes, data inconsistencies will arise. This problem is rather complicated, so let's analyze it.

Assume that the application deletes the cache first and then updates the database. If the cache is deleted successfully but the database update fails, then the next time the application accesses the data, there is nothing in the cache and a cache miss occurs. The application then reads from the database, but the value there is still the old one, so the application gets the old value.

Here is an example; see the following figure.

To update the value of data X from 10 to 3, the application first deletes the cached value of X from Redis, but then fails to update the database. If other concurrent requests access X at this time, they get a cache miss in Redis, go on to access the database, and read the old value of 10.

You might ask, can we solve this problem if we update the database first and then delete the values in the cache? Let's analyze it again.

If the application completes the database update first but then fails to delete the cache, the database holds the new value while the cache holds the old value, so the two are inconsistent. If other concurrent requests access the data at this time, they follow the normal cache access process and query the cache first, where they read the old value.

Let me use an example to illustrate.

To update the value of data X from 10 to 3, the application successfully updates the database and then tries to delete the cached value of X in Redis, but this operation fails. Now the value of X in the database is 3 and the cached value of X in Redis is 10, so the two are inconsistent. If another client sends a request to access X at this time, it queries Redis first, gets a cache hit, and reads the old value of 10.

At this point, we can see that when updating the database and deleting the cached value, no matter which of the two operations is performed first, as long as one of them fails, the client will read the old value. The following table summarizes the two situations just mentioned.
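The two failure situations can also be reproduced with a small simulation. This is only a sketch: dicts stand in for Redis and the database, the function names are hypothetical, and the `db_fails` and `cache_delete_fails` flags simulate the failing step.

```python
def delete_cache_then_update_db(cache, db, key, new_value, db_fails=False):
    """Order 1: delete the cache first, then update the database."""
    cache.pop(key, None)                 # step 1 succeeds
    if db_fails:
        return False                     # step 2 fails: the database keeps the old value
    db[key] = new_value
    return True


def update_db_then_delete_cache(cache, db, key, new_value, cache_delete_fails=False):
    """Order 2: update the database first, then delete the cache."""
    db[key] = new_value                  # step 1 succeeds
    if cache_delete_fails:
        return False                     # step 2 fails: the cache keeps the old value
    cache.pop(key, None)
    return True


def read(cache, db, key):
    """Normal read path: cache first, then the database on a miss."""
    if key in cache:
        return cache[key]
    return db[key]
```

Starting from X = 10 in both stores and trying to write 3, a later `read` returns 10 in both failure situations: from the database in the first order, and from the cache in the second.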

We know the cause of the problem, so how to solve it?

How to solve the problem of data inconsistency?

First of all, let me introduce you to a method: the retry mechanism.

Specifically, you can temporarily store the cache value to be deleted or the database value to be updated in a message queue (for example, a Kafka message queue). When the application fails to delete the cached value or update the database value, it can re-read the value from the message queue and then delete or update it again.

If the delete or update succeeds, the value must be removed from the message queue to avoid repeating the operation; at that point the database and the cached data are consistent. Otherwise, we need to retry. If the operation still fails after a certain number of retries, an error message must be sent to the business layer.
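The retry flow described above might look like the sketch below. It is a simplified illustration, not a Kafka client: a `collections.deque` stands in for the message queue, `MAX_RETRIES` is an assumed limit, and `update_with_retry`, `cache_delete`, and `db_update` are hypothetical names supplied by the caller.

```python
from collections import deque

MAX_RETRIES = 3  # assumed retry limit; choose a value that suits the workload


def update_with_retry(cache_delete, db_update, key, value, queue=None):
    """Update the database, then delete the cache key, retrying the delete via a queue."""
    queue = deque() if queue is None else queue
    db_update(key, value)
    queue.append(key)                      # record the pending cache deletion
    attempts = 0
    while queue:
        pending = queue[0]                 # re-read the pending key from the queue
        attempts += 1
        try:
            cache_delete(pending)
        except ConnectionError:
            if attempts >= MAX_RETRIES:
                # Too many failures: report the error to the business layer.
                raise RuntimeError(f"could not delete cache key {pending!r}")
            continue                       # leave the message queued and try again
        queue.popleft()                    # success: remove the message to avoid repeats
    return attempts
```

The function returns the number of attempts the cache deletion needed, and raises `RuntimeError` once the assumed retry limit is reached.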

The following figure shows the case where the database is updated first and the cache value is deleted afterwards: the cache deletion fails at first and then succeeds after a retry.

So far, we have covered the case where one of the two operations, updating the database or deleting the cached value, fails. In fact, even if neither operation fails, the application may still read inconsistent data when there are a large number of concurrent requests.

Again, we distinguish two cases according to the order of deletion and update. The solution is different in each case.

Case 1: delete the cache before updating the database.

Suppose thread A has deleted the cache value but, before it updates the database (for example, because of network delay), thread B starts to read the data. Thread B finds a cache miss and can only read from the database. This leads to two problems:

Thread B reads the old value.

Thread B reads the database in the absence of a cache, so it also writes the old value to the cache, which may cause other threads to read the old value from the cache.

Thread A does not update the database until thread B has read the data from the database and updated the cache. At this point, the data in the cache is the old value, while the value in the database is the latest value, the two are inconsistent.
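The interleaving just described can be replayed step by step. The sketch below runs the steps of threads A and B in the problematic order within a single thread, with dicts standing in for Redis and the database and the values 10 and 3 taken from the earlier example.

```python
cache = {"X": 10}     # stands in for Redis (assumption)
db = {"X": 10}        # stands in for the database (assumption)

# Thread A, step 1: delete the cached value of X.
cache.pop("X", None)

# Thread B runs before thread A has updated the database.
b_value = cache.get("X")          # cache miss, so this is None
if b_value is None:
    b_value = db["X"]             # reads the old value 10 from the database
    cache["X"] = b_value          # and writes the old value back to the cache

# Thread A, step 2: update the database (delayed, for example by the network).
db["X"] = 3

# A later reader hits the cache and still sees the old value.
later_value = cache.get("X", db["X"])
```

At the end, the database holds 3 while the cache, and therefore `later_value`, still holds 10.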

I'll use a table to summarize the situation.

What should I do about this? Let me offer you a solution.

After thread A updates the database value, we can have it sleep for a short period of time and then perform a cache delete operation.

The sleep gives thread B time to read the data from the database and write the missing value to the cache first, so that thread A's subsequent delete removes it. Therefore, thread A's sleep time needs to be longer than the time thread B takes to read the data and write it to the cache. How can you determine this time? It is recommended that you measure how long threads take to read data and write the cache while the business program is running, and estimate the sleep time on that basis.
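One possible way to collect such measurements is sketched below. The heuristic of "slowest observed fill plus a fixed margin" is an assumption made for illustration, not something prescribed here, and `timed_cache_fill`, `estimate_sleep_seconds`, and `samples` are hypothetical names.

```python
import time

samples = []  # observed durations, in seconds, of "read database + write cache"


def timed_cache_fill(load_from_db, write_to_cache, key):
    """Run the cache-miss path and record how long the read-then-write took."""
    start = time.perf_counter()
    value = load_from_db(key)
    write_to_cache(key, value)
    samples.append(time.perf_counter() - start)
    return value


def estimate_sleep_seconds(durations, margin=0.1):
    """Assumed heuristic: slowest observed fill plus a fixed safety margin."""
    return max(durations) + margin
```

For example, with observed durations of 0.02, 0.05, and 0.03 seconds and the default margin, the estimate is about 0.15 seconds.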

In this way, when other threads read the data, they find a cache miss and read the latest value from the database. Because this scheme deletes the cache value a second time, after a delay following the first deletion, it is also called "delayed double deletion".

The pseudo code below is an example of the "delayed double deletion" scheme.

redis.delKey(X)
db.update(X)
Thread.sleep(N)
redis.delKey(X)
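The same sequence can be written as a small runnable Python sketch. Here dicts stand in for Redis and the database, `delayed_double_delete` is a hypothetical name, and the `sleep` parameter is an added assumption that lets the delay be replaced in tests.

```python
import time


def delayed_double_delete(cache, db, key, new_value, delay_seconds, sleep=time.sleep):
    """Delete the cache, update the database, wait, then delete the cache again."""
    cache.pop(key, None)          # first delete
    db[key] = new_value           # update the database
    sleep(delay_seconds)          # give concurrent readers time to finish their cache fill
    cache.pop(key, None)          # second delete removes any stale value written meanwhile
```

If a concurrent reader writes the old value back into the cache during the delay, the second delete removes it, so the next read misses the cache and loads the new value from the database.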

Case 2: update the database value first, and then delete the cache value.

If thread A updates the value in the database, but thread B starts to read the data before thread A has deleted the cache value, then thread B queries the cache, gets a cache hit, and reads the old value directly from the cache. In this case, however, if few other threads are reading the cache concurrently, few requests will read the old value. Also, thread A usually deletes the cache value very quickly, so when other threads read again, they get a cache miss and read the latest value from the database. Therefore, this situation has less impact on the business.
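This second interleaving can be replayed in the same way. The sketch below runs the steps of threads A, B, and a later reader C in order within a single thread; the dicts and the values 10 and 3 are illustrative assumptions.

```python
cache = {"X": 10}     # stands in for Redis (assumption)
db = {"X": 10}        # stands in for the database (assumption)

# Thread A, step 1: update the database.
db["X"] = 3

# Thread B reads before thread A has deleted the cache: cache hit, old value.
b_value = cache["X"] if "X" in cache else db["X"]

# Thread A, step 2: delete the cached value.
cache.pop("X", None)

# Thread C reads afterwards: cache miss, so it loads the new value and fills the cache.
if "X" in cache:
    c_value = cache["X"]
else:
    c_value = db["X"]
    cache["X"] = c_value
```

Only the read that falls between thread A's two steps sees the old value; once the cache entry is deleted, readers get the new value.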

The following table summarizes the case of updating the database first and then deleting the cache value.

At this point, we have seen that data inconsistencies between the cache and the database generally have two causes, each with a corresponding solution.

The data is inconsistent because deleting the cache value or updating the database failed. You can use the retry mechanism to make sure the delete or update operation eventually succeeds.

Between the two steps of deleting the cache value and updating the database, other threads perform concurrent reads and therefore read old values. The solution is delayed double deletion.

