
What is the double-write pattern?

2025-03-29 Update From: SLTechnology News&Howtos


Shulou(Shulou.com)06/03 Report--

This article explains the double-write pattern: keeping the same data in both a database and a cache. The approach described here is simple, fast, and practical, so let's walk through it.

What is double write?

Let's get straight to the point. Double write means a piece of data is stored both in the database and in the cache. The cache entry is given an expiration time, and when a read misses the cache, the value is read from the database and written back to the cache.
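As a sketch, the read path just described (cache-aside with a TTL) might look like the following. The dicts standing in for the cache and the database, and all names here, are illustrative, since the article does not name concrete stores:

```python
import time

# Minimal in-memory stand-ins for the cache and the database;
# in production these would typically be something like Redis and MySQL.
db = {"user:1": "alice"}
cache = {}          # key -> (value, expires_at)
TTL_SECONDS = 60

def read(key):
    """Cache-aside read: try the cache first, fall back to the database."""
    entry = cache.get(key)
    if entry is not None:
        value, expires_at = entry
        if time.time() < expires_at:
            return value        # cache hit
        del cache[key]          # entry expired, treat as a miss
    value = db.get(key)         # cache miss: read from the database
    if value is not None:
        # backfill the cache with an expiration time
        cache[key] = (value, time.time() + TTL_SECONDS)
    return value
```

With a real cache the backfill would be a single call such as Redis `SET key value EX 60`, but the control flow is the same.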

Why do we need double write?

As the request volume grows, the system gradually hits a bottleneck: database connections are limited and cannot sustain a high QPS. We need a way to take pressure off the database, so we double write: write the data into the cache as well, and let clients read directly from the cache. This improves the system's performance.

However, with double write there is always a window between updating the cache and updating MySQL (in either order), so make sure your business can tolerate temporary data inconsistency. If it cannot, double write is not recommended.

Someone might ask: can double write ever guarantee strong consistency?

Yes, it can: serialize all related read and write requests through a queue, and the two writes become strongly consistent. But this drastically reduces the system's QPS, so it is not recommended.
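To illustrate the serialization idea (not a production design), here is a minimal sketch where a single worker thread applies every read and write in queue order, so the database and the cache can never be updated out of order. All names are made up for the example:

```python
import queue
import threading

db, cache = {}, {}
ops = queue.Queue()   # every read and write goes through this one queue

def worker():
    while True:
        op = ops.get()
        if op is None:          # shutdown sentinel
            break
        kind, key, value, reply = op
        if kind == "write":
            db[key] = value     # database and cache are updated together,
            cache[key] = value  # and always in arrival order
        else:                   # read
            reply.append(cache.get(key, db.get(key)))
        ops.task_done()

threading.Thread(target=worker, daemon=True).start()

def write(key, value):
    ops.put(("write", key, value, None))

def read(key):
    reply = []
    ops.put(("read", key, None, reply))
    ops.join()                  # block until the queue is drained
    return reply[0]
```

Funneling everything through one worker is exactly what caps the QPS, which is why the article advises against this approach.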

Since we do need double write, inconsistency between the database and the cache is bound to appear. How do we avoid it?

How do we mitigate double-write inconsistency?

1. Update the database, then update the cache

What's wrong with this scheme? Let's walk through the following sequence:

First, thread a updates the database following the normal flow and is about to update the cache, but it gets stuck for some business reason. Meanwhile thread b comes in, updates the database, and updates the cache first. Then thread a resumes and updates the cache, overwriting b's value. Any thread that now reads the data gets a's value, although by business order it should see b's. The data is now wrong.

1. Thread a updates the database
2. Thread b updates the database
3. Thread b updates the cache
4. Thread a updates the cache
5. Other threads read the data and get a's stale value (misread)
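The interleaving above can be replayed deterministically with plain dicts standing in for the database and the cache (purely illustrative):

```python
# Replay of the five steps above with in-memory stand-ins.
db, cache = {}, {}

db["key"] = "a"           # 1. thread a updates the database
db["key"] = "b"           # 2. thread b updates the database
cache["key"] = "b"        # 3. thread b updates the cache
cache["key"] = "a"        # 4. thread a finally updates the cache
value_seen = cache["key"] # 5. other threads read the cache and get "a"
# The database now holds "b" while the cache serves "a": data confusion.
```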

At this point we can see that updating the cache directly is a big problem. In many complex caching scenarios, the cached value is not just a row pulled straight from the database; it may be computed by combining lots of other data.

There is another issue: we often update the cache right after updating the database, but if nothing reads that cache entry before the next update, the work is wasted and we pay the cost for nothing.

If you know the singleton pattern, you know the idea of lazy loading: load only when needed. That idea fits double write well, which leads to the next scheme: update the database first, then delete the cache.

2. Update the database, then delete the cache

What's wrong with this situation?

Of course, this scheme still has a problem. Follow the sequence:

1. Thread a updates the database
2. The program fails before it can delete the cache
3. Other threads read the data (all stale)

The problem with this scheme is clear at a glance: as soon as the program fails between the two steps, reads go wrong. In real business, readers should see the value thread a wrote, but they keep getting the previous cached value.
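Here is a sketch of this scheme's write path with the failure injected explicitly. The dicts again stand in for the real stores, and the flag name is made up for the demo:

```python
db = {"key": "old"}
cache = {"key": "old"}

def write(key, value, crash_before_delete=False):
    db[key] = value                 # 1. update the database
    if crash_before_delete:
        return                      # 2. the program fails here...
    cache.pop(key, None)            # ...so this delete never happens

write("key", "new", crash_before_delete=True)
# The database now holds "new", but readers still get "old" from the cache.
```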

Can this scheme be improved?

Of course. We can write a log entry before each write and update it after the modification completes, then use the log status to judge whether the write (and the cache delete) succeeded.

If the log shows the operation did not complete and no newer write request has arrived, retry the write; otherwise, do nothing.

However, inconsistency can still occur here: if the process writing the database crashes, the data stays inconsistent until the next time it is recovered.

And with frequent writes, the logging mechanism may not keep up: new writes can overwrite the log records, and the logging system itself consumes extra resources.

I get it! Then we should delete the cache first and then update the database, right?

3. Delete the cache, then update the database

Let's walk through the sequence again. Look familiar? Does this scheme have problems too? Of course; read on:

1. Thread a deletes the cache
2. Thread b deletes the cache
3. Thread a gets stuck
4. Thread b updates the database
5. Thread a updates the database
6. Other threads read the data and get a's value (wrong again)

So this scheme also fails: whether thread a recovers or not, an accident is always waiting to happen.

Here the data is briefly wrong, but the next update will make it correct again.
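Replaying this scheme's interleaving the same way (illustrative stand-ins again):

```python
db, cache = {"key": "old"}, {"key": "old"}

cache.pop("key", None)    # 1. thread a deletes the cache
cache.pop("key", None)    # 2. thread b deletes the cache
                          # 3. thread a is stuck
db["key"] = "b"           # 4. thread b updates the database
db["key"] = "a"           # 5. thread a wakes up and updates the database
# 6. a reader misses the cache, reads from the database, and backfills "a",
#    although by business order "b" was the later update
cache["key"] = db["key"]
```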

So is the ultimate solution to delete the cache first, then update the database, and then delete the cache again?

4. Delete the cache, then update the database, then delete the cache again

Continue with the sequence:

1. Thread a deletes the cache
2. Other threads read the data and get the value from before a's update
3. Thread a updates the database
4. Thread a deletes the cache again
5. Other threads backfill the cache with the pre-update value they read (it should be a's new value)

Once again the design has a problem, and the data cannot be corrected until the next update.

Finally, let's look at the delayed double delete scheme we so often hear about.

5. Delayed double delete

One last sequence:

1. Delete the cache first
2. Write the database
3. Sleep for a while (the duration depends on the business)
4. Delete the cache again
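The four steps above can be sketched as follows. The names and the delay value are illustrative; as the article stresses, the delay is business-specific:

```python
import time

db, cache = {"key": "old"}, {"key": "old"}
DELAY_SECONDS = 0.05   # placeholder; tune to how long concurrent reads take

def delayed_double_delete(key, value):
    cache.pop(key, None)        # 1. delete the cache first
    db[key] = value             # 2. write the database
    time.sleep(DELAY_SECONDS)   # 3. sleep so in-flight reads can finish
                                #    backfilling their (possibly stale) values
    cache.pop(key, None)        # 4. delete the cache again

delayed_double_delete("key", "new")
```

In practice the second delete is often dispatched asynchronously (for example as a delayed task) so the write request itself is not blocked by the sleep.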

The added delay ensures that any in-flight cache backfill from another transaction completes before the cache is cleared the second time: modify the database, wait, then clear the cache.

All writes go to the database, and once the cache entry expires, subsequent reads naturally fetch the new value from the database and backfill the cache.

However, some stale cached data will inevitably still be read: the delay length is chosen by the business itself, and under high concurrency a delay that is too long or too short will let dirty reads through.

The worst case is that the data stays inconsistent for the length of the delay window.

At this point, you should have a deeper understanding of double write. Try these schemes out in practice, and keep learning!
