This article looks at a common question: in high-concurrency scenarios, should you update the cache or update the database first? The approaches introduced here are simple and practical.
In large systems, a cache is usually introduced to reduce the pressure on the database. Once a cache is introduced, however, the cache and the database can easily become inconsistent, and users end up seeing stale data.
To reduce these inconsistencies, the mechanism used to update the cache and the database becomes particularly important.
Cache aside
Cache aside, also known as bypass caching, is a commonly used caching strategy.
(1) Typical flow of a read request
[Figure: Cache-Aside read request]
The application first checks whether the data is in the cache. On a cache hit, it returns the data directly. On a cache miss, the request falls through to the database: the application queries the database, writes the result back to the cache, and finally returns the data to the client.
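A minimal sketch of this read path, with plain Python dictionaries standing in for the real cache (e.g. Redis) and the database; the names cache, db, and read_user_age are illustrative, not from the original article:

```python
# Dictionaries stand in for a real cache (e.g. Redis) and a real database.
cache = {}
db = {"user:1:age": 18}

def read_user_age(key):
    # 1. Check the cache first.
    value = cache.get(key)
    if value is not None:
        return value                 # cache hit: return directly
    # 2. Cache miss: fall through to the database.
    value = db.get(key)
    # 3. Write the result back to the cache for subsequent reads.
    if value is not None:
        cache[key] = value
    # 4. Return the data to the caller.
    return value
```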
(2) Typical flow of a write request
[Figure: Cache-Aside write request]
First update the database, and then delete the data from the cache.
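A matching sketch of the write path, using the same illustrative db and cache as above: update the database first, then delete the cached entry rather than rewriting it.

```python
def update_user_age(key, new_age):
    # 1. Update the database first.
    db[key] = new_age
    # 2. Then delete (rather than update) the cached copy;
    #    the next read will repopulate it from the database.
    cache.pop(key, None)
```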
Looking at the write-request diagram, some readers may ask: why delete the cache instead of updating it directly? There are several pitfalls here; let's walk through them one by one.
Cache-Aside pitfalls
Used incorrectly, the Cache-Aside strategy can lead to serious problems. Let's look at the common pitfalls one by one.
Pitfall 1: update the database first, then update the cache
If two write requests update the same data concurrently, and each one updates the database first and then the cache, the data can become inconsistent.
[Figure: update the database first, then update the cache]
As shown in the figure above, the sequence is:
(1) Write request 1 updates the database, setting the age field to 18.
(2) Write request 2 updates the database, setting the age field to 20.
(3) Write request 2 updates the cache, setting the cached age to 20.
(4) Write request 1 updates the cache, setting the cached age to 18.
The expected result is a database age of 20 and a cached age of 20, but the cache ends up holding 18: the cached data is no longer the latest, i.e. dirty data.
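The interleaving can be replayed sequentially as a tiny, illustrative sketch (plain dictionaries stand in for the database and the cache):

```python
db, cache = {}, {}

db["age"] = 18       # (1) write request 1 updates the database
db["age"] = 20       # (2) write request 2 updates the database
cache["age"] = 20    # (3) write request 2 updates the cache
cache["age"] = 18    # (4) write request 1 updates the cache last, with the older value

assert db["age"] == 20 and cache["age"] == 18   # cache and database now disagree
```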
Pitfall 2: delete the cache first, then update the database
If a write request first deletes the cache and then updates the database, a concurrent read request can again leave the cache and the database inconsistent.
[Figure: delete the cache first, then update the database]
As shown in the figure above, the sequence is:
(1) The write request deletes the cached data.
(2) A read request misses the cache, queries the database (reading the old age of 18), and writes the result back to the cache.
(3) The write request then updates the database, setting the age to 20.
At the end of this sequence, the database age is 20 while the cached age is 18: the cache and the database are inconsistent, and the cache holds dirty data.
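The same kind of illustrative sketch, replaying this interleaving step by step:

```python
db, cache = {"age": 18}, {"age": 18}

cache.pop("age", None)       # (1) write request deletes the cached value
value = cache.get("age")     # (2) read request: cache miss...
if value is None:
    value = db["age"]        #     ...reads the old value (18) from the database
    cache["age"] = value     #     ...and writes it back to the cache
db["age"] = 20               # (3) write request finally updates the database

assert db["age"] == 20 and cache["age"] == 18   # stale value stuck in the cache
```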
Pitfall 3: update the database first, then delete the cache
In practice, the recommended approach for write requests is to update the database first and then delete the cache. Even so, problems are still possible in theory, as the following example shows.
[Figure: update the database first, then delete the cache]
As shown in the figure above, the sequence is:
(1) A read request checks the cache first, misses, and queries the database, which returns the old age of 18.
(2) A write request updates the database (setting the age to 20) and then deletes the cache entry.
(3) The read request finally writes the value it read (18) back to the cache.
At the end of this sequence, the database age is 20 while the cached age is 18: the database and the cache are inconsistent, and subsequent reads from the cache return stale data.
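The same style of sketch for this interleaving:

```python
db, cache = {"age": 18}, {}

value = db["age"]        # (1) read request misses the cache and reads 18 from the database
db["age"] = 20           # (2) write request updates the database...
cache.pop("age", None)   #     ...and deletes the (currently empty) cache entry
cache["age"] = value     # (3) the slow read request only now writes 18 back to the cache

assert db["age"] == 20 and cache["age"] == 18   # stale data until the entry expires
```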
On reflection, though, the probability of this scenario is very low. A database update usually takes orders of magnitude longer than a memory operation, and the last step in the figure, writing 18 back to the cache, is very fast; it normally completes before the database update does, in which case the subsequent cache deletion removes the stale value anyway.
What if this extreme case does occur? The usual mitigation is to set an expiration time on cached data; most systems can tolerate a small amount of inconsistency for a short period.
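A minimal sketch of that mitigation, storing an expiry timestamp alongside each cached value (again using a dictionary as a stand-in cache; the helper names are illustrative):

```python
import time

cache = {}  # key -> (value, expires_at)

def cache_set(key, value, ttl_seconds=60):
    # Store the value together with an absolute expiry time.
    cache[key] = (value, time.time() + ttl_seconds)

def cache_get(key):
    entry = cache.get(key)
    if entry is None:
        return None
    value, expires_at = entry
    if time.time() > expires_at:
        # Expired: drop the stale entry so the next read goes back to the database.
        del cache[key]
        return None
    return value
```

With an expiration time in place, even if one of the races above leaves a stale value behind, it can only be served for a bounded period.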
Read through
With the Cache-Aside pattern, the application code has to manage two data sources: the cache and the database. Under the Read-Through strategy, the application no longer manages them itself; it delegates synchronization with the database to the cache provider (Cache Provider), and all data access goes through this abstract cache layer.
[Figure: Read-Through process]
As shown in the figure above, the application only interacts with the Cache Provider and does not care whether data comes from the cache or the database.
For read-heavy workloads, Read-Through reduces the load on the data source and offers some resilience to cache failures: if the cache service fails, the cache provider can still go directly to the data source.
Read-Through suits scenarios where the same data is requested many times. It is very similar to the Cache-Aside strategy, but there are differences worth emphasizing:
In Cache-Aside, the application is responsible for fetching data from the data source and writing it to the cache.
In Read-Through, this logic is usually handled by a standalone cache provider (Cache Provider).
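A minimal sketch of that difference, assuming a hypothetical ReadThroughCache provider that owns the loading logic, so the application never touches the database directly:

```python
class ReadThroughCache:
    """Illustrative cache provider: the application only calls get();
    the provider decides when to fall back to the data source."""

    def __init__(self, loader):
        self._store = {}
        self._loader = loader   # function that reads from the database

    def get(self, key):
        if key not in self._store:
            # Cache miss: the provider, not the application,
            # loads from the data source and fills the cache.
            self._store[key] = self._loader(key)
        return self._store[key]

# Usage: the application only talks to the provider.
db = {"user:1:age": 18}
provider = ReadThroughCache(loader=db.get)
print(provider.get("user:1:age"))   # loaded through the provider on first access
```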
Write through
Under the Write-Through strategy, the cache provider (Cache Provider) is responsible for updating both the underlying data source and the cache whenever a write occurs.
The cache stays consistent with the data source, and writes always reach the data source through the abstract cache layer.
The Cache Provider plays a role much like that of a proxy.
[Figure: Write-Through process]
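A minimal sketch of a hypothetical Write-Through provider: every write goes through the provider, which updates the data source and the cache together.

```python
class WriteThroughCache:
    """Illustrative cache provider: writes update the data source
    and the cache synchronously, keeping the two in step."""

    def __init__(self, db):
        self._store = {}
        self._db = db             # stand-in for the underlying data source

    def put(self, key, value):
        self._db[key] = value     # 1. write to the data source
        self._store[key] = value  # 2. keep the cache consistent with it

    def get(self, key):
        # Serve from the cache, falling back to the data source if needed.
        return self._store.get(key, self._db.get(key))
```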
Write behind
Write behind, also called Write back in some places, means that when the application updates data, it only updates the cache; the Cache Provider flushes the data to the database asynchronously, at intervals. Put simply, it is delayed writing.
[Figure: Write-Behind process]
As shown in the figure above, the application updates two pieces of data; the Cache Provider writes them to the cache immediately, but only writes them to the database in a batch some time later.
This approach has both advantages and disadvantages:
The advantage is that writes are very fast, which suits write-heavy scenarios.
The disadvantage is that the cache and the database are not strongly consistent, so systems with strict consistency requirements should use this approach with caution.
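A minimal sketch of a hypothetical Write-Behind provider: writes only touch the in-memory cache, and a background thread flushes dirty keys to the database in batches.

```python
import threading
import time

class WriteBehindCache:
    """Illustrative cache provider: writes are memory-only;
    a background thread periodically flushes dirty keys to the database."""

    def __init__(self, db, flush_interval=1.0):
        self._store = {}
        self._dirty = set()
        self._db = db                       # stand-in for the real database
        self._lock = threading.Lock()
        worker = threading.Thread(
            target=self._flush_loop, args=(flush_interval,), daemon=True)
        worker.start()

    def put(self, key, value):
        with self._lock:
            self._store[key] = value        # fast: cache only
            self._dirty.add(key)            # remember to persist later

    def _flush_loop(self, interval):
        while True:
            time.sleep(interval)
            with self._lock:
                for key in self._dirty:     # batch-write dirty keys to the database
                    self._db[key] = self._store[key]
                self._dirty.clear()
```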
Summary
Having gone through all of the above, you should have a clear picture of cache update strategies. To summarize briefly:
There are three main strategies for updating the cache:
Cache aside
Read/Write through
Write behind
Cache-Aside usually updates the database first and then deletes the cache, and an expiration time is usually set on cached data as a safety net against inconsistency.
With Read/Write-Through, read and write operations are generally handled by a Cache Provider; the application does not need to know whether it is operating on the cache or the database.
Write-Behind can be understood simply as delayed writes: the Cache Provider writes to the database in batches at intervals. The advantage is that application writes are very fast.
At this point, you should have a deeper understanding of whether to update the cache or the database first in high-concurrency scenarios. The best way to consolidate it is to try these strategies out in practice.