
What are the common Redis interview questions?


This article introduces common Redis interview questions: cache avalanche, cache penetration, and cache-database double-write consistency. Many people run into these situations in real systems, so let the editor walk you through how to deal with each of them. I hope you read carefully and come away with something useful!

1. Cache avalanche

1.1 What is a cache avalanche?

Recall why we use Redis in the first place: to cache data so that requests do not all hit the database.

Now there is a problem: if our cache goes down, all requests go straight to the database.

We all know Redis cannot cache all of our data (memory is expensive and limited), so Redis sets an expiration time on data and removes expired keys with two strategies: lazy deletion plus periodic deletion.

If the cached data is all given the same expiration time, and Redis happens to sweep it all away at once, these cache entries expire at the same moment and every request goes to the database.

This is the cache avalanche, in two flavors:

Redis goes down, so all requests go to the database.

The cached data is given the same expiration time, so the cache expires en masse within a short window and all requests go to the database.

If a cache avalanche happens, it can easily bring down our database and paralyze the entire service!

1.2 How to solve a cache avalanche?

For "set the same expiration time for the cached data, causing the cache to expire within a certain period of time, and all requests go to the database." This situation is very easy to solve:

Solution: add a random value to the expiration time when caching, which will greatly reduce the cache expiration at the same time.
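
A minimal sketch of this in Python with the redis-py client; the base TTL and the jitter range are arbitrary illustrative choices:

```python
import random

import redis  # redis-py client: pip install redis

r = redis.Redis(host="localhost", port=6379, decode_responses=True)

BASE_TTL = 3600    # hypothetical base expiration of one hour
JITTER_MAX = 300   # up to five extra minutes, chosen arbitrarily

def cache_set_with_jitter(key: str, value: str) -> None:
    # Spread expirations out so keys written together do not all
    # expire in the same instant.
    ttl = BASE_TTL + random.randint(0, JITTER_MAX)
    r.set(key, value, ex=ttl)
```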

For the first case, where Redis itself goes down and all requests hit the database, we can think along three lines:

Before the incident: make Redis highly available (master-slave replication plus Sentinel, or Redis Cluster) so that a total Redis outage becomes unlikely in the first place.

During the incident: if Redis really does fail, fall back on a local cache (e.g. Ehcache) plus rate limiting (e.g. Hystrix) to keep the database from being overwhelmed, so the service keeps working even if degraded; see the sketch after this list.

After the incident: rely on Redis persistence, so that after a restart Redis automatically reloads data from disk and the cache recovers quickly.
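
A rough Python sketch of the "during the incident" idea; the in-process dict stands in for a local cache like Ehcache, the semaphore stands in for a rate limiter like Hystrix, and query_db is a hypothetical database lookup:

```python
import threading

import redis

r = redis.Redis(host="localhost", port=6379, decode_responses=True)

local_cache: dict[str, str] = {}      # stand-in for a local cache such as Ehcache
db_limiter = threading.Semaphore(50)  # at most 50 concurrent DB queries (stand-in for Hystrix)

def get_user(user_id: str) -> str | None:
    try:
        value = r.get(f"user:{user_id}")
        if value is not None:
            return value
    except redis.ConnectionError:
        # Redis is down: serve from the local cache if we can.
        if user_id in local_cache:
            return local_cache[user_id]
    # Cache miss (or Redis down): cap the traffic allowed through to the DB.
    if not db_limiter.acquire(blocking=False):
        return None  # shed this request rather than kill the database
    try:
        value = query_db(user_id)     # hypothetical database lookup
        if value is not None:
            local_cache[user_id] = value
        return value
    finally:
        db_limiter.release()
```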

2. Cache penetration

2.1 What is cache penetration?

For example, suppose we have a database table whose IDs start from 1 (always positive).

But a hacker who wants to bring down my database can request a negative ID every time. This makes my cache useless: all those requests go to the database, the database has no such value, and an empty result comes back every time.

Cache penetration means querying data that is guaranteed not to exist. The cache never gets a hit, and because (for fault tolerance) a value that cannot be found in the database is not written back to the cache, every query for the nonexistent data hits the database, so the cache loses its meaning.

This is cache penetration:

A large number of requests ask for data that is never in the cache, so every request goes to the database.

If cache penetration occurs, it may also bring down our database and paralyze the entire service!

2.2 How to solve cache penetration?

There are two common solutions to cache penetration:

Since the requested parameter is illegal (a nonexistent parameter is requested every time), we can use a Bloom filter (BloomFilter) or a similar compact filter to intercept it in advance: if the parameter cannot possibly exist, the request is never allowed to reach the database layer!

When the database lookup also comes back empty, write that empty object into the cache anyway; the next request for the same key is then answered from the cache.

In this case, we usually give the empty object a short expiration time. Both ideas are sketched below.
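
Both ideas sketched in Python; the plain set of IDs is only a stand-in for a real Bloom filter, and load_all_ids_from_db and query_db are hypothetical helpers:

```python
import redis

r = redis.Redis(host="localhost", port=6379, decode_responses=True)

# Stand-in for a Bloom filter: the set of all valid IDs, loaded at startup.
# A real Bloom filter answers "definitely absent" in far less memory.
valid_ids = load_all_ids_from_db()        # hypothetical helper

NULL_PLACEHOLDER = "__null__"             # marks "the DB has no such row"

def get_article(article_id: str) -> str | None:
    # 1. Intercept impossible IDs before they touch the cache or the DB.
    if article_id not in valid_ids:
        return None
    cached = r.get(f"article:{article_id}")
    if cached is not None:
        return None if cached == NULL_PLACEHOLDER else cached
    value = query_db(article_id)          # hypothetical database lookup
    if value is None:
        # 2. Cache the empty result briefly so repeated misses skip the DB.
        r.set(f"article:{article_id}", NULL_PLACEHOLDER, ex=60)
        return None
    r.set(f"article:{article_id}", value, ex=3600)
    return value
```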

Reference:

Cache series of articles, part 5: the cache penetration problem
https://carlosfu.iteye.com/blog/2248185

3. Double-write consistency between cache and database

3.1 For read operations, the process looks like this

As mentioned above when discussing cache penetration: if the data cannot be found in the database, nothing is written to the cache.

Generally speaking, we have a fixed routine for read operations:

If the data is in the cache, read it from the cache directly.

If the data we want is not in the cache, query the database first, then write what the database returned into the cache.

Finally, return the data to the caller.
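
The same routine in Python (query_db is again a hypothetical database lookup):

```python
import redis

r = redis.Redis(host="localhost", port=6379, decode_responses=True)

def read(key: str) -> str | None:
    value = r.get(key)
    if value is not None:           # hit: take the cache directly
        return value
    value = query_db(key)           # miss: query the database first
    if value is not None:
        r.set(key, value, ex=3600)  # then write the DB result into the cache
    return value                    # finally, return the data to the caller
```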

3.2 What is the cache-database double-write consistency problem?

If we only ever read, the cached data and the database data stay consistent. But what happens when we update? Many interleavings can leave the database and the cache holding different data.

Inconsistency here means exactly that: the data in the database differs from the data in the cache.

In theory, as long as we set an expiration time on the key, the cache and the database are eventually consistent: once the cached data expires it is deleted, the next read misses the cache, fetches from the database, and writes the fresh value back into the cache.

Beyond setting an expiration time, we need to do more to keep the database and the cache from diverging as much as possible.

3.3 For update operations

In general, when performing an update we have two choices of order:

Update the database first, then operate on the cache.

Operate on the cache first, then update the database.

First, be clear: whichever order we choose, we want both operations to succeed or fail together, which makes this a distributed-transaction problem.

So, if atomicity is broken, we may see:

The database operation succeeds, but the cache operation fails.

The cache operation succeeds, but the database operation fails.

If the first step fails, we can simply throw an exception and return; the second step never runs at all.

Let's analyze it in detail.

3.3.1 There are also two ways to operate on the cache:

Update the cache

Delete the cache

Generally speaking, we adopt the delete-the-cache strategy, for the following reasons:


In a high-concurrency environment, whether we touch the database first or the cache first, updating the cache makes inconsistency between the database and the cache more likely. (Deleting the cache is much more straightforward.)

If we updated the cache on every database write (think of data that is written frequently), we would pay for cache writes that might never be read. It is better to just delete: on the next read the cache misses, we look the value up in the database and write it back into the cache (lazy loading).

Based on these two points, deleting the cache on update is the recommended choice!

3.3.2 Update the database first, then delete the cache. The normal flow:

Update the database; it succeeds.

Then delete the cache; it also succeeds.
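
The write path in this order, sketched in Python (update_db is a hypothetical database write):

```python
import redis

r = redis.Redis(host="localhost", port=6379, decode_responses=True)

def write(key: str, value: str) -> None:
    update_db(key, value)  # step 1: update the database (hypothetical helper)
    r.delete(key)          # step 2: delete the cache; the next read repopulates it
```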

If atomicity is destroyed:

If the first step (updating the database) succeeds and the second step (deleting the cache) fails, the database holds new data while the cache holds old data.

If the first step (updating the database) fails, we can directly return an error (throw an exception) and no inconsistency arises.

Under high concurrency, the probability of the database and the cache diverging with this order is particularly low, but it is not zero:

The cache entry happens to have just expired.

Thread A queries the database and gets an old value.

Thread B writes a new value to the database.

Thread B deletes the cache.

Thread A writes the old value it read into the cache.

For this interleaving to happen, the odds are very low:

It requires the cache entry to expire exactly while a concurrent write is in flight. In practice, a database write is much slower than a read and takes locks, so for the read to enter the database before the write and yet update the cache after the write finishes, all these conditions must line up, and the combined probability is tiny.

This strategy is in fact a named design pattern: the Cache Aside Pattern (update the database first, then delete the cache).

If deleting the cache fails, one solution is:

Send the key to be deleted to a message queue.

A consumer reads the message and obtains the key to delete.

Keep retrying the delete operation until it succeeds.
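
A minimal sketch of that retry loop in Python, using an in-process queue.Queue as a stand-in for a real message queue such as Kafka or RabbitMQ:

```python
import queue

import redis

r = redis.Redis(host="localhost", port=6379, decode_responses=True)
retry_queue: "queue.Queue[str]" = queue.Queue()

def delete_cache(key: str) -> None:
    try:
        r.delete(key)
    except redis.RedisError:
        retry_queue.put(key)       # send the key that still needs deleting to the queue

def retry_worker() -> None:
    while True:
        key = retry_queue.get()    # consume a message and get the key to delete
        try:
            r.delete(key)          # keep retrying until the delete succeeds
        except redis.RedisError:
            retry_queue.put(key)   # re-enqueue and try again later
```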

3.3.3 Delete the cache first, then update the database. The normal flow:

Delete the cache; it succeeds.

Then update the database; it also succeeds.

If atomicity is destroyed:

If the first step (deleting the cache) succeeds and the second step (updating the database) fails, the database and the cache still hold the same data.

If the first step (deleting the cache) fails, we can directly return an error (throw an exception), and the database and the cache still hold the same data.

This looks great, but analyze it under concurrency and a problem is still there:

Thread A deletes the cache.

Thread B reads and finds the cache empty.

Thread B queries the database and gets the old value.

Thread B writes the old value into the cache.

Thread A writes the new value to the database.

Therefore, it can also lead to database and cache inconsistencies.

One idea for solving this concurrency-induced inconsistency between the database and the cache:

Funnel the delete-cache, update-database, and read-and-refill operations into a queue, so that they execute serially; a sketch follows.
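
A toy Python sketch of the serialization idea: a single worker drains a queue of operations so that, for any given key, cache deletes, database updates, and cache refills never interleave. A real system would shard queues by key; update_db and query_db are hypothetical helpers:

```python
import queue
import threading

import redis

r = redis.Redis(host="localhost", port=6379, decode_responses=True)
op_queue: "queue.Queue[tuple]" = queue.Queue()

def enqueue_write(key: str, value: str) -> None:
    op_queue.put(("write", key, value))

def enqueue_refill(key: str) -> None:
    op_queue.put(("refill", key))

def worker() -> None:
    while True:
        op = op_queue.get()
        if op[0] == "write":
            _, key, value = op
            r.delete(key)               # delete the cache ...
            update_db(key, value)       # ... then update the database (hypothetical)
        else:                           # "refill": read-through on behalf of a reader
            _, key = op
            value = query_db(key)       # hypothetical database lookup
            if value is not None:
                r.set(key, value, ex=3600)

threading.Thread(target=worker, daemon=True).start()
```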

3.4 Comparing the two strategies

We can find that the two strategies have their own advantages and disadvantages:

Delete the cache first, then update the database:

performs poorly under high concurrency, but behaves well when atomicity is broken.

Update the database first, then delete the cache (the Cache Aside Pattern):

performs well under high concurrency, but behaves poorly when atomicity is broken.

3.5 Other schemes for ensuring data consistency

You can also keep the cache in sync by subscribing to the database binlog with a tool such as Databus or Alibaba's Canal and updating the cache from the change stream.

This concludes our walk through common Redis interview questions. Thank you for reading!
