Example analysis of Redis expiration strategy and memory reclamation mechanism

This article walks through Redis's expiration strategy and memory reclamation mechanism with examples. It should be a useful reference; I hope you gain a lot from reading it.
When Redis is used as a cache, how efficiently it uses memory is determined by its memory eviction policy. This is supposedly an easy-points question asked by many big companies, yet few people can explain it clearly unless they have actually studied Redis's expiration strategy, lazy deletion, LRU, and LFU.
1. Expiration policy
Every Redis data structure can be set to expire, and when the time is up the key is deleted automatically. Picture the Grim Reaper keeping an eye on every key that has an expiration time and reaping each one the moment its lifespan runs out.

Now think from the Reaper's perspective: what if too many keys expire at the same time? Redis is single-threaded, so the reaping itself takes up thread time; if the Reaper is kept too busy, will online read and write commands stutter?
1.1 The expired key collection

Redis puts every key with an expiration time set into a separate dictionary, and later traverses this dictionary periodically to delete the expired keys.

Redis combines two strategies:

Scheduled deletion, which handles expiration centrally.
Lazy deletion, which handles it piecemeal.
1.2 Scheduled scan strategy

By default Redis performs ten expiration scans per second. A scan does not traverse every key in the expired dictionary; it uses a simple greedy strategy instead:

Randomly pick 20 keys from the expired dictionary.
Delete whichever of those 20 keys have expired.
If the proportion of expired keys exceeds 1/4, repeat from step 1.

At the same time, to keep an expiration scan from looping excessively and stalling the thread, the algorithm also puts an upper limit on scan time, 25ms by default.
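To make the rules above concrete, here is a minimal Python sketch of the greedy scan loop (the dict-based expired dictionary and the helper names are illustrative, not Redis's actual C implementation):

import random
import time

SAMPLE_SIZE = 20          # keys sampled per round
REPEAT_THRESHOLD = 0.25   # repeat while more than 1/4 of the sample had expired
TIME_LIMIT = 0.025        # 25 ms upper bound for one scan

def expire_scan(expires):
    # expires: dict mapping key -> expiration timestamp (seconds)
    deadline = time.monotonic() + TIME_LIMIT
    while expires:
        sample = random.sample(list(expires), min(SAMPLE_SIZE, len(expires)))
        expired = [k for k in sample if expires[k] <= time.time()]
        for k in expired:
            del expires[k]                # Redis would also free the value here
        if len(expired) / len(sample) <= REPEAT_THRESHOLD:
            break                         # expired keys have become sparse
        if time.monotonic() >= deadline:
            break                         # never exceed the 25 ms budget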
What happens if all the keys in a Redis instance (say, hundreds of thousands of them) expire at the same moment?

Redis will keep scanning the expired dictionary, loop after loop, until the expired keys in it become sparse (until the proportion of expired keys per round drops off). The memory manager also has to reclaim memory pages frequently, which costs CPU, and the combined load inevitably causes visible stutters in online read and write requests.

If a client request arrives while the server is in the middle of such an expiration scan, the request may wait up to 25ms before it is processed. If the client uses a short timeout, say 10ms, large numbers of connections will be closed for timing out and the business side will see a flood of exceptions. And you will find nothing in Redis's slowlog, because a slow query means slow logical processing and does not include waiting time.

This failure mode comes up in the community again and again, so business developers must take care not to let all keys expire at once. Add a random offset to the target expiration time, for example redis.expire_at(key, expire_ts + random.randint(0, 86400)), to spread out the expiration load.
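Sketched with the redis-py client (note that redis-py spells the command expireat; a Redis instance on localhost is assumed):

import random
import redis

r = redis.Redis()  # assumes a Redis instance on localhost:6379

def set_with_scattered_expiry(key, value, expire_ts):
    # jitter the expiration by up to one day so keys do not all die together
    r.set(key, value)
    r.expireat(key, expire_ts + random.randint(0, 86400))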
1.3 Expiration policy on slave libraries

A slave library performs no expiration scans; its handling of expiration is passive. When a key expires on the master library, the master appends a del instruction to the AOF file and propagates it to all slaves, and each slave removes the expired key by executing that del instruction.

Because instruction propagation is asynchronous, if the master's del instructions for expired keys reach a slave late, master and slave data become inconsistent: data already gone from the master still exists on the slave. The well-known loophole in the distributed lock algorithm stems from exactly this replication delay.
2. Lazy deletion

Lazy deletion (lazy free): a key is checked when a client accesses it, and if it has expired it is deleted on the spot.

Why delete lazily?

Redis does not in fact contain only a single main thread; it also has several asynchronous threads dedicated to time-consuming operations. The del instruction frees an object's memory directly, and in most cases this is very fast with no noticeable delay.

But if the key being deleted holds a very large object, the deletion stalls the single main thread. How do we get around that?

Redis 4.0 introduced the unlink instruction to solve this stutter: it handles the delete operation lazily, handing it to a background thread that reclaims the memory asynchronously.
> unlink key
OK
You may be worried about thread safety here: won't multiple threads end up modifying the same data structure concurrently?

Here is an analogy: think of all the live data in Redis memory as one big tree. When the unlink instruction is issued, it simply snaps a branch off the tree and tosses it into a nearby fire (the asynchronous thread pool). The moment the branch leaves the tree it can no longer be reached by any other instruction on the main thread, because the main thread only ever walks along the tree.
2.1 Asynchronous Thread
Asynchronous threads have a special name inside Redis: BIO, short for Background IO, meaning IO threads that work silently behind the scenes. Strictly speaking, though, memory reclamation is not an IO operation; its cost is mostly CPU.
The evolution of asynchronous threads

When lazy deletion was first implemented, asynchronous threads were not the plan. The initial attempt was progressive deletion in the main thread, reusing the idea behind progressive dictionary rehashing: traverse the first-dimension array with a scan-like operation, gradually delete the contents of each second-dimension linked list, and reclaim the first-dimension array in one go once every list was freed. This, too, could delete a large object without blocking the main thread.

But that is easier said than done. Progressive reclamation has to control its pace carefully: reclaim too aggressively and it eats too much CPU; reclaim at a snail's pace and memory is not freed in time, so memory usage keeps growing.

Antirez needed a suitable adaptive algorithm to control the reclamation frequency. His first idea was to detect whether memory was trending up (+1) or down (-1) and adjust a frequency coefficient step by step; such an adaptive algorithm is very simple to implement. But testing showed that under heavy load, QPS dropped to 65% of its normal level, which is fatal.

So Antirez adopted the scheme in use today: asynchronous threads. The asynchronous thread solution is much simpler. Freeing memory no longer needs a bespoke progressive release strategy for every data structure, nor a carefully tuned adaptive algorithm to control reclamation frequency.
There is a price to pay for asynchronous threads, though: the main thread and the asynchronous threads now contend over the memory allocator (jemalloc). That contention is negligible in practice, because the main thread of Redis spends very little of its overall computing time on memory allocation and reclamation.

The asynchronous threading scheme itself is still fairly complex; the details can be found in the references.
2.2 flush
Redis provides the flushdb and flushall instructions for emptying a database or the whole instance, and these too are extremely slow operations.

Redis 4.0 made both instructions asynchronous as well. Adding the async parameter to the instruction uproots the whole tree at once and hands it to the background thread to burn slowly.
> flushall async
OK
2.3 Asynchronous queue
In Redis 4.0, after the main thread removes an object's reference from the "big tree", the key's memory reclamation is packaged as a task and pushed into an asynchronous task queue, and the background thread fetches tasks from this queue. Because the queue is operated by the main thread and the asynchronous thread at the same time, it must be a thread-safe queue.

Not every unlink operation is actually deferred, though. If the key occupies very little memory, deferral is pointless, and Redis reclaims the memory immediately, exactly as del would.
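A Python sketch of this division of labor (illustrative only: Redis does this in C with its BIO threads, and the size threshold below merely stands in for Redis's internal cutoff for "big" objects):

import queue
import threading

free_tasks = queue.Queue()   # thread-safe queue shared by both threads
BIG_OBJECT_THRESHOLD = 64    # hypothetical cutoff for deferring the free

def estimated_free_effort(value):
    # stand-in for Redis's estimate of how much work freeing the value takes
    return len(value) if hasattr(value, "__len__") else 1

def unlink(db, key):
    value = db.pop(key, None)          # detach the "branch" from the "tree"
    if value is None:
        return
    if estimated_free_effort(value) < BIG_OBJECT_THRESHOLD:
        del value                      # small object: free at once, like del
    else:
        free_tasks.put(value)          # big object: hand off to the background

def background_free_worker():
    while True:
        obj = free_tasks.get()         # blocks until a reclamation task arrives
        del obj                        # dropping the last reference frees it
        free_tasks.task_done()

threading.Thread(target=background_free_worker, daemon=True).start()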
2.4 The slow AOF sync problem

Redis needs to sync its AOF log to disk once per second (configurable) so that as few messages as possible are lost, and doing so means calling the sync function. The call is time-consuming and would drag down the main thread, so Redis moved this operation to an asynchronous thread as well.

The thread performing AOF sync is a separate asynchronous thread, not the lazy-delete thread described above, and it has a task queue of its own, used only for AOF sync tasks.
2.5 More asynchronous deletion points

Beyond the del instruction and flush, Redis also reclaims memory on key expiration, LRU eviction, the rename instruction, and the flush performed right after a slave receives an rdb file during full synchronization.

Redis 4.0 brought asynchronous deletion to these points as well; enabling them takes extra configuration options:
slave-lazy-flush: the flush performed after a slave receives an rdb file
lazyfree-lazy-eviction: eviction when memory reaches maxmemory
lazyfree-lazy-expire: deletion of expired keys
lazyfree-lazy-server-del: the implicit deletion of destKey by the rename instruction
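For example, the switches might be turned on in redis.conf like this (a sketch; verify the option names against your Redis version, as later releases rename slave-lazy-flush to replica-lazy-flush):

slave-lazy-flush yes
lazyfree-lazy-eviction yes
lazyfree-lazy-expire yes
lazyfree-lazy-server-del yes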
3. Eviction configuration

When Redis's memory usage exceeds the limit of physical memory, in-memory data begins to swap with disk frequently, and swapping makes Redis performance plummet until access is so slow the instance is basically unusable. That is unacceptable in production, so to cap memory usage Redis provides the configuration parameter maxmemory, which keeps memory from exceeding the desired size.

What happens when actual memory usage exceeds maxmemory?

Redis offers several optional strategies (maxmemory-policy) that let users decide for themselves how space should be freed so read and write service can continue.
noeviction: stop serving write requests (del requests are still served); read requests continue. No data is lost, but the online business can no longer write. This is the default eviction policy.
volatile-lru: evict only among keys that have an expiration time set, least recently used first. Keys without an expiration time are never evicted, so data that must persist is not suddenly lost.
volatile-ttl: same scope as above, but the criterion is not LRU; it is the key's remaining ttl, and the smaller the ttl, the sooner the key is evicted.
volatile-random: same scope as above, but the evicted key is chosen at random from the set of keys with an expiration time.
allkeys-lru: unlike volatile-lru, the candidate set here is all keys, not just those with an expiration time, so keys without an expiration time can also be evicted.
allkeys-random: same as above, but the evicted key is chosen at random from all keys.
Configuration in redis.conf:

maxmemory <bytes>              # maximum memory, in bytes
maxmemory-policy noeviction    # the default
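Both can also be changed at runtime, without a restart:

> config set maxmemory 1gb
OK
> config set maxmemory-policy allkeys-lru
OK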
Summary
volatile-xxx policies only evict keys that have an expiration time set.
allkeys-xxx policies evict among all keys.
So which should you choose?
If you use Redis purely as a cache, allkeys-xxx is the better choice; clients then need not attach an expiration time when writing to the cache.
If you also rely on Redis persistence, use a volatile-xxx policy; its advantage is that keys without an expiration time will never be evicted by the LRU algorithm.
4. LRU algorithm
To implement the LRU (least recently used) algorithm, you need, besides the key/value dictionary, a linked list whose elements are kept in a particular order.

When space is full, the element at the tail of the linked list is kicked out. Whenever a dictionary element is accessed, it is moved to the head of the list, so the order of elements in the list is the order in which they were most recently accessed.
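A minimal Python sketch of this classic dictionary-plus-linked-list structure, using OrderedDict (which bundles the two into one):

from collections import OrderedDict

class LRUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()          # key order tracks recency

    def get(self, key):
        if key not in self.items:
            return None
        self.items.move_to_end(key)         # accessed: move to the recent end
        return self.items[key]

    def put(self, key, value):
        if key in self.items:
            self.items.move_to_end(key)
        self.items[key] = value
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)  # evict the least recently used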
4.1 Approximate LRU algorithm

Redis actually uses an approximate LRU algorithm, which is not quite the same as strict LRU. It avoids strict LRU because strict LRU would consume a lot of extra memory and require large changes to the existing data structures. The approximate algorithm is very simple: on top of the existing structures, it uses random sampling to evict elements, and the result comes close to strict LRU.

To implement it, Redis adds one extra small field to each key: a field 24 bits long holding the timestamp of the last access.

As mentioned earlier, key expiration combines centralized processing with lazy processing. LRU eviction is different: it is purely lazy. When a Redis write operation finds memory above maxmemory, it runs one round of the LRU eviction algorithm. The algorithm, too, is very simple: randomly sample five keys (the sample size is configurable via maxmemory-samples), evict the oldest of them, and if memory is still above maxmemory, keep sampling and evicting until it drops below maxmemory.
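A Python sketch of that sampling loop (the helpers are hypothetical; the samples parameter plays the role of maxmemory-samples):

import random

def evict_until_under_limit(db, last_access, used_memory, maxmemory, samples=5):
    # db: key -> value; last_access: key -> last-access timestamp
    while used_memory() > maxmemory and db:
        candidates = random.sample(list(db), min(samples, len(db)))
        oldest = min(candidates, key=lambda k: last_access[k])
        del db[oldest]                # evict the least recently accessed sample
        del last_access[oldest]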
Comparisons of the random LRU algorithm against strict LRU (green: newly added keys; dark gray: old keys; light gray: keys evicted by LRU) show that the larger the sample size, the closer the approximate LRU algorithm comes to strict LRU. Redis 3.0 also added an eviction pool to the algorithm, which improves the approximation further.

The eviction pool is an array of size maxmemory_samples. In each eviction cycle, the freshly sampled keys are merged with the keys already in the pool; after the oldest key is evicted, the remaining older keys stay in the pool to await the next cycle.
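The pool logic can be sketched like this (illustrative; the real evictionPoolPopulate keeps the pool sorted by idle time):

import random

def evict_one_with_pool(db, idle_time, pool, pool_size, samples=5):
    # merge a fresh random sample with the pool's surviving keys
    fresh = random.sample(list(db), min(samples, len(db)))
    merged = {k for k in pool if k in db} | set(fresh)
    # keep only the oldest pool_size keys for this cycle and the next
    pool[:] = sorted(merged, key=idle_time, reverse=True)[:pool_size]
    victim = pool.pop(0)              # the key idle the longest goes first
    del db[victim]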
5. LFU

Redis 4.0 introduced a new eviction strategy: LFU (Least Frequently Used) mode, which the author considers better than LRU. LFU evicts keys according to their recent access frequency, a more accurate reflection of how hot a key is than LRU.

It adds two options to the eviction policy parameter maxmemory-policy: volatile-lfu, which applies LFU eviction to keys with an expiration time, and allkeys-lfu, which applies the LFU eviction algorithm to all keys.

If a key goes unaccessed for a long time and then receives one accidental visit, the LRU algorithm will consider it hot and be reluctant to evict it. LFU instead tracks access frequency over a recent window: a key visited only occasionally never becomes hot, and only a key visited many times in the recent period stands a chance of being considered hot.

Every Redis object structure header contains a 24-bit field used to record the object's "heat".
// object header
typedef struct redisObject {
    unsigned type:4;      // object type, e.g. zset/set/hash
    unsigned encoding:4;  // object encoding, e.g. ziplist/intset/skiplist
    unsigned lru:24;      // the object's "heat"
    int refcount;         // reference count
    void *ptr;            // object body
} robj;

5.1 LRU mode
In LRU mode, the lru field stores the Redis clock, server.lruclock. The Redis clock is a 24-bit integer: by default it is the Unix timestamp modulo 2^24, so it wraps around roughly every 194 days. Whenever a key is accessed, the lru field in its object header is updated to server.lruclock. Normally server.lruclock only increases, so if an object's lru value is greater than server.lruclock, the clock must have wrapped around. With this logic you can work out exactly how long an object has gone unaccessed: the object's idle time.

Given each object's idle time, objects can be compared to see which is newer and which is older; the random LRU algorithm decides who gets evicted by comparing idle times.
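That idle-time computation, wrap-around included, sketched in Python (the same pattern as the LFUTimeElapsed C code shown later):

LRU_CLOCK_MAX = (1 << 24) - 1   # the lru field is only 24 bits wide

def object_idle_time(lruclock, obj_lru):
    # lruclock: the current server.lruclock; obj_lru: the object's lru field
    if lruclock >= obj_lru:
        return lruclock - obj_lru                 # the normal case
    return LRU_CLOCK_MAX - obj_lru + lruclock     # the clock wrapped around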
The Redis clock value is refreshed by default once per millisecond, set proactively inside the scheduled task serverCron. serverCron in fact hosts many scheduled tasks: progressive rehashing of large hash tables, active eviction of expired keys, triggering bgsave and bgaofrewrite, and so on.

Why does Redis cache the system timestamp?

In Java we would just call System.currentTimeMillis(), but Redis cannot afford to, because every fetch of the system timestamp is a system call; that is relatively slow and too expensive for Redis, so it reads the time straight from its own cache.
5.2 LFU mode
In LFU mode, the 24 bits of the lru field store two values: ldt (last decrement time) and logc (logistic counter).

logc takes 8 bits and records the access frequency. An 8-bit integer tops out at 255, far too small for a raw count, so logc stores the logarithm of the frequency instead, and the value decays over time; a key whose logc is small is easily reclaimed. To keep newly created objects from being reclaimed immediately, new objects are initialized to LFU_INIT_VAL=5.

ldt takes 16 bits and stores the time logc was last updated, necessarily at low precision: it is the minute-level Unix timestamp modulo 2^16, which wraps around roughly every 45 days. As in LRU mode, the same logic yields the object's idle time, only now with minute precision.

server.unixtime is the system timestamp currently cached by Redis; like server.lruclock, it is updated every millisecond.
// nowInMinutes
// server.unixtime is the system timestamp cached by redis
unsigned long LFUGetTimeInMinutes(void) {
    return (server.unixtime / 60) & 65535;
}

// idle_in_minutes
unsigned long LFUTimeElapsed(unsigned long ldt) {
    unsigned long now = LFUGetTimeInMinutes();
    if (now >= ldt) return now - ldt;  // normal comparison
    return 65535 - ldt + now;          // wrap-around comparison
}
The ldt value is not updated when the object is accessed; it is updated when Redis runs its eviction logic, which is triggered before each instruction executes, and only once memory has reached the maxmemory setting. Each round uses the random strategy: select a handful of keys at random, refresh their "heat", and evict the one whose "heat" is lowest. Because Redis uses a random algorithm, if there are many keys, any given key's ldt may be refreshed rather slowly; but since the precision is only minute-level, frequent updates are not needed anyway.

Whenever ldt is updated, the logc value is decayed at the same time, by a specific algorithm: the object's idle time in minutes, divided by a decay coefficient lfu_decay_time (1 by default), is subtracted from the existing logc value. A coefficient greater than 1 makes the decay slower; 0 means no decay at all. It is set through the lfu_decay_time configuration parameter.
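A Python sketch of that decay rule, modeled on Redis's LFUDecrAndReturn:

def lfu_decay(logc, idle_minutes, lfu_decay_time=1):
    if lfu_decay_time == 0:
        return logc                          # zero means: never decay
    periods = idle_minutes // lfu_decay_time # elapsed decay periods
    return max(0, logc - periods)            # never drop below zero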
logc, like the lru field in LRU mode, is updated on every access to the key, except that the update is not a simple +1: it increments probabilistically, because logc stores the logarithm of the access count and cannot just be bumped by one.
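And a sketch of the probabilistic increment, modeled on Redis's LFULogIncr (lfu_log_factor defaults to 10 in Redis):

import random

LFU_INIT_VAL = 5   # the initial "heat" given to new objects (see above)

def lfu_log_incr(logc, lfu_log_factor=10):
    if logc == 255:
        return logc                           # the 8-bit counter saturates at 255
    baseval = max(0, logc - LFU_INIT_VAL)
    p = 1.0 / (baseval * lfu_log_factor + 1)  # the hotter the key, the rarer the +1
    if random.random() < p:
        logc += 1
    return logc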