Network Security Internet Technology Development Database Servers Mobile Phone Android Software Apple Software Computer Software News IT Information

In addition to Weibo, there is also WeChat

Please pay attention

WeChat public account

Shulou

What is the expiration deletion strategy and memory obsolescence mechanism in Redis

2025-01-19 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Development >

Share

Shulou(Shulou.com)06/01 Report--

This article mainly explains "what is the expired deletion strategy and memory elimination mechanism in Redis". The content in the article is simple and clear, and it is easy to learn and understand. Please follow Xiaobian's train of thought to study and learn "what is the expired deletion strategy and memory elimination mechanism in Redis".

Expired deletion Policy of key in Redis

There are three policies for expired deletion in Redis.

1. Delete regularly

While setting the expiration time of a key, we create a timer that allows the timer to delete it immediately when the expiration time comes.

Advantages:

By using a timer, you can ensure that the expired key can be deleted as soon as possible and release the memory occupied by the expired key.

Disadvantages:

It is unfriendly to CPU. When there are many expired keys, deleting expired key will take up a considerable part of CPU resources, which will affect the response time and throughput of the server.

2. Lazy deletion

Lazy deletion, when a key-value pair expires, delete the key-value pair only when the key-value pair is used again, that is, if it is not needed, the key-value pair will always exist.

Advantages:

CPU is friendly, and expiration checks are performed only when key-value pairs are taken out, so that CPU resources are not spent on expired deletions of other unimportant key-value pairs.

Disadvantages:

If some key-value pairs are never used again, they will not be deleted, resulting in a memory leak, and useless junk data takes up a lot of resources, but the server cannot delete it.

Look at the source code.

/ / https://github.com/redis/redis/blob/6.2/src/db.c#L1541// this function is called when the key is accessed, because some key may still exist in memory / / key is still valid, and the function returns a value of 0, otherwise, if the key expires, the function returns 1. Int expireIfNeeded (redisDb * db, robj * key) {/ / No expired if (! keyIsExpired (db,key)) return 0 / / the expiration of the slave database is controlled by the master database and will not be deleted. / / the above has determined whether it has expired, so the key here must be an expired key, but if the key created by the master node is not deleted, it will only return expired if (server.masterhost! = NULL) return 1 ... / * Delete the key * / / Delete key deleteExpiredKeyAndPropagate (db,key); return 1;}

You can see that the corresponding key for each operation checks whether the key expires, and if it expires, the corresponding key is deleted.

If the expired key is created by the master library, then checking from the library will not delete it, but will return the expired or unexpired status based on the expiration time of the key.

3. Delete regularly

Periodic deletion is an integration and compromise of the above two deletion strategies.

Sample and check some key every period of time to see if they are out of date, and delete them if they are out of date.

1. Sample a certain number of key, the number of samples can be configured, and delete all expired key

2. If the percentage of expired key exceeds the percentage of acceptable expired key, the process of deletion is repeated until the percentage of expired key falls below the percentage of acceptable expired key.

Advantages:

Regular deletion, by controlling the length and frequency of periodic deletion, can reduce the impact of deletion on CPU and reduce the waste of memory caused by expired keys.

Disadvantages:

The frequency of execution is not easy to control.

Too fast frequency is not friendly to CPU. If it is too slow, it will be unfriendly to memory. Expired key-value pairs cannot be deleted in time.

At the same time, if a key-value pair expires but is not deleted, and the business obtains the key-value pair again, then the deleted data will be obtained, which must be unreasonable.

Take a look at the source code implementation

/ / https://github.com/redis/redis/blob/6.2/src/server.c#L1853// this function handles the "background" operations we need to perform incrementally in the Redis database, such as active key expiration, resizing, and heavy hashing. Void databasesCron (void) {/ / expires by random sampling / / here distinguishes the processing of master node and slave node if (server.active_expire_enabled) {if (iAmMaster ()) {activeExpireCycle (ACTIVE_EXPIRE_CYCLE_SLOW);} else {expireSlaveKeys () }.} / / https://github.com/redis/redis/blob/6.2/src/expire.c#L113void activeExpireCycle (int type) {/ / adjust the running parameters according to the configured overtime work. The default workload is 1, and the maximum configurable workload is 10 unsigned long effort = server.active_expire_effort-1, / * Rescale from 0 to 9. * / / the number of sampled key config_keys_per_loop = ACTIVE_EXPIRE_CYCLE_KEYS_PER_LOOP + ACTIVE_EXPIRE_CYCLE_KEYS_PER_LOOP/4*effort, / / percentage of CPU time, default is 25%, maximum 43% If it is 100%, nothing can be done except for regular deletion, so limit config_cycle_slow_time_perc = ACTIVE_EXPIRE_CYCLE_SLOW_TIME_PERC + 2*effort, / / acceptable percentage of expired key config_cycle_acceptable_stale = ACTIVE_EXPIRE_CYCLE_ACCEPTABLE_STALE- effort / / the execution time for slow periodic deletions timelimit = config_cycle_slow_time_perc*1000000/server.hz/100; timelimit_exit = 0;. / / accumulate some global statistics when key expires in order to understand the number of key that has logically expired but still exists in the database long total_sampled = 0; long total_expired = 0; for (j = 0; j)

< dbs_per_call && timelimit_exit == 0; j++) { ... // 如果超过 config_cycle_acceptable_stale 的key过期了,则重复删除的过程,直到过期key的比例降至 config_cycle_acceptable_stale 以下。 // 存储在 config_cycle_acceptable_stale 中的百分比不是固定的,而是取决于Redis配置的"expire efforce" do { /* If there is nothing to expire try next DB ASAP. */ if ((num = dictSize(db->

Expires)) = 0) {db- > avg_ttl = 0; break;}. / / the number of key sampled if (num > config_keys_per_loop) num = config_keys_per_loop;. While (sampled

< num && checked_buckets < max_buckets) { for (int table = 0; table < 2; table++) { ... while(de) { /* Get the next entry now since this entry may get * deleted. */ dictEntry *e = de; de = de->

Next; ttl = dictGetSignedIntegerVal (e)-now; / / Expiration check and delete if (activeExpireCycleTryExpire (db,e,now)) expired++;...}} db- > expires_cursor++ }. / / determine whether the proportion of expired key is greater than config_cycle_acceptable_stale, if it is greater than the deletion of expired key} while (sampled = = 0 | | (expired*100/sampled) > config_cycle_acceptable_stale) }.} / / check to delete the key void expireSlaveKeys (void) with expiration time created by the slave node {/ / synchronize the key from the master library, the expiration time is maintained by the master library, and the master library synchronizes the DEL operation to the slave library. / / if the slave library is in READ-WRITE mode, it can continue to write data. Data written from the library itself needs to maintain its out-of-date operations. If (slaveKeysWithExpire = = NULL | | dictSize (slaveKeysWithExpire) = = 0) return;.}

Lazy deletion process

1. Perform a periodic deletion at a fixed time

2. Sample a certain number of key, the number of samples can be configured, and delete all expired key

3. If the percentage of expired key exceeds the percentage of acceptable expired key, the process of deletion is repeated until the percentage of expired key falls below the acceptable percentage of expired key.

4. Expired key created from the library cannot also be deleted from the library.

Expired deletion policy in Redis

All three strategies discussed above have more or less problems. The actual strategy adopted in Redis is a combination of lazy deletions and periodic deletions.

The use of combination mode

Delete regularly to obtain the balance between CPU and memory usage. The expired KEY may not be deleted in time. When the KEY is obtained again, do another expiration check through lazy deletion to prevent the business from getting expired content.

Whether the slave library will dirty read the expired key created by the master library

From the above lazy deletion and regular deletion of the source code, we can find that the slave library can not actively delete the expired keys of the master library. If the expired key-value pair created by a master library has expired and the master library does not delete it in time when it is deleted regularly, the key-value pair is requested from the library when lazy deletion is performed. Because the key-value pair created by the master library cannot be deleted from the library at this time, does it mean that expired data will be read from the library?

The answer is definitely not.

How-redis-replication-deals-with-expires-on-keys

How Redis replication deals with expires on keys

Redis expires allow keys to have a limited time to live. Such a feature depends on the ability of an instance to count the time, however Redis slaves correctly replicate keys with expires, even when such keys are altered using Lua scripts.

To implement such a feature Redis cannot rely on the ability of the master and slave to have synchronized clocks, since this is a problem that cannot be solved and would result into race conditions and diverging data sets, so Redis uses three main techniques in order to make the replication of expired keys able to work:

1.Slaves don't expire keys, instead they wait for masters to expire the keys. When a master expires a key (or evict it because of LRU), it synthesizes a DEL command which is transmitted to all the slaves.

2.However because of master-driven expire, sometimes slaves may still have in memory keys that are already logically expired, since the master was not able to provide the DEL command in time. In order to deal with that the slave uses its logical clock in order to report that a key does not exist only for read operations that don't violate the consistency of the data set (as new commands from the master will arrive). In this way slaves avoid to report logically expired keys are still existing. In practical terms, an HTML fragments cache that uses slaves to scale will avoid returning items that are already older than the desired time to live.

3.During Lua scripts executions no keys expires are performed. As a Lua script runs, conceptually the time in the master is frozen, so that a given key will either exist or not for all the time the script runs. This prevents keys to expire in the middle of a script, and is needed in order to send the same script to the slave in a way that is guaranteed to have the same effects in the data set.

Once a slave is promoted to a master it will start to expire keys independently, and will not require any help from its old master.

The above is a description of the problem in the official document.

It roughly means that the slave node will not actively delete the expired key, and the slave node will wait for the master node to trigger the expiration of the key. When the trigger key of the master node expires, the master node synchronizes a del command to all slave nodes.

Because it is deleted by the master node, the slave node will get the expired key-value pair. The slave node needs to judge whether the impairment expires according to its own local logical clock, so as to achieve the consistent read operation of the data set.

We know that the expiration policy in Redis is lazy deletion and periodic deletion, so each key operation will use lazy deletion to check whether it expires, and then determine whether it can be deleted.

/ / https://github.com/redis/redis/blob/6.2/src/db.c#L1541// this function is called when the key is accessed, because some key may still exist in memory / / key is still valid, and the function returns a value of 0, otherwise, if the key expires, the function returns 1. Int expireIfNeeded (redisDb * db, robj * key) {/ / check whether key expires if (! keyIsExpired (db,key)) return 0 / / the expiration of the slave library is controlled by the master database and will not be deleted. / / the above has determined whether it has expired, so the key here must design an expired key, but if the key created by the master node is not deleted, it will only return expired if (server.masterhost! = NULL) return 1 . / * Delete the key * / / delete key deleteExpiredKeyAndPropagate (db,key); return 1;} / / https://github.com/redis/redis/blob/6.2/src/db.c#L1485/* Check if the key is expired. * / int keyIsExpired (redisDb * db, robj * key) {/ / expiration time mstime_t when = getExpire (db,key); mstime_t now; / / No expired if (when)

< 0) return 0; /* No expire for this key */ /* Don't expire anything while loading. It will be done later. */ if (server.loading) return 0; // lua 脚本执行的过程中不过期 if (server.lua_caller) { now = server.lua_time_snapshot; } // 如果我们正在执行一个命令,我们仍然希望使用一个不会改变的引用时间:在这种情况下,我们只使用缓存的时间,我们在每次调用call()函数之前更新。 // 这样我们就避免了RPOPLPUSH之类的命令,这些命令可能会重新打开相同的键多次,如果下次调用会看到键过期,则会使已经打开的对象在下次调用中失效,而第一次调用没有。 else if (server.fixed_time_expire >

0) {now = server.mstime;} / / in other cases, get the latest time else {now = mstime ();} / determine whether the return now > when;} / / returns the expiration time of the specified key, and if it does not expire, return-1long long getExpire (redisDb * db, robj * key) {dictEntry * de; / * No expire? Return ASAP * / if (dictSize (db- > expires) = = 0 | (de = dictFind (db- > expires,key- > ptr)) = = NULL) return-1; / * The entry was found in the expire dict, this means it should also * be present in the main dict (safety check). * / serverAssertWithInfo (NULL,key,dictFind (db- > dict,key- > ptr)! = NULL); return dictGetSignedIntegerVal (de);}

With regard to the lazy deletion above, although the expired key created by the master node cannot be deleted, the expiration time can be determined, so if the expired key created by the master database is not deleted in time, the slave library can use lazy deletion to determine whether the key-value pair is expired to avoid reading the expired content.

Memory elimination mechanism

The Redis expiration policy we discussed above refers to which policy Redis uses to delete expired key-value pairs. However, there are some key that will never be used in the future, so it may not be deleted, and in the process of using Redis, with the increase of writing data, there is not enough memory in Redis, so you need the memory elimination strategy of Redis.

Redis expiration policy refers to which policy Redis uses to delete expired key-value pairs.

The Redis memory elimination mechanism refers to what strategy will be adopted to delete qualified key-value pairs when the running memory of Redis has exceeded the maximum memory set by Redis, so as to ensure the efficient operation of Redis.

Maximum memory triggered by memory elimination

The memory elimination algorithm will be triggered only if the memory in Redis reaches the threshold. This threshold is the maximum running memory we set. In the configuration file redis.conf, the parameter maxmemory is used to set it.

Query maximum running memory

127.0.0.1 6379 > config get maxmemory1) "maxmemory" 2) "0"

In a 64-bit operating system, when maxmemory is 0, it indicates a 32-bit system with no memory limit.

What are the memory elimination strategies?

When the existing memory is larger than maxmemory, redis will be triggered to actively phase out memory. By setting maxmemory-policy, there are several ways to eliminate memory:

1. Volatile-lru: eliminate the longest unused key values with expiration time set

2. Allkeys-lru: eliminate the longest unused keys in the whole key value

3. Volatile-random: random elimination of any key value with expiration time set

4. Allkeys-random: randomly eliminate any key value

5. Volatile-ttl: priority is given to eliminating keys that expire earlier.

6. Noeviction: no data will be eliminated. When there is insufficient memory, an error will be reported in the new operation. Redis default memory elimination policy.

Where allkeys-xxx means to eliminate data from all key values, and volatile-xxx means to eliminate data from key values with expired keys set.

Memory elimination algorithm

In addition to random deletion and no deletion, there are two main elimination algorithms: LRU algorithm and LFU algorithm.

LRU

LRU, whose full name is Least Recently Used translated as the least used recently, is a commonly used page replacement algorithm that selects the most recently unused pages to be eliminated.

The implementation of the general LRU algorithm is based on the linked list structure. The elements in the linked list are arranged from the back to the back according to the order of operation, and the keys of the latest operation will be moved to the header. When memory elimination is needed, only the elements at the end of the linked list need to be deleted.

Redis uses an approximate LRU algorithm to better save memory. It is implemented by adding an additional field to the existing data structure to record the last access time of this key value. When Redis memory is eliminated, random sampling will be used to eliminate the data, which is randomly selected 5 values (this value can be configured), and then eliminate the longest unused one.

Let's take a look at how it is realized.

Redis uses a redisObject structure to hold a pointer to the value in the source code for the value in each key-value pair. Let's take a look at the structure of redisObject first.

/ / https://github.com/redis/redis/blob/6.2/src/server.h#L673typedef struct redisObject {unsigned type:4; unsigned encoding:4; / / here save / / LRU time (relative to the global LRU clock) / / LFU data (lower 8 bits as a counter, use the high 16 bits in 24 bits to record the access timestamp) unsigned lru:LRU_BITS / * LRU time (relative to global lru_clock) or * LFU data (least significant 8 bits frequency * and most significant 16 bits access time). * / int refcount; void * ptr;} robj

When a key-value pair is created, the update time is recorded

/ / https://github.com/redis/redis/blob/6.2/src/object.c#L41 robj * createObject (int type, void * ptr) {robj * o = zmalloc (sizeof (* o)); o-> type = type; o-> encoding = OBJ_ENCODING_RAW; o-> ptr = ptr; o-> refcount = 1 / / if the cache replacement policy is LFU, set the lru variable to the count value of LFU if (server.maxmemory_policy & MAXMEMORY_FLAG_LFU) {o-> lru = (LFUGetTimeInMinutes () dict,key- > ptr); if (de) {robj * val = dictGetVal (de); / * Update the access time for the ageing algorithm. * Don't do it if we have a saving child, as this will trigger * a copy on write madness. * / if (! hasActiveChildProcess () & &! (flags & LOOKUP_NOTOUCH)) {if (server.maxmemory_policy & MAXMEMORY_FLAG_LFU) {updateLFU (val);} else {/ / time to update lru using LRU val- > lru = LRU_CLOCK ();}} return val } else {return NULL;}}

We looked at the code above to create and access a key-value pair. Each operation, the lru time recorded in redisObject will be updated synchronously.

Redis will judge the current memory usage. If the value of maxmemory configuration is exceeded, the new memory will be obsolete.

If the memory exceeds the value of maxmemory, you also need to calculate the amount of memory that needs to be freed, which is equal to the amount of memory used minus maxmemory. However, the amount of memory used does not include the size of the replication buffer used for master-slave replication.

/ / https://github.com/redis/redis/blob/6.2/src/evict.c#L512int performEvictions (void) {. While (mem_freed

< (long long)mem_tofree) { int j, k, i; static unsigned int next_db = 0; sds bestkey = NULL; int bestdbid; redisDb *db; dict *dict; dictEntry *de; if (server.maxmemory_policy & (MAXMEMORY_FLAG_LRU|MAXMEMORY_FLAG_LFU) || server.maxmemory_policy == MAXMEMORY_VOLATILE_TTL) { struct evictionPoolEntry *pool = EvictionPoolLRU; while(bestkey == NULL) { unsigned long total_keys = 0, keys; /* We don't want to make local-db choices when expiring keys, * so to start populate the eviction pool sampling keys from * every DB. */ // 根据淘汰策略,决定使用全局哈希表还是设置了过期时间的key的哈希表 for (i = 0; i < server.dbnum; i++) { db = server.db+i; dict = (server.maxmemory_policy & MAXMEMORY_FLAG_ALLKEYS) ? db->

Dict: db- > expires; if ((keys = dictSize (dict))! = 0) {/ / pass the selected hash table dict to the evictionPoolPopulate function, and also pass the global hash table to the evictionPoolPopulate function evictionPoolPopulate (I, dict, db- > dict, pool); total_keys + = keys }.}. / / is used to populate evictionPool// insert keys in ascending order, so keys with less free time are on the left and keys with high idle time are on the right. / / https://github.com/redis/redis/blob/6.2/src/evict.c#L145void evictionPoolPopulate (int dbid, dict * sampledict, dict * keydict, struct evictionPoolEntry * pool) {int j, k, count; dictEntry * samples [server.maxmemory _ samples]; count = dictGetSomeKeys (sampledict,samples,server.maxmemory_samples); for (j = 0; j)

< count; j++) { ... // 将元素插入池中。 首先,找到第一个空闲时间小于我们空闲时间的空桶或第一个填充的桶。 k = 0; while (k < EVPOOL_SIZE && pool[k].key && pool[k].idle < idle) k++; if (k == 0 && pool[EVPOOL_SIZE-1].key != NULL) { /* Can't insert if the element is < the worst element we have * and there are no empty buckets. */ continue; } else if (k < EVPOOL_SIZE && pool[k].key == NULL) { /* Inserting into empty position. No setup needed before insert. */ } else { /* Inserting in the middle. Now k points to the first element * greater than the element to insert. */ if (pool[EVPOOL_SIZE-1].key == NULL) { /* Free space on the right? Insert at k shifting * all the elements from k to end to the right. */ /* Save SDS before overwriting. */ sds cached = pool[EVPOOL_SIZE-1].cached; memmove(pool+k+1,pool+k, sizeof(pool[0])*(EVPOOL_SIZE-k-1)); pool[k].cached = cached; } else { /* No free space on right? Insert at k-1 */ k--; /* Shift all elements on the left of k (included) to the * left, so we discard the element with smaller idle time. */ sds cached = pool[0].cached; /* Save SDS before overwriting. */ if (pool[0].key != pool[0].cached) sdsfree(pool[0].key); memmove(pool,pool+1,sizeof(pool[0])*k); pool[k].cached = cached; } } ... }} 处理淘汰的数据,Redis 中提供了一个数组 EvictionPoolLRU,用来保存待淘汰的候选键值对。这个数组的元素类型是 evictionPoolEntry 结构体,该结构体保存了待淘汰键值对的空闲时间 idle、对应的 key 等信息。 可以看到上面的上面会选取一定的过期键,然后插入到 EvictionPoolLRU 中 dictGetSomeKeys 函数采样的 key 的数量,是由 redis.conf 中的配置项 maxmemory-samples 决定的,该配置项的默认值是 5 // https://github.com/redis/redis/blob/6.2/src/evict.c#L55struct evictionPoolEntry { // 待淘汰的键值对的空闲时间 unsigned long long idle; /* Object idle time (inverse frequency for LFU) */ // 待淘汰的键值对的key sds key; /* Key name. */ // 缓存的SDS对象 sds cached; /* Cached SDS object for key name. */ // 待淘汰键值对的key所在的数据库ID int dbid; /* Key DB number. */};static struct evictionPoolEntry *EvictionPoolLRU; 然后通过 evictionPoolPopulate 函数,进行采样,然后将采样数据写入到 EvictionPoolLRU 中,插入到 EvictionPoolLRU 中的数据是按照空闲时间从小到大进行排好序的 freeMemoryIfNeeded 函数会遍历一次 EvictionPoolLRU 数组,从数组的最后一个 key 开始选择,如果选到的 key 不是空值,那么就把它作为最终淘汰的 key。 // https://github.com/redis/redis/blob/6.2/src/evict.c#L512int performEvictions(void) { if (!isSafeToPerformEvictions()) return EVICT_OK; int keys_freed = 0; size_t mem_reported, mem_tofree; long long mem_freed; /* May be negative */ mstime_t latency, eviction_latency; long long delta; int slaves = listLength(server.slaves); int result = EVICT_FAIL; if (getMaxmemoryState(&mem_reported,NULL,&mem_tofree,NULL) == C_OK) return EVICT_OK; ... while (mem_freed < (long long)mem_tofree) { if (server.maxmemory_policy & (MAXMEMORY_FLAG_LRU|MAXMEMORY_FLAG_LFU) || server.maxmemory_policy == MAXMEMORY_VOLATILE_TTL) { struct evictionPoolEntry *pool = EvictionPoolLRU; while(bestkey == NULL) { unsigned long total_keys = 0, keys; ... /* Go backward from best to worst element to evict. */ // 从数组最后一个key开始查找 for (k = EVPOOL_SIZE-1; k >

= 0; k key -) {/ / the current key is null, then find the next key if (pool [k] .key = = NULL) continue; bestdbid = pool [k] .dbid; / / get the key-value pair corresponding to the current key from the global hash table or the expire hash table If (server.maxmemory_policy & MAXMEMORY_FLAG_ALLKEYS) {de = dictFind (server.b [pool [k] .dbid] .dict, pool [k] .key) } else {de = dictFind (server.db[ Pool [k] .dbid] .expires, Pool [k] .key);} / * Remove the entry from the pool. * / / remove the current key from the EvictionPoolLRU array if (pool [k] .key! = pool [k] .cached) sdsfree (pool [k] .key); pool [k] .key = NULL; pool [k] .pool = 0 / * If the key exists, is our pick. Otherwise it is * a ghost and we need to try the next element. * / if the key-value pair corresponding to the current key is not empty, select the current key as the eliminated key if (de) {bestkey = dictGetKey (de); break;} else {/ * Ghost... Iterate again. * /}. / * Finally remove the selected key. * / if (bestkey) {db = server.db+bestdbid; robj * keyobj = createStringObject (bestkey,sdslen (bestkey)); propagateExpire (db,keyobj,server.lazyfree_lazy_eviction); delta = (long long) zmalloc_used_memory (); latencyStartMonitor (eviction_latency) / / lazily delete if (server.lazyfree_lazy_eviction) dbAsyncDelete (db,keyobj); else / / synchronously delete dbSyncDelete (db,keyobj);.}}.}

Each time a part of the expired key-value pair is selected, and the one that has not been used for the longest time is eliminated each time. If the free memory space is not enough, the process of sampling and deleting will be repeated.

LFU

In addition to the LRU algorithm, Redis introduced the LFU algorithm, which is the least frequently used (Least Frequently Used,LFU) algorithm, in version 4.0.

LRU algorithm: eliminate the least recently used data, it selects the elements to be eliminated according to the time dimension, that is, delete the elements that have not been accessed for the longest time.

LFU algorithm: eliminates the least frequently accessed data. It selects the elements to be eliminated according to the frequency dimension, that is, deletes the elements with the lowest access frequency. If two elements are accessed with the same frequency, the elements that have not been accessed for the longest time are eliminated.

The basic principle of LFU

The LFU (Least Frequently Used) algorithm, that is, the least access algorithm, eliminates data according to the historical frequency of the access cache. The core idea is that "if the data is accessed very few times in the past, then the probability of being accessed in the future will be very low."

It selects the elements to be eliminated according to the frequency dimension, that is, to delete the elements with the lowest access frequency. If two elements are accessed with the same frequency, the elements that have not been accessed for the longest time are eliminated. In other words, when LFU is eliminated, it will select two dimensions, first compare the frequency, and select the element with the least access frequency; if the frequency is the same, the oldest element will be eliminated according to the time dimension.

For the implementation of LUF, please refer to the detailed explanation of LFU implementation.

Take a look at the implementation of the LFU algorithm in Redis

1. Record and update the access frequency of key-value pairs

When analyzing LRU above, we talked about how redisObject,Redis uses a redisObject structure to hold pointers to values for values in each key-value pair in the source code. The lru:LRU_BITS field records the time and counter required by the LRU algorithm and the LFU algorithm.

/ / https://github.com/redis/redis/blob/6.2/src/server.h#L673typedef struct redisObject {unsigned type:4; unsigned encoding:4; / / here save / / LRU time (relative to the global LRU clock) / / LFU data (lower 8 bits as a counter, use the high 16 bits in 24 bits to record the access timestamp) unsigned lru:LRU_BITS / * LRU time (relative to global lru_clock) or * LFU data (least significant 8 bits frequency * and most significant 16 bits access time). * / int refcount; void * ptr;} robj

When a key-value pair is created, if the LFU algorithm is used, the access timestamp and access times of the key-value pair recorded in the lru field are updated.

/ / https://github.com/redis/redis/blob/6.2/src/object.c#L41 robj * createObject (int type, void * ptr) {robj * o = zmalloc (sizeof (* o)); o-> type = type; o-> encoding = OBJ_ENCODING_RAW; o-> ptr = ptr; o-> refcount = 1 / / if the cache replacement policy is LFU,lru variable including UNIX timestamp in minutes and access times 5 if (server.maxmemory_policy & MAXMEMORY_FLAG_LFU) {o-> lru = (LFUGetTimeInMinutes () dict,key- > ptr); if (de) {robj * val = dictGetVal (de) / / when using the LFU algorithm, call the updateLFU function to update the access frequency if (! hasActiveChildProcess () & &! (flags & LOOKUP_NOTOUCH)) {if (server.maxmemory_policy & MAXMEMORY_FLAG_LFU) {updateLFU (val);} else {/ / the time to update lru using LRU val- > lru = LRU_CLOCK () }} return val;} else {return NULL;}} / https://github.com/redis/redis/blob/6.2/src/db.c#L54/* updates LFU when accessing an object. * first, if the decrement time is reached, the decrement counter. * then logarithmically increments the counter and updates the access time. * / void updateLFU (robj * val) {unsigned long counter = LFUDecrAndReturn (val); counter = LFULogIncr (counter); val- > lru = (LFUGetTimeInMinutes () > 8; / / get current visits unsigned long counter = o-> lru & 255; unsigned long num_periods = server.lfu_decay_time? LFUTimeElapsed (ldt) / server.lfu_decay_time: 0; if (num_periods) / / if the attenuation is less than the current number of visits, then the number of attenuated visits is the current number of visits minus the attenuation Otherwise, the number of attenuated visits is equal to 0 counter = (num_periods > counter)? 0: counter-num_periods; / / if the attenuation is 0, the original number of visits is returned return counter;}

As you can see in the above code, when accessing a key-value pair, the number of visits is first attenuated?

The LFU algorithm eliminates data according to the frequency of access, not just the number of visits. The longer the access interval, the lower the access frequency.

Because Redis uses the number of visits in the lru variable to represent the access frequency, the number of visits is attenuated by the LFUDecrAndReturn function each time the access frequency of the key-value pair is updated.

The LFUDecrAndReturn function calls the LFUTimeElapsed function (in the evict.c file) to calculate the elapsed time since the last access to the key-value pair. This length of time is also calculated with an accuracy of 1 minute. With the length of time from the last visit, the LFUDecrAndReturn function divides the length of time by the value of lfu_decay_time and takes the result as the attenuation of the number of visits.

The value of the lfu_decay_time variable is determined by the configuration item lfu-decay-time in the redis.conf file. When Redis initializes, it sets the value of the lfu_decay_time variable through the initServerConfig function, and the default value is 1. Therefore, by default, the attenuation of the number of visits is equal to the current number of minutes from the last access.

After attenuation, let's take a look at how to update the number of visits.

/ / https://github.com/redis/redis/blob/6.2/src/evict.c#L298uint8_t LFULogIncr (uint8_t counter) {/ / equal to 255. update if (counter = = 255) return 255; / / calculate a random number double r = (double) rand () / RAND_MAX; / / calculate the difference between the current number of visits and the initial value double baseval = counter-LFU_INIT_VAL If (baseval

< 0) baseval = 0; // 根据baseval和lfu_log_factor计算阈值p double p = 1.0/(baseval*server.lfu_log_factor+1); // 概率值小于阀值 if (r < p) counter++; return counter;} 如果当前访问次数小于255的时候,每次 LFULogIncr 函数会计算一个阈值 p,以及一个取值为 0 到 1 之间的随机概率值 r。如果概率 r 小于阈值 p,那么 LFULogIncr 函数才会将访问次数加 1。否则的话,LFULogIncr 函数会返回当前的访问次数,不做更新。 这样按照一定的概率增加访问频率,避免了访问次数过大,8 bits 计数器对访问次数的影响。 2、使用 LFU 算法淘汰数据 LFU 处理数据淘汰和 LRU 方式差不多,这里回顾下 LRU 处理数据淘汰的过程 1、调用 getMaxmemoryState 函数计算待释放的内存空间; 2、调用 evictionPoolPopulate 函数随机采样键值对,并插入到待淘汰集合 EvictionPoolLRU 中; 3、遍历待淘汰集合 EvictionPoolLRU,选择实际被淘汰数据,并删除。 不同的是,LFU 算法在淘汰数据时,在第二步的 evictionPoolPopulate 函数中,使用了不同的方法来计算每个待淘汰键值对的空闲时间。 LRU 中 idle 记录的是它距离上次访问的空闲时间。 LFU 中 idle 记录的是用 255 减去键值对的访问次数。也就是键值对访问次数越大,它的 idle 值就越小,反之 idle 值越大。 if (server.maxmemory_policy & MAXMEMORY_FLAG_LRU) { idle = estimateObjectIdleTime(o); } else if (server.maxmemory_policy & MAXMEMORY_FLAG_LFU) { idle = 255-LFUDecrAndReturn(o); } freeMemoryIfNeeded 函数按照 idle 值从大到小,遍历 EvictionPoolLRU 数组,选择实际被淘汰的键值对时,它就能选出访问次数小的键值对了,也就是把访问频率低的键值对淘汰出去。 具体的源码上面 LRU 已经展示了,这里不在啰嗦了。 为什么数据删除后内存占用还是很高 Redis 中的内存可能会遇到这样一种情况,虽然进行了数据的删除,据量已经不大了,但是使用 top 命令,发现 Redis 还是会占用大量的内存 因为,当数据删除后,Redis 释放的内存空间会由内存分配器管理,并不会立即返回给操作系统。所以,操作系统仍然会记录着给 Redis 分配了大量内存。 但是这些内存可能是不连续的,对于不连续的小内存块,虽然是空闲内存,但是 Redis 缺不能拿来用,会造成资源的浪费。 为什么会产生内存碎片呢? 内存碎片如何产生 1、内存分配器的分配策略 内存分配器对于内存的分配,一般是按固定大小来分配内存,而不是完全按照应用程序申请的内存空间大小给程序分配。 Redis 可以使用 libc、jemalloc、tcmalloc 多种内存分配器来分配内存,默认使用 jemalloc。 jemalloc 的分配策略之一,是按照一系列固定的大小划分内存空间,例如8字节、16字节、32字节、48字节,…, 2KB、4KB、8KB等。当程序申请的内存最接近某个固定值时,jemalloc会给它分配相应大小的空间。 这样的分配方式本身是为了减少分配次数。例如,Redis申请一个20字节的空间保存数据,jemalloc 就会分配 32 字节,此时,如果应用还要写入 10 字节的数据,Redis 就不用再向操作系统申请空间了,因为刚才分配的32字节已经够用了,这就避免了一次分配操作。 减少了内存分配的次数,缺点就是增加了产生内存碎片的可能。 2、键值对的删除更改操作 Redis 中键值对会被修改和删除,这会导致空间的扩容和释放,一方面,如果修改后的键值对变大或变小了,就需要占用额外的空间或者释放不用的空间。另一方面,删除的键值对就不再需要内存空间了,此时,就会把空间释放出来,形成空闲空间。 Redis中的值删除的时候,并没有把内存直接释放,交还给操作系统,而是交给了Redis内部有内存管理器。 Redis 中申请内存的时候,也是先看自己的内存管理器中是否有足够的内存可用。Redis的这种机制,提高了内存的使用率,但是会使 Redis 中有部分自己没在用,却不释放的内存,导致了内存碎片的发生。 碎片率的意义 mem_fragmentation_ratio的不同值,说明不同的情况。 大于1:说明内存有碎片,一般在1到1.5之间是正常的; 大于1.5:说明内存碎片率比较大,需要考虑是否要进行内存碎片清理,要引起重视; 小于1:说明已经开始使用交换内存,也就是使用硬盘了,正常的内存不够用了,需要考虑是否要进行内存的扩容。 可以使用 INFO memory 命令查看内存碎片率 127.0.0.1:6379>

INFO memory# Memoryused_memory:865672used_memory_human:845.38Kused_memory_rss:8085504used_memory_rss_human:7.71Mused_memory_peak:865672used_memory_peak_human:845.38Kused_memory_peak_perc:100.01%used_memory_overhead:819226used_memory_startup:802056used_memory_dataset:46446used_memory_dataset_perc:73.01%allocator_allocated:995552allocator_active:1282048allocator_resident:3690496total_system_memory:1929736192total_system_memory_human:1.80Gused_memory _ lua:37888used_memory_lua_human:37.00Kused_memory_scripts:0used_memory_scripts_human:0Bnumber_of_cached_scripts:0maxmemory:0maxmemory_human:0Bmaxmemory_policy:noevictionallocator_frag_ratio:1.29allocator_frag_bytes:286496allocator_rss_ratio:2.88allocator_rss_bytes:2408448rss_overhead_ratio:2.19rss_overhead_bytes:4395008mem_fragmentation_ratio:9.80mem_fragmentation_bytes:7260856mem_not_counted_for_evict:0mem_replication_backlog:0mem_clients _ slaves:0mem_clients_normal:16986mem_aof_buffer:0mem_allocator:jemalloc-5.1.0active_defrag_running:0lazyfree_pending_objects:0

Mem_fragmentation_ratio represents the memory fragmentation rate.

Mem_fragmentation_ratio = used_memory_rss/ used_memory

Used_memory_rss is the physical memory space that the operating system actually allocates to Redis, which contains fragments, while used_memory is the space that Redis actually applies for to save data.

How to clean up memory fragments

After the Redis server is restarted, Redis will return the useless memory to the operating system, and the fragmentation rate will be reduced.

Version 4. 0 of Redis introduces automatic memory fragmentation.

Automatic defragmentation, memory will be cleaned automatically as long as the following configuration is set.

Config set activedefrag yes

However, for the specific time to start, it is controlled by the following two parameters, as long as one of them is not satisfied, the automatic cleaning will be stopped.

Active-defrag-ignore-bytes 100mb: starts cleaning when the number of bytes of memory fragmentation reaches 100MB

Active-defrag-threshold-lower 10: when memory fragmentation accounts for 10% of the total space allocated to Redis by the operating system, cleanup begins.

In order to ensure the impact on CPU in the cleaning process, two parameters are set to control the upper and lower limits of the proportion of CPU time occupied by the cleaning operation, which not only ensures that the cleaning work can be carried out normally, but also avoids the reduction of Redis performance.

Active-defrag-cycle-min 25: indicates that the proportion of CPU time used in the automatic cleaning process is not less than 25% to ensure that the cleaning can be carried out normally.

Active-defrag-cycle-max 75: indicates that the proportion of CPU time used in the automatic cleaning process is not more than 75%. Once the time is exceeded, the cleaning will be stopped, so as to avoid a large number of memory copies blocking the Redis during cleaning, resulting in increased response latency. 、

If you are not satisfied with the results of automatic cleanup, you can use the following command to directly try manual debris cleanup:

Memory purge thank you for reading, the above is "what is the expired deletion strategy and memory elimination mechanism in Redis". After the study of this article, I believe you have a deeper understanding of what the expired deletion strategy and memory elimination mechanism in Redis is, and the specific use needs to be verified in practice. Here is, the editor will push for you more related knowledge points of the article, welcome to follow!

Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.

Views: 0

*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.

Share To

Development

Wechat

© 2024 shulou.com SLNews company. All rights reserved.

12
Report