Analysis of dictionary, hash algorithm and ReHash principle in Redis 07/06 Update SLTechnology News&Howtos

Analysis of dictionary, hash algorithm and ReHash principle in Redis

2025-07-06 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Database >

Shulou(Shulou.com)05/31 Report--

This article introduces the knowledge of "analyzing dictionaries, hashing algorithms and ReHash principles in Redis". In the operation of actual cases, many people will encounter such a dilemma, so let the editor lead you to learn how to deal with these situations. I hope you can read it carefully and be able to achieve something!

Dictionaries in Redis are widely used to implement various functions of Redis, including databases and hash keys.

The underlying implementation of the dictionary is a hash table, and each dictionary has two hash tables, one for normal use and the other for rehash expansion.

The structure of the dictionary defines the typedef struct dict {/ / type-specific function dictType * type; / / private data void * privdata; / / hash table, the index subscript recorded with two elements dictht ht [2] / / rehash. When there is no rehash, the value is-1 int rehashidx;} dict.

= = when doing rehash, rehashidx migrates the entry data of each index.

Where the structure of the hash table dictht is defined as:

Typedef struct dictht {/ / Hash table array dictEntry * * table; / / Hash table size unsigned long size; / / Hash table size mask, which is used to calculate the index value unsigned long sizenask; / / the number of nodes in the hash table unsigned long uesd;} dictht

Where table is an array, each element of the array points to a pointer to the dictEntry type, and the dictEntry type holds a key-value pair.

It can also be seen here that the nodes of the hash table are linked to solve the hash conflict problem, that is, the chain address method.

Hash conflict and Hash algorithm

For quick access from key to value, Redis uses a hash table to hold all key-value pairs. The key corresponds to the Key set by Redis, and the value corresponds not to the value itself, but to the pointer to the specific value. The biggest advantage of using a hash table is that you can find key-value pairs quickly with O (1) time complexity. But since it is a hash table, there must be the problem of hash conflict.

A hash conflict means that when the hash value of two key and the hash bucket are calculated, they fall on the same hash bucket.

Redis solves hash conflicts by using chained hashes, that is, zippers. When multiple elements point to the same hash bucket, a linked list is used to store the corresponding data in the same hash bucket, and they are connected by pointers in turn.

Hash algorithm

When adding a new key-value pair to the dictionary, the program needs to first calculate the hash value and index value according to the key-value pair, and then put the hash table node containing the new key-value pair on the specified index of the hash table array according to the index value.

ReHash process

There is a load factor (load factor) in the hash table to control the number of key-value pairs saved in the hash table. This requires a rehash (re-hashing) operation to complete. Among them, the calculation formula of load factor is:

/ / load factor = number of nodes saved in the hash table / hash table size load_factor = ht [0] .used / ht [0] .size

The conditions for the expansion and contraction of the hash table are as follows:

The server is not currently executing BGSAVE or BGREWRITEAOF commands, and the load factor of the hash table is greater than or equal to 1

The server is currently executing BGSAVE or BGREWRITEAOF commands, and the load factor of the hash table is greater than or equal to 5

If one of the above conditions is met, the rehash process will be executed.

If the server is executing BGSAVE or BGREWRITEAOF, Redis creates a child of the current server process

The process of rehash is roughly divided into three steps:

Allocate more space to hash table 2, for example, twice as much as current hash table 1

Remap and copy the data from hash table 1 to hash table 2

Free up space in hash table 1

The size of the space allocated in the first step is determined by the current type of rehash operation and the number of key-value pairs in the current hash table.

When an extension operation is performed, the space allocated is the first 2 ^ n value greater than or equal to (the number of key-value pairs of the hash table * 2).

Assuming that the current number of key-value pairs is 4, then 4 * 2 = 8, because 8 is exactly equal to 2 ^ 3, that is, just equal to the first value equal to 2 ^ n, so the expansion space is 8.

If a shrink operation is performed, the allocated space size is the first 2 ^ n value greater than or equal to (the number of key-value pairs in the hash table)

Progressive reHash

When there are a large number of hash tables, if the data is copied all at once, it is likely to have an impact on the server. So Redis is divided into multiple rehash, that is, progressive rehash.

To put it simply, in the second step, Redis still processes client requests normally, starting from the first index location in hash table 1, copying all the entries elements in that index position into hash table 2; and then copying the entries of the next index location on the next request.

In this way, the overhead of a large number of copies at one time is cleverly allocated to the process of processing requests many times, which avoids time-consuming operations and ensures fast access to data.

Hash table operations during rehash

When performing progressive rehash operations, dictionary deletion, lookup, update, and so on are performed in two hash tables. For example, if you want to find a key in a dictionary, you will first query the original table, and if you can't find it, you will query the new table.

On the other hand, the addition of the dictionary will only be saved in the new table.

This is the end of "analyzing dictionaries, hashing algorithms and ReHash principles in Redis". Thank you for reading. If you want to know more about the industry, you can follow the website, the editor will output more high-quality practical articles for you!

Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.

*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.