2025-04-01 Update From: SLTechnology News&Howtos shulou
This article introduces the six underlying data structures of Redis. I hope you read it carefully and come away with something useful!
1. Simple dynamic string (SDS)
Although Redis is written in C, it does not directly use C's traditional string representation (a character array terminated by the null character '\0'). Instead, it builds an abstract type called the simple dynamic string (simple dynamic string, SDS) and uses SDS as its default string representation.
In Redis, C strings are used only as string literals (string literal) in places where a string value never needs to be modified, such as log messages.
Definition of SDS:
```c
struct sdshdr {
    // number of bytes used in the buf array,
    // equal to the length of the string held by the SDS
    int len;
    // number of unused bytes in the buf array
    int free;
    // byte array used to hold the string
    char buf[];
};
```
① The free property is 0, indicating that this SDS has no unused allocated space.
② The len property is 5, indicating that this SDS holds a five-byte string.
③ The buf attribute is an array of type char: the first five bytes hold the characters 'R', 'e', 'd', 'i', 's', and the last byte holds the null character '\0'.
(SDS follows the C convention that strings end with a null character. The 1 byte that stores this null character is not counted in the SDS len attribute; allocating that extra byte and appending the null character are handled automatically by the SDS functions, completely transparently to SDS users. The benefit of keeping this convention is that SDS can directly reuse some functions from the C string library.)
The differences between SDS and C strings:
(1) Constant-time string length
Because a C string does not record its own length, obtaining the length requires traversing the entire string, counting each character until the terminating null character is reached, which is O(N).
An SDS, by contrast, records its own length in the len property, so obtaining the length of an SDS is only O(1).
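A minimal sketch of this idea (the names `sds_new` and `sds_len` are illustrative, not Redis's real API, and real Redis since 3.2 uses several header sizes rather than this single struct):

```c
#include <stdlib.h>
#include <string.h>

/* Simplified sdshdr mirroring the struct above. */
struct sdshdr {
    int len;    /* bytes used in buf */
    int free;   /* bytes unused in buf */
    char buf[]; /* string data, null-terminated by convention */
};

/* Create an SDS from a C string; the length is recorded up front. */
struct sdshdr *sds_new(const char *init) {
    size_t len = strlen(init);
    struct sdshdr *sh = malloc(sizeof(*sh) + len + 1); /* +1 for '\0' */
    sh->len = (int)len;
    sh->free = 0;
    memcpy(sh->buf, init, len + 1); /* copies the trailing '\0' too */
    return sh;
}

/* O(1): just read the stored length, no traversal needed. */
int sds_len(const struct sdshdr *sh) { return sh->len; }
```

Because buf is still null-terminated, the stored text can be handed directly to C library functions such as printf or strcmp.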
(2) No buffer overflows
In C, concatenating two strings with strcat causes a buffer overflow if the destination buffer is too small. With SDS, any modification first checks, via the recorded len and free attributes, whether there is enough space; if not, the buffer is expanded before the modification is made, so a buffer overflow cannot occur.
(3) Fewer memory reallocations when modifying strings
Because C does not record string lengths, modifying a string generally requires reallocating memory (freeing and re-requesting): growing the string without a reallocation would overflow the buffer, while shrinking it without one would leak memory.
For SDS, thanks to the len and free attributes, two strategies reduce reallocations when a string is modified:
1. Space pre-allocation: when growing a string, more memory is allocated than is actually needed, which reduces the number of reallocations required by a run of consecutive growth operations.
2. Lazy space release: when shortening a string, the freed-up bytes are not immediately returned with a memory reallocation; instead the free attribute records their count for later reuse. (SDS also provides an API to release the unused space manually when needed.)
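The two points above can be sketched together. The growth rule below mirrors the one in Redis's sds.c (double the new length below 1 MB, otherwise add a flat 1 MB); the function names `sds_new`, `sds_make_room`, and `sds_cat` are illustrative simplifications, not the real API:

```c
#include <stdlib.h>
#include <string.h>

#define SDS_MAX_PREALLOC (1024*1024) /* 1 MB threshold, as in sds.c */

struct sdshdr { int len; int free; char buf[]; };

struct sdshdr *sds_new(const char *init) {
    size_t len = strlen(init);
    struct sdshdr *sh = malloc(sizeof(*sh) + len + 1);
    sh->len = (int)len;
    sh->free = 0;
    memcpy(sh->buf, init, len + 1);
    return sh;
}

/* Space pre-allocation: when growing, allocate extra room so that a
 * run of appends does not reallocate every time. */
struct sdshdr *sds_make_room(struct sdshdr *sh, size_t addlen) {
    if ((size_t)sh->free >= addlen) return sh; /* already enough room */
    size_t newlen = (size_t)sh->len + addlen;
    if (newlen < SDS_MAX_PREALLOC) newlen *= 2;
    else newlen += SDS_MAX_PREALLOC;
    sh = realloc(sh, sizeof(*sh) + newlen + 1);
    sh->free = (int)(newlen - (size_t)sh->len);
    return sh;
}

/* Append: check and grow first, then copy -- no buffer overflow. */
struct sdshdr *sds_cat(struct sdshdr *sh, const char *t) {
    size_t tlen = strlen(t);
    sh = sds_make_room(sh, tlen);
    memcpy(sh->buf + sh->len, t, tlen + 1); /* copies the '\0' too */
    sh->len += (int)tlen;
    sh->free -= (int)tlen;
    return sh;
}
```

After one append the SDS already has spare free bytes, so the next small append costs no reallocation at all.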
(4) Binary safety
A C string uses the null character to mark its end, but binary data (such as an image) may contain embedded null bytes, so a C string cannot store it correctly. All SDS APIs treat the contents of buf as raw binary data: SDS does not rely on a null character to detect the end of the string, but on the length recorded in the len attribute.
(5) Compatibility with some C string functions
Although the SDS API is binary-safe, it still follows the convention that C strings end with a null character: the API always terminates the data held by an SDS with a null character and always allocates one extra byte in the buf array to hold it, so an SDS that stores text can reuse some of the functions in the C string library.
2. Linked list
Linked lists are a common data structure built into many high-level programming languages. Since the C language Redis is written in has no built-in linked list, Redis implements its own.
Each linked list node is represented by an adlist.h/listNode structure:
```c
typedef struct listNode {
    // previous node
    struct listNode *prev;
    // next node
    struct listNode *next;
    // node value
    void *value;
} listNode;
```
Multiple listNode structures, linked through their prev and next pointers, form a doubly linked list, as shown in figure 3-1.
Although linked lists can be formed using only multiple listNode structures, it is more convenient to use adlist.h/list to hold linked lists:
```c
typedef struct list {
    // head node
    listNode *head;
    // tail node
    listNode *tail;
    // number of nodes the list contains
    unsigned long len;
    // node value duplication function
    void *(*dup)(void *ptr);
    // node value release function
    void (*free)(void *ptr);
    // node value comparison function
    int (*match)(void *ptr, void *key);
} list;
```
The list structure provides the list with a head pointer head, a tail pointer tail, and a length counter len, while the dup, free, and match members are the type-specific functions needed to implement a polymorphic linked list:
The ① dup function is used to copy the values saved by the linked list node
The ② free function is used to release the values saved by the linked list node
The ③ match function is used to compare whether the value saved by the linked list node is equal to another input value.
Figure 3-2 shows a linked list composed of one list structure and three listNode structures:
The features of Redis's linked list implementation can be summarized as follows:
① Double-ended: each node has prev and next pointers, so getting a node's predecessor or successor is O(1).
② Acyclic: the head node's prev pointer and the tail node's next pointer both point to NULL, so traversal of the list terminates at NULL.
③ Head and tail pointers: through the head and tail pointers of the list structure, getting the head node or the tail node is O(1).
④ Length counter: the len attribute of the list structure counts the nodes the list holds, so getting the number of nodes is O(1).
⑤ Polymorphism: nodes store their values as void* pointers, and type-specific functions can be set for node values through the dup, free, and match attributes of the list structure, so a list can hold values of different types.
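A minimal sketch of the double-ended list idea, assuming simplified helper names `list_create` and `list_add_head` (Redis's real adlist API also carries the dup/free/match members):

```c
#include <stdlib.h>

typedef struct listNode {
    struct listNode *prev;
    struct listNode *next;
    void *value;
} listNode;

typedef struct list {
    listNode *head;
    listNode *tail;
    unsigned long len;
} list;

list *list_create(void) {
    list *l = malloc(sizeof(*l));
    l->head = l->tail = NULL;
    l->len = 0;
    return l;
}

/* Add a node at the head: O(1) thanks to the head pointer. */
list *list_add_head(list *l, void *value) {
    listNode *n = malloc(sizeof(*n));
    n->value = value;
    n->prev = NULL;
    n->next = l->head;
    if (l->head) l->head->prev = n;
    else l->tail = n; /* first node is both head and tail */
    l->head = n;
    l->len++;
    return l;
}
```

Note how the tail node's next stays NULL (acyclic) and the length counter is maintained on every insert, keeping both lookups O(1).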
3. Dictionary
A dictionary, also known as a symbol table, associative array, or map, is an abstract data structure that holds key-value pairs. Each key in a dictionary is unique, and a key can be used to look up or modify its associated value. C has no built-in implementation of this data structure, so the dictionary, too, is built by Redis itself.
Redis's dictionary uses a hash table as the underlying implementation. There can be multiple hash table nodes in a hash table, and each hash table node holds a key-value pair in the dictionary.
Hash table
The hash table used by the Redis dictionary is defined by the dict.h/dictht structure:
```c
typedef struct dictht {
    // hash table array
    dictEntry **table;
    // hash table size
    unsigned long size;
    // size mask, used to compute index values;
    // always equal to size - 1
    unsigned long sizemask;
    // number of nodes already in the hash table
    unsigned long used;
} dictht;
```
Figure 4-1 shows an empty hash table of size 4 (without any key-value pairs).
Hash table node
Hash table nodes are represented by dictEntry structures, and each dictEntry structure holds a key-value pair:
```c
typedef struct dictEntry {
    // key
    void *key;
    // value
    union {
        void *val;
        uint64_t u64;
        int64_t s64;
    } v;
    // pointer to the next hash table node, forming a chain
    struct dictEntry *next;
} dictEntry;
```
The key property holds the key, and the v property holds the value, which can be a pointer, a uint64_t integer, or an int64_t integer.
Note the pointer to the next hash table node. The biggest problem with hash tables is hash collisions; the two standard remedies are open addressing and separate chaining. Redis uses separate chaining: through the next pointer, multiple key-value pairs whose keys hash to the same index are linked together, resolving the collision.
Because the chain of dictEntry nodes has no tail pointer, for speed the program always inserts new nodes at the head of the chain (O(1)), ahead of the existing nodes.
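Head insertion into a bucket's chain can be sketched as follows (`bucket_insert` is an illustrative helper, not Redis's real dictAdd):

```c
#include <stdlib.h>

typedef struct dictEntry {
    void *key;
    void *val;
    struct dictEntry *next; /* chain for colliding keys */
} dictEntry;

/* Head insertion: O(1) because there is no tail pointer to find;
 * the newest entry simply becomes the head of the chain. */
void bucket_insert(dictEntry **table, unsigned long index,
                   void *key, void *val) {
    dictEntry *e = malloc(sizeof(*e));
    e->key = key;
    e->val = val;
    e->next = table[index]; /* old head becomes second */
    table[index] = e;
}
```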
Dictionaries
Dictionaries in Redis are represented by the dict.h/dict structure:
```c
typedef struct dict {
    // type-specific functions
    dictType *type;
    // private data
    void *privdata;
    // hash tables
    dictht ht[2];
    // rehash index; -1 when no rehash is in progress
    int rehashidx; /* rehashing not in progress if rehashidx == -1 */
} dict;
```
The type property and the privdata property are set for different types of key-value pairs to create a polymorphic dictionary:
● The type property is a pointer to a dictType structure; each dictType holds a cluster of functions for operating on key-value pairs of a specific type, and Redis sets different type-specific functions for dictionaries used for different purposes.
● The privdata property holds optional arguments that are passed to those type-specific functions.
```c
typedef struct dictType {
    // hash function
    unsigned int (*hashFunction)(const void *key);
    // key duplication function
    void *(*keyDup)(void *privdata, const void *key);
    // value duplication function
    void *(*valDup)(void *privdata, const void *obj);
    // key comparison function
    int (*keyCompare)(void *privdata, const void *key1, const void *key2);
    // key destructor
    void (*keyDestructor)(void *privdata, void *key);
    // value destructor
    void (*valDestructor)(void *privdata, void *obj);
} dictType;
```
The ht attribute is an array of two items, each of which is a dictht hash table. Normally the dictionary uses only ht[0]; ht[1] is used only while ht[0] is being rehashed.
The other rehash-related attribute is rehashidx: it records the current progress of the rehash, and holds -1 when no rehash is in progress.
Figure 4-3 shows a dictionary in normal state (without rehash):
① hash algorithm: Redis calculates hash and index values as follows:
```
# compute the key's hash using the hash function set in the dictionary
hash = dict->type->hashFunction(key);
# compute the index using the hash and the table's sizemask
# depending on the situation, ht[x] may be ht[0] or ht[1]
index = hash & dict->ht[x].sizemask;
```
② Resolving hash collisions: as introduced above, separate chaining is used; the next pointer in each dictEntry points to the next node with the same index value.
③ Expansion and contraction (rehash): when the hash table holds too many or too few key-value pairs, it must be expanded or shrunk by rehashing (re-hashing). The steps:
1. Allocate space for the dictionary's ht[1] hash table; its size depends on the operation being performed and on the number of key-value pairs currently in ht[0] (the value of ht[0].used):
● For an expansion, the size of ht[1] is the first power of two (2^n) greater than or equal to ht[0].used * 2.
● For a contraction, the size of ht[1] is the first power of two greater than or equal to ht[0].used.
2. Rehash all key-value pairs from ht[0] into ht[1]: recompute each key's hash and index, then place the pair at the computed position in ht[1].
3. When every key-value pair in ht[0] has been migrated to ht[1] (ht[0] has become an empty table), release ht[0], make ht[1] the new ht[0], and create a new blank hash table at ht[1], ready for the next rehash.
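The migration step can be sketched with a toy table whose keys are plain integers (the names `table_create`, `table_add`, and `table_rehash` are illustrative; real Redis hashes arbitrary keys and, as described below, migrates incrementally rather than in one loop):

```c
#include <stdlib.h>

typedef struct entry { unsigned long key; struct entry *next; } entry;

typedef struct table {
    entry **slots;
    unsigned long size;     /* always a power of two */
    unsigned long sizemask; /* size - 1 */
    unsigned long used;
} table;

table *table_create(unsigned long size) {
    table *t = malloc(sizeof(*t));
    t->slots = calloc(size, sizeof(entry *));
    t->size = size;
    t->sizemask = size - 1;
    t->used = 0;
    return t;
}

void table_add(table *t, unsigned long key) {
    unsigned long idx = key & t->sizemask; /* hash & sizemask */
    entry *e = malloc(sizeof(*e));
    e->key = key;
    e->next = t->slots[idx]; /* head insertion */
    t->slots[idx] = e;
    t->used++;
}

/* Rehash: recompute every key's index under the new mask and move it. */
table *table_rehash(table *old, unsigned long newsize) {
    table *t = table_create(newsize);
    for (unsigned long i = 0; i < old->size; i++)
        for (entry *e = old->slots[i]; e; ) {
            entry *next = e->next;
            unsigned long idx = e->key & t->sizemask;
            e->next = t->slots[idx];
            t->slots[idx] = e;
            t->used++;
            e = next;
        }
    free(old->slots);
    free(old);
    return t;
}
```

Keys 5 and 9 collide in a table of size 4 (both map to index 1), but after rehashing to size 8 they land in separate slots.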
④ Conditions that trigger expansion and contraction:
Expansion starts when either of the following holds:
1. The server is not currently executing a BGSAVE or BGREWRITEAOF command, and the load factor is greater than or equal to 1.
2. The server is currently executing a BGSAVE or BGREWRITEAOF command, and the load factor is greater than or equal to 5.
(Load factor = number of nodes stored in the hash table / hash table size, i.e. ht[0].used / ht[0].size.)
Conversely, when the load factor drops below 0.1, the program automatically starts shrinking the hash table.
⑤ Progressive rehash
What is progressive rehash? The expansion or contraction is not completed in one centralized step, but spread over many small steps. If Redis holds only a few dozen key-value pairs, a rehash finishes instantly; but with millions, tens of millions, or hundreds of millions of pairs, rehashing everything at once would leave Redis unable to serve other operations for a noticeable time. Redis therefore rehashes progressively: while a progressive rehash is under way, deletions, lookups, and updates may touch both hash tables (if a key is not found in the first table, the second is searched), but insertions go only to the new hash table, which guarantees ht[0] eventually empties out.
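The two-table lookup during a progressive rehash can be sketched like this (the struct and function names `pe`, `ht`, `dict_find` are simplified stand-ins for Redis's dictEntry/dictht/dictFind):

```c
#include <stddef.h>

typedef struct pe { unsigned long key; long val; struct pe *next; } pe;
typedef struct ht { pe **slots; unsigned long sizemask; } ht;
typedef struct dict { ht t[2]; int rehashidx; /* -1: no rehash */ } dict;

/* During a progressive rehash a lookup must consult both tables:
 * first ht[0], then ht[1] if the key was not found there. */
pe *dict_find(dict *d, unsigned long key) {
    int tables = (d->rehashidx == -1) ? 1 : 2;
    for (int i = 0; i < tables; i++) {
        pe *e = d->t[i].slots[key & d->t[i].sizemask];
        while (e) {
            if (e->key == key) return e;
            e = e->next;
        }
    }
    return NULL;
}
```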
4. Skip list
A skip list (skiplist) is an ordered data structure that achieves fast node access by maintaining, in each node, multiple pointers to other nodes.
It has the following properties:
1. It consists of multiple layers.
2. Each layer is an ordered linked list and contains at least two nodes: a head node at the front and a nil node at the end.
3. The bottom layer's list contains all the elements.
4. If an element appears in some layer's list, it also appears in every list below that layer (a higher layer's elements are a subset of the elements of the layer below).
5. Each node contains two pointers: one to the next node in the same layer's list, and one to the same node in the layer below.
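A search over the layered structure described above can be sketched with a generic skip list node (this is not Redis's zskiplistNode, which additionally stores spans, a backward pointer, a score, and a member object; `snode`, `sl_node`, and `sl_search` are illustrative names):

```c
#include <stdlib.h>

#define MAX_LEVEL 4

typedef struct snode {
    int value;
    struct snode *forward[MAX_LEVEL]; /* forward[i]: next node on level i */
} snode;

snode *sl_node(int value) {
    snode *n = calloc(1, sizeof(*n)); /* all forward pointers NULL */
    n->value = value;
    return n;
}

/* Search: start at the top level of the header and drop one level
 * whenever the next node on the current level would overshoot. */
snode *sl_search(snode *header, int target) {
    snode *x = header;
    for (int i = MAX_LEVEL - 1; i >= 0; i--)
        while (x->forward[i] && x->forward[i]->value < target)
            x = x->forward[i];
    x = x->forward[0]; /* candidate on the complete bottom level */
    return (x && x->value == target) ? x : NULL;
}
```

Because high levels skip over many bottom-level nodes, the expected search cost is O(log N) rather than the O(N) of a plain linked list.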
Unlike linked lists and dictionaries, which are used widely throughout Redis, skip lists appear in only two places: the implementation of sorted set keys, and as an internal data structure in cluster nodes. Skip lists have no other use in Redis.
Skip list node (zskiplistNode)
The structure contains the following attributes:
① Layers (level): each layer of a node is labeled L1, L2, L3, and so on, with L1 the first layer, L2 the second, etc. Every layer has two attributes: a forward pointer and a span. The forward pointer is used to move toward the tail of the table, while the span records the distance between the current node and the node the forward pointer points to. In the figure, an arrow annotated with a number represents a forward pointer, and that number is its span. When the program traverses from head to tail, it moves along the forward pointers of the layers.
Each time a new skip list node is created, the program randomly generates a value between 1 and 32 as the size of its level array, following a power law (larger values are less likely). This size is the "height" of the node. (The header node's height is therefore 32.)
② Backward pointer: each node has one backward pointer (labeled BW in the figure), which points to the node preceding the current node. The backward pointer is used when the program traverses from tail to head.
③ Score (score): the 1.0, 2.0, and 3.0 in the nodes are the scores they store. In a skip list, nodes are ordered by their scores, smallest first.
④ Member object (obj): the o1, o2, and o3 in the nodes are the member objects they store.
Note that the header node has the same structure as any other node: it also has a backward pointer, a score, and a member object, but these attributes of the header node are unused, so the figure omits them and shows only the header node's layers.
Skip list (zskiplist)
① header: points to the skip list's header node.
② tail: points to the skip list's tail node.
③ level: records the number of layers of the node with the most layers currently in the skip list (the header node's layers are not counted).
④ length: records the length of the skip list, i.e. the number of nodes it currently contains (the header node is not counted).
5. Integer set
The integer set (intset) is one of the underlying implementations of the set key: when a set contains only integer elements and the number of elements is small, Redis uses an intset as the set key's underlying implementation.
An integer set (intset) is an abstract data type that Redis uses to hold integer values. It can store integers of type int16_t, int32_t, or int64_t, and guarantees that the set contains no duplicate elements.
Each intset.h/intset structure represents a collection of integers:
```c
typedef struct intset {
    // encoding
    uint32_t encoding;
    // number of elements the set contains
    uint32_t length;
    // array holding the elements
    int8_t contents[];
} intset;
```
Each element of the integer set is an item in the contents array; the items are kept sorted from smallest to largest, and the array contains no duplicates.
The length property records the number of elements in the set, which is also the length of the contents array.
Note that although contents is declared as an array of int8_t, it does not actually store int8_t values: its true element type is determined by encoding.
① Upgrade (encoding: int16_t -> int32_t -> int64_t)
When a new element's type is wider than the set's current element type, the integer set must be upgraded before the new element can be added. The steps:
1. Expand the underlying array of the integer set according to the new element's type, allocating space for the new element.
2. Convert every existing element of the underlying array to the new element's type and place each converted element in its correct position, keeping the array sorted throughout.
3. Add the new element to the integer set (preserving the sorted order).
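The widening step can be sketched with a toy structure (`toy_intset` and `intset_upgrade16to32` are illustrative names; real Redis works in place on a raw int8_t buffer, converts back-to-front, and also handles int64_t):

```c
#include <stdint.h>
#include <stdlib.h>

typedef struct toy_intset {
    uint32_t encoding; /* element width in bytes: 2 or 4 */
    uint32_t length;
    void *contents;    /* sorted, no duplicates */
} toy_intset;

/* Widen all existing int16_t elements to int32_t, preserving their
 * sorted order; afterwards the caller can insert the new value that
 * forced the upgrade. */
void intset_upgrade16to32(toy_intset *is) {
    int16_t *old = is->contents;
    int32_t *wide = malloc(is->length * sizeof(int32_t));
    for (uint32_t i = 0; i < is->length; i++)
        wide[i] = (int32_t)old[i]; /* value unchanged, width doubled */
    free(old);
    is->contents = wide;
    is->encoding = sizeof(int32_t);
}
```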
Upgrading on demand saves memory: the set always uses the narrowest encoding that can hold its current elements, widening only when it must.
② Downgrade
Integer sets do not support downgrading: once the array has been upgraded, the encoding stays in the upgraded state.
6. Compressed list
The compressed list (ziplist) is one of the underlying implementations of list keys and hash keys.
When a list key contains only a small number of items, each of which is either a small integer or a short string, Redis uses a compressed list as the list key's underlying implementation.
Likewise, when all the keys and values contained in a hash key are small integers or short strings, the compressed list serves as the hash key's underlying implementation.
A compressed list is a sequential (sequential) data structure that Redis developed to save memory; it consists of a series of specially encoded, contiguous memory blocks.
A compressed list can contain any number of nodes (entry), each of which can hold an array of bytes or an integer value.
Each compressed list node consists of three parts: previous_entry_length, encoding, and content.
① previous_entry_length: records the length, in bytes, of the previous node. This field itself occupies either 1 byte or 5 bytes: if the previous node is shorter than 254 bytes, a single byte suffices to store its length; if the previous node is 254 bytes or longer, the field's first byte is set to 254 and the following four bytes store the previous node's length. Using this field, the program can subtract the previous node's length from the current node's address to obtain the previous node's starting address, which makes it possible to traverse the compressed list from tail to head while wasting very little memory.
② encoding: records the type and length of the data held in the node's content area. There are two encoding families, one for byte arrays and one for integers, and the encoding field itself is 1, 2, or 5 bytes long.
③ content: holds the node's value, whose type and length are determined by encoding.
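The previous_entry_length rule above can be sketched as an encoder/decoder pair (`encode_prevlen` and `decode_prevlen` are illustrative names; in real ziplists 255/0xFF is reserved as the end-of-list marker, which is why 254 is the escape byte):

```c
#include <stdint.h>
#include <stddef.h>

/* Encode a previous-entry length: lengths below 254 fit in one byte;
 * otherwise write the marker byte 254 followed by a 4-byte length.
 * Returns the number of bytes written (1 or 5). */
size_t encode_prevlen(uint8_t *p, uint32_t prevlen) {
    if (prevlen < 254) {
        p[0] = (uint8_t)prevlen;
        return 1;
    }
    p[0] = 254;                      /* escape marker */
    p[1] = prevlen & 0xff;           /* little-endian 4-byte length */
    p[2] = (prevlen >> 8) & 0xff;
    p[3] = (prevlen >> 16) & 0xff;
    p[4] = (prevlen >> 24) & 0xff;
    return 5;
}

/* Decode: when traversing tail to head, current position minus the
 * decoded prevlen gives the start of the previous entry. */
uint32_t decode_prevlen(const uint8_t *p, size_t *lensize) {
    if (p[0] < 254) { *lensize = 1; return p[0]; }
    *lensize = 5;
    return (uint32_t)p[1] | ((uint32_t)p[2] << 8) |
           ((uint32_t)p[3] << 16) | ((uint32_t)p[4] << 24);
}
```

The 1-vs-5-byte split is exactly what makes cascading updates possible, as the next section explains.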
Cascading update problem
If an insertion or deletion pushes a node's length across the 254-byte boundary, the next node's previous_entry_length field must grow from 1 byte to 5 bytes, which can in turn push that node across the boundary, and so on down the list. In the worst case, a cascading update requires N space reallocations on the compressed list, and each reallocation is itself O(N) in the worst case, so the worst-case complexity of cascading updates is O(N^2).
This concludes "What are the six underlying data structures of Redis". Thank you for reading!