Detailed explanation of data structure and coding in redis 04/25 Update SLTechnology News&Howtos

Detailed explanation of data structure and coding in redis

2025-04-25 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Database >

Shulou(Shulou.com)06/01 Report--

Data structure and encoding in redis:

Background:

1 > redis internally uses the redisObject structure to define the stored value object. 2 > each type has at least two internal encodings, and Redis decides which encoding to implement based on the type and length of the current value. 3 > Encoding type conversion is completed automatically when Redis writes data. The conversion process is irreversible, and the conversion rules can only be converted from small memory encoding to large memory encoding.

Source code:

Value object redisObject:

Typedef struct redisObject {

Unsigned type:4; / * object type * /

Unsigned encoding:4; / * Internal coding * /

Unsigned lru:LRU_BITS; / * lru time (relative to server.lruclock) * /

Int refcount; / * reference counter, based on which the memory recovery mechanism is implemented * /

Void * ptr; / * if you want to store an integer value, store the data directly, otherwise it means a pointer to the data * /

} robj

Type type:

Description: view the type of current key: type key

# define OBJ_STRING 0 / * string object * /

# define OBJ_LIST 1 / * list object * /

# define OBJ_SET 2 / * Collection object * /

# define OBJ_ZSET 3 / * ordered collection objects * /

# define OBJ_HASH 4 / * Hash object * /

Coding encoding

Description: view the encoding of the current key: object encoding key

# define OBJ_ENCODING_RAW 0 / * Raw representation simple dynamic string * /

# define OBJ_ENCODING_INT 1 / * Encoded as integer long long type integer * /

# define OBJ_ENCODING_HT 2 / * Encoded as hash table dictionary * /

# define OBJ_ENCODING_ZIPMAP 3 / * Encoded as zipmap Compression map*/

# define OBJ_ENCODING_LINKEDLIST 4 / * Encoded as regular linked list double-ended linked list * /

# define OBJ_ENCODING_ZIPLIST 5 / * Encoded as ziplist compressed list * /

# define OBJ_ENCODING_INTSET 6 / * set of Encoded as intset integers * /

# define OBJ_ENCODING_SKIPLIST 7 / * Encoded as skiplist Jump Table * /

# define OBJ_ENCODING_EMBSTR 8 / * simple dynamic string encoded by Embedded sds string encoding embstr * /

# define OBJ_ENCODING_QUICKLIST 9 / * Fast table based on compressed list with two ends * /

Last visited time lru:

Concept: record when the object was last accessed.

Description:

1 > check the idle time of the current key (this command does not update the lru field); object idletime key. You can use scan + object idletime key to collect data that has not been accessed for a long time, and then clean it manually.

2 > when maxmemory, maxmemory-policy=volatile-lru or allkeys-lru are configured, if the memory exceeds the upper limit (maxmemory), the data that has not been accessed for a long time will be reclaimed first, thus the memory will be reclaimed.

Reference counter refcount:

Concept: record the number of times the current object is referenced, and when refcount=0, you can safely reclaim the current object space.

Description: get the current object reference: object refcount key

The encoding corresponding to the type:

String:

Int: a string that holds the integer value.

Embstr: a short string (no more than 44 bytes in size) that holds characters.

Raw: a long string (no more than 44 bytes in size) that holds characters.

Comparison of embstr and raw:

Raw calls the memory allocation function twice, and of course it needs to be released twice when it is released.

Embstr calls the memory allocation function once, allocates a continuous piece of memory, and only needs to release it once.

List (list):

Compressed list (ziplist):

Structure: all data are linear continuous memory structure (roughly comparable array), the purpose is to reduce the footprint of memory and pursue the balance of space and time.

1 > join and leave the team with O (1) time complexity.

2 > read and write operations involve complex pointer movement, and the worst time complexity is O (N2), so it is not easy to have too many elements in the list.

3 > the new deletion operation involves memory reallocation, which increases the complexity of the operation.

Advantages: it takes up less memory and occupies a continuous piece of memory, so the loading speed is relatively faster.

Disadvantages: when the number of elements is large, the time to access elements is longer.

Application:

Suitable for storing small objects and data of limited length (even if the complexity of O (N2) is not too large).

When the number of elements is less than list-max-ziplist-entries (default 512) and all element values are smaller than list-max-ziplist-value (default 64 bytes), ziplist is used as the internal implementation of the list.

Double-ended linked list (linkedlist):

Advantages: when there are a large number of elements, the time to access elements is faster than that of compressed lists.

Disadvantages: because it is a two-way linked list, it maintains the structure of front pointer, post pointer and so on, which takes up more memory, and the memory is not continuous, so it is easy to produce memory fragments.

Description: when the conditions of ziplist cannot be met, use linkedlist as the internal implementation of the list.

Application: when there are more elements in the list object, the compressed list will be transformed into a double-ended linked list that is more suitable for storing a large number of elements.

Note: only small memory encoding can be converted to large memory encoding. (if the conversion of data to compression coding consumes CPU when elements are added or deleted frequently, the loss outweighs the gain.)

Quick list (quicklist):

Structure: a two-way linked list, each node of the linked list is a ziplist, so quicklist combines the advantages of two-way linked list and compressed list.

Starting with Redis3.2, the list is encoded in quicklist.

Hash (hash):

Compressed list (ziplist):

Application: when the number of elements is less than hash-max-ziplist-entries (default 512) and the size of all elements value is less than hash-max-ziplist-value (default 64 bytes), ziplist is used as the internal implementation of hash.

Hash table (hashtable):

Advantages: read and write time complexity O (1)

Disadvantages: take up a lot of memory.

Application: when the conditions of ziplist cannot be met, hashtable is implemented as an internal hash.

Hash algorithm: similar to the traditional hash algorithm, it calculates the position in the hash table according to key, uses a single linked list to resolve the conflict, expands when the loading factor is reached, and then causes a heavy hash.

Rehash: use incremental heavy hash:

Concept: during capacity expansion, all key will not be rehash at once, but rehash operations of key will be delayed to other operations (hash table lookup, update, delete).

Advantage: avoid the problem that the service is temporarily unresponsive due to a large number of key performing rehash operations at the same time.

Procedure: in the incremental rehash process, two hash tables are used:

Look up: first from the old table, and then from the new table, in addition to some key rehash operations.

New: new key-value pairs are added to the new table.

Collection (set):

Set of integers (intset):

Structure: an ordered, non-repeating set of integers.

1 > search time complexity is O (logn)

2 > insert time complexity is O (n)

Advantages: the memory consumption is much smaller than that of hashtable

Application: when all elements are integers and the number of elements is less than set-max-intset-entries (default 512), use intset as the internal implementation of the collection.

Hash table (hashtable): use hashtable as the internal implementation of the collection when the conditions of intset cannot be met.

Ordered set (zset):

Description: redis sets a score for each element in an ordered set as the basis for sorting.

Compressed list (ziplist):

Application: when the number of elements is less than zset-max-ziplist-entries (default 128) and the value of each element is less than zset-max-ziplist-value (default 64 bytes), ziplist is used as the internal implementation of the ordered set.

Jump table (skiplist):

Structure: the jump table achieves fast access by maintaining multiple pointers to other nodes in each node (based on layer, span, etc.).

The average search time complexity is O (logn) and the worst O (n).

Application: when the ziplist condition is not met, skiplist is used as the internal implementation.

Memory optimization:

Scenario: there is a large amount of data with relatively small key and value, so how to save memory in redis.

Principle: reduce memory consumption by significantly reducing the number of key.

Implementation: through grouping, the client maps a large amount of key to a set of hash objects according to a certain policy. Because the value is small, objects of hash type will use ziplist encoding with less memory.

Eg: if there are 1 million keys, you can map to 1000 hash, and each hash holds 1000 elements.

These are the details of the data structure and coding in redis. For more information about the data structure and coding in redis, please pay attention to other related articles!

Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.

*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.