Explore the basic data structure of Redis design and implementation of 1:Redis 07/09 Update SLTechnology News&Howtos

Explore the basic data structure of Redis design and implementation of 1:Redis

2025-07-09 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Database >

Shulou(Shulou.com)06/01 Report--

This article is from the Internet.

This series of articles will be sorted out in my Java interview Guide warehouse on GitHub. Please check out more wonderful content in my warehouse.

Https://github.com/h3pl/Java-Tutorial

Have some trouble with Star if you like.

The article was first posted on my personal blog:

Www.how2playlife.com

This article is one of "exploring the Design and implementation of Java" on the official account of Wechat [Redis Technology]. Part of the content of this article comes from the network. In order to explain the topic of this article clearly and thoroughly, and integrate a lot of technical blog content that I think is good, quote some good blog articles, if there is any infringement, please contact the author.

This series of blog posts will show you how to start to advanced, the basic usage of Redis, the basic data structure of Redis, and some advanced usage. At the same time, you also need to further understand the underlying data structure of Redis. Then, it will also bring Redis master-slave replication, clustering, distributed locks and other related content, as well as some usage and precautions as a cache. So that you can have a more complete understanding of the whole Redis-related technology system and form your own knowledge framework.

If you have any suggestions or questions about this series of articles, you can also follow the official account [Java Technology jianghu] to contact the author. You are welcome to participate in the creation and revision of this series of blog posts.

Start learning Redis this week to see how Redis is implemented. So I will write a series of articles about Redis. This article is about the basic data of Redis. By reading this article, you can understand:

Dynamic string (SDS) linked list dictionary

How to implement the three data structures Redis.

SDS

SDS (Simple Dynamic String) is the most basic data structure of Redis. Literally translated is a "simple dynamic string". Redis implements a dynamic string itself, rather than directly using strings in the C language.

Data structure of sds:

Length of occupied space in struct sdshdr {/ / buf length of remaining available space in int len; / / buf int free; / / data space char buf [];}

So a SDS is shown below:

So we see that sds contains three parameters. The length of buf, the remaining length of len,buf, and buf.

Why is it so designed?

You can get the string length directly.

In C language, to get the length of a string, you need to traverse the string with a pointer, and the time complexity is O (n), while the length of SDS is O (1) directly from len.

Eliminate buffer overflows.

Since the C language does not record the length of the string, if you increase the length of a character, it may overflow if you do not pay attention, overwriting the data next to this character. For SDS, increasing the length of the string requires verifying the length of the free. If the free is not enough, the entire buf will be expanded to prevent overflow.

Reduce the re-allocation of memory caused by modifying the string length.

As a high-performance in-memory database, redis needs higher corresponding speed. Strings are also frequently modified with a high probability. SDS removes the relationship between the length of the string and the length of the underlying buf by using the parameter unused space. The length of the buf is not the length of the string either. Based on this sub-design SDS realizes the pre-allocation of space and the release of inertia.

Pre-allocation

If the SDS is modified, if len is less than 1MB, then len = 2 * len + 1byte. This 1 is used to save empty bytes.

If the modified SDS len is greater than 1MB, then len = 1MB + len + 1byte. Inert release

If you shorten the string length of SDS, redis does not immediately reduce the memory occupied by SDS. Just increase the length of the free. At the same time, API is provided to the outside. When you really need to release it, re-reduce the memory occupied by SDS.

Binary security.

A string in C is marked with "\ 0" as the end of the string. SDS marks the end of the string with the length of the len. So SDS can store any binary stream other than a string. Because it is possible that a binary stream contains a "\ 0" in the stream, the string ends prematurely. In other words, SDS does not rely on "\ 0" as the basis for termination.

Compatible with C language

SDS routinely uses "\ 0" as the end of the management. Some common C language string API can also be used.

Linked list

There is no linked list data structure in C language, so Redis implements one on its own. The linked list in Redis is:

Typedef struct listNode {/ / Front node struct listNode * prev; / / Post node struct listNode * next; / / value of the node void * value;} listNode

A very typical data structure of two-way linked lists.

At the same time, functions for the two-way linked list are provided as follows:

/ * * double-ended linked list iterator * / typedef struct listIter {/ / current iterative node listNode * next; / / iterative direction int direction;} listIter;/* * double-ended linked list structure * / typedef struct list {/ / header node listNode * head; / / footer node listNode * tail; / / Node value replication function void * (* dup) (void * ptr) / / Node value release function void (* free) (void * ptr); / / Node value comparison function int (* match) (void * ptr, void * key); / / number of nodes included in the linked list unsigned long len;} list

The structure of the linked list is simple, and the data structure is as follows:

Summarize the nature:

Two-way linked list, a node to find the previous or next node time complexity O (1). List records head and tail, and the time complexity of finding head and tail is O (1). Gets the length of the linked list len time complexity O (1). Dictionaries

The dictionary data structure is very similar to Hashmap in java.

Redis's dictionary consists of three basic data structures. The lowest unit is the hash table node. The structure is as follows:

Typedef struct dictEntry {/ / key void * key; / / value union {void * val; uint64_t u64; int64_t s64;} v; / / point to the next hash table node to form a linked list struct dictEntry * next;} dictEntry

In fact, a hash table node is a node of a single list. Save the pointer to the next node for a while. Key is the key of the node, and v is the value of this node. This v can be either a pointer or a uint64_t or int64_t integer. * next points to the next node.

Link the nodes through an array of hash tables:

Typedef struct dictht {

/ / Hash table array dictEntry * * table; / / Hash table size unsigned long size; / / Hash table size mask, which is used to calculate the index value / / always equal to size-1 unsigned long sizemask; / / the number of nodes already in the hash table unsigned long used;} dictht

Dictht

Through the illustration, we observe:

In fact, if you know the basic data structure of java, you will find that this data structure is very similar to HashMap in java, that is, the structure of array plus linked list.

The data structure of the dictionary:

Typedef struct dict {/ / type-specific function dictType * type; / / private data void * privdata; / / hash table dictht ht [2]; / / rehash index / / when rehash is not in progress, the value is-1 int rehashidx; / * rehashing not in progress if rehashidx = =-1 * / the number of security iterators currently running int iterators / * number of iterators currently running * /} dict

DictType is a set of methods and the code is as follows:

/ * * Dictionary type specific function * / typedef struct dictType {/ / function to calculate hash value unsigned int (* hashFunction) (const void * key); / / function to copy key void * (* keyDup) (void * privdata, const void * key); / / function to copy value void * (* valDup) (void * privdata, const void * obj) / / function int (* keyCompare) (void * privdata, const void * key1, const void * key2); / / function void (* keyDestructor) (void * privdata, void * key); / / function void (* valDestructor) (void * privdata, void * obj);} dictType

The data structure of the dictionary is as follows:

Here we can see that one dict has two dictht. Generally speaking, only ht [0] is used, and ht [1] will only be used when rehash occurs during capacity expansion.

When we observe or study a hash structure, the first thing we have to consider is how does the dict insert a data?

Let's sort out the logic of inserting data.

Calculates the hash value of Key. Find the location where hash is mapped to the table array.

If the data already has a key. That means there is a hash collision. The newly added node is connected to the next pointer of the previous node as a node in the linked list.

If the key has multiple collisions, the length of the linked list becomes longer and longer. It will slow down the query speed of the dictionary. To maintain a normal load. Redis rehash the dictionary. To increase the length of the table array. So we need to focus on Redis's rehash. The steps are as follows:

The size of ht [1] is allocated according to the data of ht [0] and the type of operation (expand or shrink). Rehash the data of ht [0] to ht [1]. After the rehash is complete, set ht [1] to ht [0] and generate a new ht [1] backup.

Progressive rehash.

In fact, if the dictionary has a large number of key, reaching tens of millions of levels, rehash will be a relatively long time. Therefore, in order for the dictionary to continue to provide services during rehash. Redis provides a progressive rehash implementation, and the steps for rehash are as follows:

Allocate space for ht [1] so that the dictionary holds both ht [1] and ht [0]. Maintain a rehashidx in the dictionary, which is set to 0, indicating that the dictionary is rehash. During rehash, each operation on the dictionary, in addition to the specified operation, will rehash to ht [1] according to the corresponding key value pair of ht [0] on rehashidx. As the operation progresses, all the data from ht [0] will be rehash to ht [1]. Set the rehashidx of ht [0] to-1, and the progressive rehash ends.

This ensures that the data can be smoothly rehash. Prevent rehash from blocking threads for too long.

During the rehash process, if operations such as delete and update are performed, it will be done on two hash tables. If it is find, then priority in ht [0], if not found, then go to ht [1] to find. If it is insert, then only data will be inserted in ht [1]. This will ensure that the data of ht [1] only increases, while that of ht [0] only decreases.

Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.

*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.