Detailed explanation of compressed linked list ziplist of redis source code analysis tutorial 07/04 Update SLTechnology News&Howtos

Detailed explanation of compressed linked list ziplist of redis source code analysis tutorial

2025-07-04 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Database >

Shulou(Shulou.com)06/01 Report--

Preface

Compressed list (ziplist) is a list made up of a series of specially encoded memory blocks, which plays a very important role in data storage optimization of Redis. This article summarizes a very common data structure used in redis to compress the linked list ziplist. It is not excessive to say that this data structure is ubiquitous in redis. In addition to linked lists, many other data structures also use it for transition, such as SortedSet mentioned in the previous article. There is no more to say below, let's take a look at the detailed introduction.

1. Brief introduction of ziplist data structure of compressed linked list

First, look at the structure of ziplist as a whole, as shown in the following figure:

Compressed linked list ziplist structure diagram

You can see that there are many fields and different byte sizes, but this is the essence of the compressed linked list. Let's summarize it in turn.

Field meaning zlbytes this field is the first field of the compressed linked list, is an unsigned integer, and occupies 4 bytes. Used to represent the number of bytes occupied by the entire compressed linked list (including itself). Zltail is an unsigned integer that takes up 4 bytes. Used to store the offset from the header of the compressed linked list to the last entry (not the tail element zlend), used in scenarios that quickly jump to the tail of the linked list. Zllen is an unsigned integer that occupies 2 bytes. Used to store the total number of entry contained in the compressed linked list. Zlend's special entry, used to represent the end of a compressed linked list. Occupies one byte, and the value is always 255.

Summed up as the head and tail of ziplist, let's summarize the top priority entry field.

Generally speaking, an entry consists of three prevlen,encoding,entry-data fields, but when entry is a small integer, the entry-data field is omitted according to the code. The following is summarized in turn:

The first is the field prevlen, which represents the length of the previous entry, which can be encoded in two ways.

When the length is less than 255 bytes, use one byte to store. When the length is greater than or equal to 255, it is stored in five bytes, where the first byte is set to 255. the length of the previous entry is represented by the next four bytes.

Then there is the field encoding: it will be encoded differently depending on the content of the current element, as follows:

1. If the content of the element is a string, the values of encoding are:

The beginning of 00xx xxxx: 00 means that the length of the string is represented by six bit.

01xx xxxx | xxxx xxxx: 01 indicates that the length of the string is represented by 14bit, and the 14 bit are stored at large end.

1000 0000 | xxxx xxxx | xxxx xxxx | xxxx xxxx | xxxx xxxx: 10 indicates that the next four bytes are string length, and the 32 bit are stored at large end.

2. If the content of the element is numeric, the values of encoding are:

1100 0000: indicates that the number occupies the next 2 bytes.

1101 0000: indicates that the number occupies the next 4 bytes.

1110 0000: indicates that the number occupies the next 8 bytes.

1111 0000: indicates that the number occupies the next 3 bytes.

1111 1110: indicates that the number occupies the next 1 byte.

1111 1111: represents the last element in the compressed linked list (special coding).

1111 xxxx: indicates that only the last four digits are used to represent 0,00012 integers, because 0000Magne1110 and 1111 have been occupied, that is to say, the four xxxx digits here can only represent 0001101101, the decimal number is the number 1q13, but redis stipulates that it is used to represent zero twelve, so when we encounter this code, we need to take out the last four digits and subtract 1 to get the correct value.

Finally, there is the field entry-data: if the value of the element is a string, the value of the element itself is saved. If the value of the element is a very small number (according to the above coding rule, that is, 0: 12), there is no this field.

The coding of compressed linked lists is very complex, but this is the essence of this data structure. Let's take a look at an example:

Note: this example is mentioned in the redis source code

/ / compressed linked list composed of element 2pyr5 [0f 00 00 00] [0c 00 00 00] [02 00] [00 f3] [02 f6] [ff] | | zlbytes zltail entries "2" 5 "end// string" Hello World "encoded content [02] [0b] [48 65 6c 6c 6f 20 57 6f 72 6c 64]

The above is a compressed linked list that uses hexadecimal to represent two elements of 2 and 5.

The first four bytes represent the number of bytes consumed by the entire ziplist, and because redis uses small-end storage, 15 bytes are represented as 0f 00 00. The next four bytes represent the end element offset, which is the offset from the ziplist header (zlbytes) to the last element (note: not the tail node). It is also stored at a small end, so it is represented as 0c 00 00. Then there is a two-byte zllen field that represents the number of elements. In our example, there are two elements, plus small-end storage, so it is represented as 02 00. Then there is the element itself. First, the length of the previous element is represented by a variable length. 2 as the first element, the size of the previous element is 0, so it takes up a byte of 00. According to the coding rules we mentioned above, elements 2 and 5 belong to numbers between 0f12 and need to be encoded in 1111 xxxx format, 2 is converted to binary to 0010, plus 1 is 0011 (the reason for adding 1 has been explained above), and the combined element 2 is represented as 00 f3. The same element 5 is represented as 02 f6. Finally, there is the tail element, which uses the unoccupied code 1111 1111.

Next, we insert a string element Hello World at the end of the compressed linked list to see how to encode the string. According to the agreed coding rules, the length of the previous element is first represented in bytes. Here, the previous element is 5, which takes up a total of two bytes, so first use one byte to represent the length 02 of the previous element, followed by the encoding of the string. The length of the string we want to add is 11 (including spaces), and the conversion to binary is 1011. According to the coding rules of the string, using 0000 1011, converting to hexadecimal is 0b, and finally adding the ASCII code of our string itself, we can get the encoding of the string: [02] [0b] [48 65 6c 6c 6f 20 57 6f 72 6c 64].

The entire compressed linked list becomes:

[0f 00 00 00] [0c 00 00 00] [02 00] [00 f3] [02 f6] [02 0b 48 65 6c 6c 6c 6f 20 57 6f 72 6c 64] [ff] | | zlbytes zltail entries "2"5"Hello World" end

2. Source code analysis of ziplist command for compressed linked list.

After understanding the above coding rules, let's take a look at some of the operation source code of compressed linked list ziplist. This article will create a compressed linked list, insert elements, delete elements, and find elements to summarize the basic principles of compressed linked lists.

The first is to create:

/ / define the header size of a compressed linked list consisting of zlbytes,zltail and zllen # define ZIPLIST_HEADER_SIZE (sizeof (uint32_t) * 2+sizeof (uint16_t)) / / create a compressed linked list and return a pointer to the linked list unsigned char * ziplistNew (void) {/ / the reason for + 1 here is that the trailing element occupies one byte, which is also the minimum size of a compressed linked list unsigned int bytes = ZIPLIST_HEADER_SIZE+1 / / allocate memory unsigned char * zl = zmalloc (bytes); / / set the linked list size ZIPLIST_BYTES (zl) = intrev32ifbe (bytes); / / set the offset of the last element ZIPLIST_TAIL_OFFSET (zl) = intrev32ifbe (ZIPLIST_HEADER_SIZE); / / set the number of elements ZIPLIST_LENGTH (zl) = 0; / / set the trailing element (above is only the application space) zl [bytes-1] = ZIP_END; return zl;}

The logic of creating a compressed linked list is simple: apply for a fixed space containing header and tail nodes, and then initialize the linked list context.

Compared with creating, the source code for adding elements is very lengthy. In order to make it easier to understand, let's comb through the logic of adding elements before looking at the source code.

First we need to find the size of the previous element that specified the insertion location, because this attribute is part of the new element. Then we encode the current element to get the corresponding encoding field and the field of the actual element value. The prevlen field of the successor element of the newly inserted element needs to be updated because the element before it has changed. Cascading updates can occur here (there is also a problem with deleting elements) because the prevlen field size is variable.

The above three steps are the core steps, and the rest are operations such as updating the tail node offset and modifying the number of linked list elements. Of course, since compressed linked lists are based on arrays, memory copies are also essential when inserting or deleting elements.

After summing up the above steps, we begin to analyze the source code step by step, which is relatively long. Take your time:

/ / the four parameters are: compressed linked list, insertion position (after the new element is inserted after the p element), element value, element length unsigned char * _ ziplistInsert (unsigned char * zl, unsigned char * p, unsigned char * s, unsigned int slen) {/ / here is the length of the current linked list size_t curlen = intrev32ifbe (ZIPLIST_BYTES (zl)), reqlen; unsigned int prevlensize, prevlen = 0; size_t offset; int nextdiff = 0; unsigned char encoding = 0; long long value = 123456789 Zlentry tail; / / 1. The logical purpose of this section is to obtain the length of the preceding element if (p [0]! = ZIP_END) {/ / if the element in the insertion position is not a tail element, then obtain the length of the element / / here the length of the encoding field is split for convenience, prevlensize saves the length of the encoding field, and prevlen saves the length of the element itself (p, prevlensize, prevlen). } else {/ / if the element in the insertion position is a tail element, then the new element needs to be inserted at the end of the list / / to get the last element of the list (Note: the last element is not equal to the tail element) unsigned char * ptail = ZIPLIST_ENTRY_TAIL (zl) If (ptail [0]! = ZIP_END) {/ / if the last element is not a tail element, then the element is the leading element of the new element. Get the element length prevlen = zipRawEntryLength (ptail); otherwise, there are no elements in the linked list, that is, the leading element of the new element is 0} / / 2. Encode the new element to get the total size of the new element if (zipTryEncoding) {/ / if it is a number, encode it by number reqlen = zipIntSize (encoding);} else {/ / the element length is the string length reqlen = slen;} / / the total length of the new element plus the length of the leading element and the encoding element reqlen + = zipStorePrevEntryLength (NULL,prevlen); reqlen + = zipStoreEntryEncoding (NULL,encoding,slen) / / if the insertion position is not the end of the chain list, the prevlen field of the subsequent elements of the new element should be judged / / according to the above coding rules, this field may need to be expanded to int forcelarge = 0; nextdiff = (p [0]! = ZIP_END)? ZipPrevLenByteDiff (pforce reqlen): 0; if (nextdiff = =-4 & & reqlen

< 4) { nextdiff = 0; forcelarge = 1; } //按照新计算出的数组大小进行扩容，由于新数组的地址可能会改变，因此这里记录旧的偏移量 offset = p-zl; zl = ziplistResize(zl,curlen+reqlen+nextdiff); //在新数组上计算插入位置 p = zl+offset; //如果新插入的元素不是在链表末尾 if (p[0] != ZIP_END) { //把新元素后继元素复制到新的数组中，-1为尾元素 memmove(p+reqlen,p-nextdiff,curlen-offset-1+nextdiff); //新元素的后继元素的prevlen字段 if (forcelarge) zipStorePrevEntryLengthLarge(p+reqlen,reqlen); else zipStorePrevEntryLength(p+reqlen,reqlen); //更新最后一个元素的偏移量 ZIPLIST_TAIL_OFFSET(zl) = intrev32ifbe(intrev32ifbe(ZIPLIST_TAIL_OFFSET(zl))+reqlen); //当新元素的后继元素不止有一个时，新的尾元素偏移量需要加上nextdiff zipEntry(p+reqlen, &tail); if (p[reqlen+tail.headersize+tail.len] != ZIP_END) { ZIPLIST_TAIL_OFFSET(zl) = intrev32ifbe(intrev32ifbe(ZIPLIST_TAIL_OFFSET(zl))+nextdiff); } } else { //新元素插入到链表尾端，更新尾端偏移量 ZIPLIST_TAIL_OFFSET(zl) = intrev32ifbe(p-zl); } //nextdiff !=0表示后继元素的长度发生变化，因此我们需要级联更新后继元素的后继元素 if (nextdiff != 0) { offset = p-zl; zl = __ziplistCascadeUpdate(zl,p+reqlen); p = zl+offset; } //把新元素写入链表 p += zipStorePrevEntryLength(p,prevlen); p += zipStoreEntryEncoding(p,encoding,slen); if (ZIP_IS_STR(encoding)) { memcpy(p,s,slen); } else { zipSaveInteger(p,value,encoding); } //压缩链表存储元素数量+1 ZIPLIST_INCR_LENGTH(zl,1); return zl;} 分析完插入元素的逻辑，长舒一口气，真的很长，细节也很多。接下来在再看下删除元素的过程，与添加相比，删除相对要简单一些，清空当前元素以后，需要把后继元素一个一个拷贝上来（这也是数组跟链表两个数据结构的差别），然后注意是否需要级联更新，上代码： //参数依次为：压缩链表，删除元素的其实位置，删除元素的个数unsigned char *__ziplistDelete(unsigned char *zl, unsigned char *p, unsigned int num) { unsigned int i, totlen, deleted = 0; size_t offset; int nextdiff = 0; zlentry first, tail; //读取p指向的元素保存在first中 zipEntry(p, &first); for (i = 0; p[0] != ZIP_END && i < num; i++) { //统计总共删除的长度 p += zipRawEntryLength(p); //统计实际删除元素的个数 deleted++; } //需要删除元素的字节数 totlen = p-first.p; if (totlen >

0) {if (p [0]! = ZIP_END) {/ / determine whether the size of the element has changed nextdiff = zipPrevLenByteDiff (pjinfirst.decirawlen); / / modify the prevlen field p-= nextdiff; zipStorePrevEntryLength of the element after deletion; / / update the offset of the last element ZIPLIST_TAIL_OFFSET (zl) = intrev32ifbe (intrev32ifbe (ZIPLIST_TAIL_OFFSET (zl))-totlen) / / when there is more than one subsequent element of the deleted element, the new end element offset needs to be added with nextdiff zipEntry (p, & tail); if (p [tail.headersize+tail.len]! = ZIP_END) {ZIPLIST_TAIL_OFFSET (zl) = intrev32ifbe (intrev32ifbe (ZIPLIST_TAIL_OFFSET (zl)) + nextdiff) } / / move the remaining elements to the front memmove (first.p,p, intrev32ifbe (ZIPLIST_BYTES (zl))-(p-zl)-1);} else {/ / delete directly to the end of the linked list, so no memory copy is required, just modify the offset of the last element ZIPLIST_TAIL_OFFSET (zl) = intrev32ifbe ((first.p-zl)-first.prevrawlen) } / / resize array size offset = first.p-zl; zl = ziplistResize (zl, intrev32ifbe (ZIPLIST_BYTES (zl))-totlen+nextdiff); / / modify the number of linked list elements ZIPLIST_INCR_LENGTH (zl,-deleted); p = zl+offset; / / nextdiff! = 0 indicates that the element size has changed and needs to be cascaded to update if (nextdiff! = 0) zl = _ ziplistCascadeUpdate (zl,p);} return zl;}

Finally, let's take a look at the find operation of the element:

/ / parameters are as follows: compressed linked list, to find the value of the element, to find the length of the element, the number of elements skipped between each comparison unsigned char * ziplistFind (unsigned char * p, unsigned char * vstr, unsigned int vlen, unsigned int skip) {int skipcnt = 0; unsigned char vencoding = 0; long long vll = 0; / / cycle while (p [0]! = ZIP_END) {unsigned int prevlensize, encoding, lensize, len as long as the element is not yet at the end Unsigned char * Q; / query the prevlen field ZIP_DECODE_PREVLENSIZE (p, prevlensize) of the current element of the linked list; / / query the encoding field ZIP_DECODE_LENGTH (p + prevlensize, encoding, lensize, len) of the current element of the linked list; Q = p + prevlensize + lensize / / reached the element position to be compared if (skipcnt = = 0) {/ / if the string if (ZIP_IS_STR (encoding)) {/ / compares the current element in the linked list with the string to be found if (len = = vlen & & memcmp (Q, vstr, vlen) = = 0) {/ / matches successfully, then look for the pointer to the element return p } else {/ / if the current element is numeric and vencoding is 0 if (vencoding = = 0) {/ / attempt to digitally encode the value you are looking for if (! zipTryEncoding (vstr, vlen, & vll, & vencoding)) {/ / if the encoding fails, the element you are looking for is not a number at all / / and then set vencoding to the maximum value Act as a marker / / that is, you don't have to try to encode the value you are looking for into a number later. Vencoding = UCHAR_MAX } assert (vencoding);} / if vencoding! = UCHAR_MAX, the element you are looking for is successfully encoded as a number if (vencoding! = UCHAR_MAX) {/ / fetch the element long long ll = zipLoadInteger (Q, encoding) in the current linked list by number; if (ll = = vll) {/ / if the two numbers are equal, return the element pointer return p } / / reset the number of elements to be skipped skipcnt = skip;} else {/ / skip element skipcnt--;} / / traverse the next element p = Q + len;} / / traverse the entire linked list and no element return NULL;} is found

At this point, we have summarized the four basic operating principles of creating, adding, deleting and finding compressed linked lists.

III. Summary of ziplist data structure of compressed linked list

Compressed linked list ziplist is widely used in redis, and it is the most characteristic data structure in redis. In fact, the essence of the data structure lies in the coding rules summarized in the first part of the article, first clarify this part of the content, the following source code can be simply read to deepen understanding.

I have to say that the part of the source code is really a bit lengthy, it does take patience, and I have a big head in the process of reading it myself. If you are interested in the source code, it is recommended that you sort out what needs to be done for an operation (such as the insertion element mentioned above) first, and then look at the code for better understanding.

Last but not least, since ziplist in redis has such a high memory utilization, why provide a normal linked list structure for users to use?

There is no standard answer to this question. The benevolent see the wise see the wise.

Summary

The above is the whole content of this article, I hope that the content of this article has a certain reference and learning value for your study or work, if you have any questions, you can leave a message and exchange, thank you for your support.

Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.

*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.