In addition to Weibo, there is also WeChat
Please pay attention
WeChat public account
Shulou
2025-04-12 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Database >
Share
Shulou(Shulou.com)05/31 Report--
What is the role of the internal data structure quicklist in Redis? I believe many inexperienced people don't know what to do about it. Therefore, this paper summarizes the causes and solutions of the problem. Through this article, I hope you can solve this problem.
Overview of quicklist
The upper list data type exposed by Redis, often used as a queue. For example, it supports the following operations:
Lpush: insert data on the left (that is, the list header).
Rpop: delete the data on the right (that is, at the end of the list).
Rpush: insert data on the right (that is, at the end of the list).
Lpop: delete the data on the left (that is, the list header).
These operations are all O (1) time complexity.
Of course, list also supports access operations in any intermediate location, such as lindex and linsert, but they all need to traverse the list, so the time complexity is high, which is O (N).
Generally speaking, list has some characteristics: it is a list that can maintain the order of data items (the order of each data item is determined by the insertion position), it is easy to append and delete data at both ends of the table, and the access to the middle position has O (N) time complexity. Isn't that the characteristic of a two-way linked list?
The internal implementation of list, quicklist, is just a two-way linked list. In the comments in the file header of quicklist.c, quicklist is described as follows:
A doubly linked list of ziplists
It is indeed a two-way linked list, and it is a two-way linked list of ziplist.
What does this mean?
We know that a two-way linked list is made up of multiple nodes (Node). This description means that each node of a quicklist is a ziplist. Ziplist, we already introduced it in the last article.
Ziplist itself is also a list that maintains the order of data items (by insertion position) and is a memory-tight list (data items are adjacent in memory). For example, a quicklist with 3 nodes, if the ziplist of each node contains 4 data items, then externally, the list contains a total of 12 data items.
Why is the structure of quicklist designed in this way? To sum up, it is probably another compromise between space and time:
A two-way linked list is convenient for push and pop operations on both ends of the table, but it costs a lot of memory. First of all, it not only needs to save data on each node, but also saves two additional pointers; secondly, each node of the two-way linked list is a separate block of memory with discontiguous addresses, and memory fragmentation is easy to occur if there are more nodes.
Ziplist is very efficient because it is a whole block of contiguous memory. However, it is not conducive to modification operations, each data change will cause a memory realloc. Especially when the length of ziplist is very long, a realloc may result in a large number of copies of data, which further degrades performance.
So, combining the advantages of two-way linked list and ziplist, quicklist came into being.
However, this also raises a new question: how much ziplist is appropriate for a quicklist node? For example, 12 data items are also stored, either a quicklist contains 3 nodes, and the ziplist of each node contains 4 data items, or a quicklist contains 6 nodes, and the ziplist of each node contains 2 data items.
This is another difficult problem that needs to be balanced. Let's just analyze it in terms of storage efficiency:
The shorter the ziplist on each quicklist node, the more fragmented the memory. If there are too many memory fragments, it is possible to produce a lot of small fragments that can not be used in memory, thus reducing storage efficiency. At the extreme of this situation, the ziplist on each quicklist node contains only one data item, which degenerates into a normal two-way linked list.
The longer the ziplist on each quicklist node, the more difficult it is to allocate large chunks of contiguous memory space to the ziplist. There may be situations where there are a lot of small chunks of free space in memory (they add up to a lot), but you can't find a large enough free space to allocate to ziplist. This also reduces storage efficiency. At the extreme of this situation, there is only one node for the entire quicklist, and all the data items are assigned to the ziplist of the only node. This has actually degenerated into a ziplist.
It can be seen that the ziplist on a quicklist node should maintain a reasonable length. How long is it reasonable? This may depend on the specific application scenario. In fact, Redis provides a configuration parameter, list-max-ziplist-size, so that users can adjust to their own situation.
List-max-ziplist-size-2
Let's explain the meaning of this parameter in detail. It can take a positive value or a negative value.
When a positive value is taken, the length of ziplist on each quicklist node is limited according to the number of data items. For example, when this parameter is configured to 5, it means that the ziplist of each quicklist node contains up to five data items.
When a negative value is taken, the length of the ziplist on each quicklist node is limited according to the number of bytes occupied. At this point, it can only take five values from-1 to-5, each with the following meaning:
-5: the ziplist size on each quicklist node cannot exceed 64 Kb. (note: 1kb = > 1024 bytes)
-4: the ziplist size on each quicklist node cannot exceed 32 Kb.
-3: the ziplist size on each quicklist node cannot exceed 16 Kb.
-2: the ziplist size on each quicklist node cannot exceed 8 Kb. (- 2 is the default value given by Redis)
-1: the ziplist size on each quicklist node cannot exceed 4 Kb.
In addition, list is designed to be able to store long lists of data. For example, the Redis website gives this tutorial: Writing a simple Twitter clone with PHP and Redis, which uses list to store Twitter-like timeline data.
When the list is long, the data that is most easily accessed is likely to be the data at both ends, and the data in the middle is accessed less frequently (and with low performance). If the application scenario fits this feature, list also provides an option to compress the intermediate data nodes, further saving memory space. The configuration parameter list-compress-depth of Redis is used to complete this setting.
List-compress-depth 0
This parameter represents the number of nodes that are not compressed at both ends of an quicklist. Note: the number of nodes here refers to the number of nodes in the quicklist two-way linked list, not the number of data items in the ziplist. In fact, the ziplist on a quicklist node, if compressed, is compressed as a whole.
The value of parameter list-compress-depth is as follows:
0: is a special value, which means that it is not compressed. This is the default value for Redis.
1: it means that one node at each end of the quicklist is not compressed, and the node in the middle is compressed.
2: indicates that there are 2 nodes at each end of the quicklist that are not compressed, and the nodes in the middle are compressed.
3: it means that there are 3 nodes at each end of the quicklist that are not compressed, and the nodes in the middle are compressed.
And so on...
Because 0 is a special value, it is easy to see that the head and tail nodes of the quicklist are always uncompressed for quick access at both ends of the table.
Redis adopts LZF--, a lossless compression algorithm, for the compression algorithm of quicklist internal nodes.
Data structure definition of quicklist
Quicklist-related data structure definitions can be found in quicklist.h:
Typedef struct quicklistNode {struct quicklistNode * prev; struct quicklistNode * next; unsigned char * zl; unsigned int sz; / * ziplist size in bytes * / unsigned int count: 16; / * count of items in ziplist * / unsigned int encoding: 2; / * RAW==1 or LZF==2 * / unsigned int container: 2; / * NONE==1 or ZIPLIST==2 * / unsigned int recompress: 1; / * was this node previous compressed? * / unsigned int attempted_compress: 1 / * node can't compress; too small * / unsigned int extra: 10; / * more bits to steal for future usage * /} quicklistNode;typedef struct quicklistLZF {unsigned int sz; / * LZF size in bytes*/ char compressed [];} quicklistLZF;typedef struct quicklist {quicklistNode * head; quicklistNode * tail; unsigned long count; / * total count of all entries in all ziplists * / unsigned int len; / * number of quicklistNodes * / int fill: 16 / * fill factor for individual nodes * / unsigned int compress: 16; / * depth of end nodes not to compress;0=off * /} quicklist
The quicklistNode structure represents a node of quicklist, and the meanings of each field are as follows:
Prev: a pointer to the previous node of the linked list.
Next: a pointer to the node behind the linked list.
Zl: data pointer. If the data of the current node is not compressed, it points to a ziplist structure; otherwise, it points to a quicklistLZF structure.
Sz: represents the total size of the ziplist that zl points to (including zlbytes, zltail, zllen, zlend, and individual data items). It is important to note that if the ziplist is compressed, the value of the sz is still the ziplist size before compression.
Count: indicates the number of data items contained in the ziplist. Only 16bit is in this field. We will calculate whether this 16bit is enough later.
Encoding: indicates whether ziplist is compressed (and which compression algorithm is used). Currently, there are only two values: 2 for compression (and using the LZF compression algorithm) and 1 for no compression.
Container: is a reserved field. It was originally designed to indicate whether a quicklist node stores data directly, or uses ziplist to store data, or uses other structures to store data (used as a data container, so it is called container). However, in the current implementation, this value is a fixed value of 2, which means that ziplist is used as the data container.
Recompress: when we use a command like lindex to view an originally compressed data, we need to decompress the data temporarily, then set recompress=1 to make a mark, and then recompress the data when we have a chance.
Attempted_compress: this value is only useful for Redis's automated testing programs. We don't have to worry about it.
Extra: other extension fields. Currently, it is not used in the implementation of Redis.
The quicklistLZF structure represents a compressed ziplist. Where:
Sz: represents the compressed ziplist size.
Compressed: a flexible array (flexible array member) that holds a compressed array of ziplist bytes.
The data structure that really represents quicklist is the struct of quicklist with the same name:
Head: a pointer to the header node (the first node on the left).
Tail: a pointer to the tail node (the first node on the right).
Count: the sum of all ziplist data items.
Len: the number of quicklist nodes.
Fill: 16bit _ Ziplist size setting, which stores the value of the list-max-ziplist-size parameter.
Compress: 16bit, the node compression depth setting, which stores the value of the list-compress-depth parameter.
The above figure is an example of a structure diagram of quicklist. The ziplist size configuration and node compression depth configuration corresponding to the example in the figure are as follows:
List-max-ziplist-size 3list-compress-depth 2
A few points we need to pay attention to in this example are:
There are two orange nodes at each end, which are not compressed. Their data pointer zl points to the real ziplist. The other nodes in the middle are compressed, and their data pointer zl points to the compressed ziplist structure, that is, a quicklistLZF structure.
There are two items of data in the ziplist on the left head node, one item in the ziplist on the right tail node, and three items in the ziplist on the other nodes in the middle (including the interior of the compressed node). This represents a state after multiple push and pop operations on both sides of the table.
Now let's roughly calculate whether the 16bit is sufficient for the count field in the quicklistNode structure.
We already know that the ziplist size is limited by the list-max-ziplist-size parameter. There are two cases according to positive and negative values:
When this parameter is positive, it happens to represent the maximum value of the data item contained in the ziplist pointed to by the zl in a quicklistNode structure. The list-max-ziplist-size parameter is stored by the fill field of the quicklist structure, while the fill field is 16bit, so the value it can express can be represented by 16bit.
When this parameter is negative, the maximum length of ziplist that can be represented is 64 Kb. Each data item in ziplist needs at least 2 bytes to represent: 1-byte prevrawlen,1-byte data (the len field and data are merged into one; see the previous article for details). Therefore, the number of data items in ziplist will not exceed 32K, which is enough to express in 16bit.
In fact, in the current implementation of quicklist, the size of the ziplist will be subject to additional restrictions and will not reach the maximum analyzed here at all.
Let's move on to the code analysis phase.
The creation of quicklist
When we use the lpush or rpush command to insert data into a non-existent list for the first time, Redis first calls the quicklistCreate interface to create an empty quicklist.
Quicklist * quicklistCreate (void) {struct quicklist * quicklist; quicklist = zmalloc (sizeof (* quicklist)); quicklist- > head = quicklist- > tail = NULL; quicklist- > len = 0; quicklist- > count = 0; quicklist- > compress = 0; quicklist- > fill =-2; return quicklist;}
In many books that introduce data structures, the implementation of two-way linked lists often add an extra free header node, mainly for the convenience of insert and delete operations. As you can see from the quicklistCreate code above, quicklist is a bi-directional linked list without empty header nodes (both head and tail are initialized to NULL).
Push operation of quicklist
The push operation of quicklist is implemented by calling quicklistPush.
Void quicklistPush (quicklist * quicklist, void * value, const size_t sz, int where) {if (where = = QUICKLIST_HEAD) {quicklistPushHead (quicklist, value, sz);} else if (where = = QUICKLIST_TAIL) {quicklistPushTail (quicklist, value, sz);}} / * Add new entry to head node of quicklist. * * Returns 0 if used existing head. * Returns 1 if new head created. * / int quicklistPushHead (quicklist * quicklist, void * value, size_t sz) {quicklistNode * orig_head = quicklist- > head; if (_ quicklistNodeAllowInsert (quicklist- > head, quicklist- > fill, sz)) {quicklist- > head- > zl = ziplistPush (quicklist- > head- > zl, value, sz, ZIPLIST_HEAD); quicklistNodeUpdateSz (quicklist- > head);} else {quicklistNode * node = quicklistCreateNode () Node- > zl = ziplistPush (ziplistNew (), value, sz, ZIPLIST_HEAD); quicklistNodeUpdateSz (node); _ quicklistInsertNodeBefore (quicklist, quicklist- > head, node);} quicklist- > count++; quicklist- > head- > count++; return (orig_head! = quicklist- > head);} / * Add new entry to tail node of quicklist. * * Returns 0 if used existing tail. * Returns 1 if new tail created. * / int quicklistPushTail (quicklist * quicklist, void * value, size_t sz) {quicklistNode * orig_tail = quicklist- > tail; if (_ quicklistNodeAllowInsert (quicklist- > tail, quicklist- > fill, sz)) {quicklist- > tail- > zl = ziplistPush (quicklist- > tail- > zl, value, sz, ZIPLIST_TAIL); quicklistNodeUpdateSz (quicklist- > tail);} else {quicklistNode * node = quicklistCreateNode () Node- > zl = ziplistPush (ziplistNew (), value, sz, ZIPLIST_TAIL); quicklistNodeUpdateSz (node); _ quicklistInsertNodeAfter (quicklist, quicklist- > tail, node);} quicklist- > count++; quicklist- > tail- > count++; return (orig_tail! = quicklist- > tail);}
Whether you insert data in the head or tail, there are two situations:
If the ziplist size on the header node (or tail node) does not exceed the limit (that is, _ quicklistNodeAllowInsert returns 1), the new data is inserted directly into the ziplist (calling ziplistPush).
If the ziplist on the header node (or tail node) is too large, create a new quicklistNode node (and, accordingly, a new ziplist), and then insert the newly created node into the quicklist two-way linked list (call _ quicklistInsertNodeAfter).
In the implementation of _ quicklistInsertNodeAfter, the nodes are also compressed according to the configuration of list-compress-depth. Its implementation is rather tedious, so we won't discuss it here.
Other operations of quicklist
There are many operations in quicklist, and the implementation details are complicated, so we will not analyze the source code one by one here. We will briefly introduce some of the more important operations.
The pop operation of quicklist is implemented by calling quicklistPopCustom. The implementation process of quicklistPopCustom is basically the opposite of quicklistPush. First, delete the corresponding data item from the ziplist of the header or tail node. If the ziplist is empty after deletion, then the corresponding header or tail node should also be deleted. After deletion, it may also involve the decompression of the inner nodes.
Quicklist not only inserts from the head or tail, but also from any specified location. QuicklistInsertAfter and quicklistInsertBefore insert data items after and before the specified location, respectively. This kind of operation of inserting data at any specified location is complicated and has many logical branches.
When the size of the ziplist where the insertion position is located does not exceed the limit, just insert it into the ziplist.
When the ziplist size of the insertion position exceeds the limit, but the insertion position is at both ends of the ziplist, and the ziplist size of the adjacent quicklist linked list node does not exceed the limit, then it is inserted into the ziplist of the adjacent quicklist linked list node instead.
When the ziplist size of the insertion position exceeds the limit, but the insertion position is located at both ends of the ziplist, and the ziplist size of the adjacent quicklist linked list node also exceeds the limit, you need to create a new quicklist linked list node to insert.
For other cases where the size of the ziplist at the insertion location exceeds the limit (mainly corresponding to the case of inserting data in the middle of the ziplist), you need to split the current ziplist into two nodes, and then insert data on one of the nodes.
QuicklistSetOptions is used to set the ziplist size configuration parameter (list-max-ziplist-size) and node compression depth configuration parameter (list-compress-depth). The code is simple by setting the corresponding values to the fill field and the compress field of the quicklist structure, respectively.
After reading the above, have you mastered the function of the internal data structure quicklist in Redis? If you want to learn more skills or want to know more about it, you are welcome to follow the industry information channel, thank you for reading!
Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.
Views: 0
*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.
Continue with the installation of the previous hadoop.First, install zookooper1. Decompress zookoope
"Every 5-10 years, there's a rare product, a really special, very unusual product that's the most un
© 2024 shulou.com SLNews company. All rights reserved.