Reduce Redis memory footprint

2025-07-19 Update. From: SLTechnology News&Howtos

Shulou(Shulou.com)06/01 Report--

1. Advantages of reducing Redis memory footprint

1. Reduces the time it takes to create and load snapshots.

2. Improves the efficiency of loading and rewriting AOF files.

3. Shortens the time a replica needs to synchronize with the master.

4. Lets Redis store more data without additional hardware.

2. Short structures

Redis provides a set of configuration options for lists, sets, hashes, and sorted sets that let it store short structures in a more economical way.

2.1. ziplist compressed list (list, hash, sorted set)

Storage method

When lists, hashes, and sorted sets are short or small, Redis stores them using a compact format called ziplist.

A ziplist is an unstructured representation of one of three types of objects: list, hash, or sorted set. It stores the data in serialized form, so the data must be decoded on every read and encoded on every write.

The difference between a doubly linked list and a compressed list:

To understand why compressed lists use less memory than other data structures, we take the list structure as an example and look at it in detail.

A typical doubly linked list

In a typical doubly linked list, each value is represented by a node. Each node holds pointers to the previous and next nodes of the list, as well as a pointer to the string value the node contains.

The string value of each node is stored in three parts: the length of the string, the number of free bytes remaining in the string buffer, and the string itself, terminated by a null character.

Example:

If a node stores the string 'abc', a conservative estimate is 21 bytes of additional overhead on a 32-bit platform (three pointers + two ints + one null character, that is, 3 × 4 + 2 × 4 + 1 = 21).

As the example shows, storing a 3-byte string requires at least 21 bytes of additional overhead.

Ziplist

A compressed list is a sequence of nodes, each consisting of two lengths and a string. The first length records the length of the previous node (used to traverse the compressed list from back to front); the second length records the length of the current node; the stored string follows.

Example:

To store the string 'abc', each of the two lengths fits in 1 byte, so the additional overhead is 2 bytes (two lengths, 1 + 1 = 2).

Conclusion:

Compressed lists reduce the additional overhead by avoiding the extra pointers and metadata.
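
To make the comparison concrete, the two per-node estimates above can be multiplied out for a whole list. This is only a back-of-the-envelope sketch in Python: the function names are hypothetical, the sizes are the article's 32-bit estimates rather than measurements, and the 1-byte length fields only hold for short strings.

```python
# Per-node bookkeeping overhead, using the article's estimates
# for a 32-bit platform. These are rough figures, not measurements.
POINTER_SIZE = 4  # bytes per pointer on a 32-bit platform
INT_SIZE = 4      # bytes per int


def linked_list_overhead(n_items: int) -> int:
    """Three pointers + two ints + one null terminator per node."""
    per_node = 3 * POINTER_SIZE + 2 * INT_SIZE + 1  # 21 bytes
    return n_items * per_node


def ziplist_overhead(n_items: int) -> int:
    """Two 1-byte length fields per node (short strings only)."""
    per_node = 1 + 1  # 2 bytes
    return n_items * per_node


if __name__ == "__main__":
    n = 512
    print(linked_list_overhead(n))  # 10752
    print(ziplist_overhead(n))      # 1024
```

For a 512-element list of short strings, the estimate is roughly 10.5 KB of overhead for the linked list against 1 KB for the compressed list.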

Configuration:

# list
# maximum number of elements allowed
list-max-ziplist-entries 512
# maximum size in bytes allowed for each compressed node
list-max-ziplist-value 64
# hash
hash-max-ziplist-entries 512
hash-max-ziplist-value 64
# zset
zset-max-ziplist-entries 128
zset-max-ziplist-value 64
# when either limit is exceeded, the structure is no longer stored in ziplist mode

Test list:

1. Create the test.php file

# test.php

At this point, test-list contains 512 items, which does not exceed the limit in the configuration file.

2. Push one more item into test-list

At this point, test-list contains 513 items, which exceeds the limit of 512 entries in the configuration file. Redis abandons the ziplist storage mode and falls back to the regular linkedlist storage mode.

Hashes and sorted sets behave the same way.
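
The rule behind this test can be written down as a small model. The sketch below is plain Python and does not talk to Redis; `expected_list_encoding` is a hypothetical helper, and the defaults mirror the `list-max-ziplist-entries` and `list-max-ziplist-value` settings shown above. On a live server the actual encoding can be inspected with the `OBJECT ENCODING` command, and newer Redis versions report different encoding names for lists.

```python
def expected_list_encoding(values, max_entries=512, max_value=64):
    """Model of the rule described above: the compact form is kept
    only while BOTH limits hold."""
    # Limit 1: number of elements in the list.
    if len(values) > max_entries:
        return "linkedlist"
    # Limit 2: size in bytes of every individual element.
    for value in values:
        if len(str(value).encode("utf-8")) > max_value:
            return "linkedlist"
    return "ziplist"


if __name__ == "__main__":
    print(expected_list_encoding(range(512)))  # ziplist
    print(expected_list_encoding(range(513)))  # linkedlist
    print(expected_list_encoding(["x" * 65]))  # linkedlist
```

Exceeding either limit is enough to lose the compact form, which is why a single oversized value converts an otherwise short list.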

2.2. intset integer set (set)

As a prerequisite, every member of the set must be parseable as a decimal integer.

Storing a set as a sorted array not only reduces memory consumption but also speeds up set operations.

Configuration:

# maximum number of members in the set; above this limit, intset storage is no longer used
set-max-intset-entries 512
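
As an illustration of why a sorted array works well for small integer sets, here is a minimal Python model. It is not Redis's implementation: the function names are hypothetical, the eligibility check is a simplified version of the conditions above (decimal integers only, assumed to fit in a signed 64-bit value), and lookups use binary search via the standard `bisect` module.

```python
import bisect


def can_use_intset(members, max_entries=512):
    """Simplified eligibility check: few enough members, and every
    member is a canonical decimal integer in signed 64-bit range."""
    if len(members) > max_entries:
        return False
    for member in members:
        text = str(member)
        try:
            number = int(text, 10)
        except ValueError:
            return False
        # Reject forms such as '007', '+5' or ' 5' that do not round-trip.
        if str(number) != text or not -2**63 <= number < 2**63:
            return False
    return True


def intset_add(sorted_ints, value):
    """Insert value, keeping the array sorted. Returns False if present."""
    i = bisect.bisect_left(sorted_ints, value)
    if i < len(sorted_ints) and sorted_ints[i] == value:
        return False
    sorted_ints.insert(i, value)
    return True


def intset_contains(sorted_ints, value):
    """Membership test by binary search: O(log n) comparisons."""
    i = bisect.bisect_left(sorted_ints, value)
    return i < len(sorted_ints) and sorted_ints[i] == value
```

Lookups stay cheap, but each insert may shift part of the array, which is one reason the compact form is limited to small sets.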

Test:

Create a test.php file

# test.php

2.3. Performance issues

For lists, hashes, sorted sets, and sets alike, exceeding the limit converts the structure to its more typical underlying representation. The reason is that as a compact structure grows, operations on it become slower and slower.

Test:

# the list type is used as the representative for testing

Test plan:

1. With the default configuration, push 50000 items to test-list and record the time required; then use rpoplpush to move all of the test-list data into a new list, list-new, and record the time required.

2. Change the configuration to list-max-ziplist-entries 100000, then repeat the same steps.

3. Compare the times and draw a conclusion.
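
The test plan above can be sketched as a small timing harness. This is an assumption-laden outline in Python, not the article's original PHP scripts: `time_push` and `time_move` are hypothetical helpers written against any client object that exposes `rpush` and `rpoplpush` (the redis-py `Redis` client has methods with these names), and `FakeClient` is an in-memory stand-in so the harness can be tried without a server. Timings from the stand-in say nothing about Redis itself.

```python
import time


def time_push(client, key, count):
    """Push `count` items onto `key`; return elapsed seconds."""
    start = time.perf_counter()
    for i in range(count):
        client.rpush(key, i)
    return time.perf_counter() - start


def time_move(client, src, dst):
    """Move every item from `src` to `dst` with rpoplpush.
    Returns (number of items moved, elapsed seconds)."""
    start = time.perf_counter()
    moved = 0
    while client.rpoplpush(src, dst) is not None:
        moved += 1
    return moved, time.perf_counter() - start


class FakeClient:
    """In-memory stand-in for a Redis client (illustration only)."""

    def __init__(self):
        self.lists = {}

    def rpush(self, key, *values):
        self.lists.setdefault(key, []).extend(values)
        return len(self.lists[key])

    def rpoplpush(self, src, dst):
        if not self.lists.get(src):
            return None  # source list is empty
        value = self.lists[src].pop()                    # pop from the tail
        self.lists.setdefault(dst, []).insert(0, value)  # push to the head
        return value


if __name__ == "__main__":
    client = FakeClient()
    print(time_push(client, "test-list", 1000))
    print(time_move(client, "test-list", "list-new"))
```

Against a real server, the same two functions would be run once per configuration and the elapsed times compared.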

Test under default configuration:

1. Insert the data and check the time

# test1.php

The insert took 4 seconds.

2. Run the corresponding command and check the time

# test2.php

Test under the modified configuration:

1. Modify the configuration file first

# raise this value to make the impact on performance more visible
list-max-ziplist-entries 100000

# this value can be left unchanged
list-max-ziplist-value 64

2. Insert the data

Run test1.php.

The insert took 12 seconds.

3. Run the corresponding command and check the time

Run test2.php.

The command was executed 50000 times and took 12 seconds.

Conclusion:

On this machine, the two configurations differ by 8 seconds for 50000 items of test data. Under high concurrency, long compressed lists and large integer sets provide no optimization benefit and instead degrade performance.

3. Sharded structures

The essence of sharding is to divide data into smaller parts based on simple rules, and then decide where to send each piece of data according to the part it belongs to. Many databases use this technique to expand storage space and increase the load they can handle.

Given what was said above, the value of sharded structures for Redis is clear: splitting one large structure into many small ones keeps each piece within the ziplist and intset limits. The ziplist and intset settings in the configuration file therefore need to be adjusted to suit.

3.1. Sharded hash

# ShardHash.class.php

The sharded hash computes a shard ID from the base key and the key (field) stored in the hash, then joins the shard ID to the base key to form the complete shard key. Before executing hset, hget, and most other hash commands, the key (field) must first be passed through the shardKey method.
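
A minimal sketch of such a shard-key calculation is shown below, in Python rather than the article's PHP. It follows the general approach of the reference book listed at the end (numeric fields are grouped by range, other fields are spread with CRC32), but the function name, parameters, and the `base:shard_id` key format are assumptions for illustration, not the contents of the original ShardHash class.

```python
import binascii


def shard_key(base, field, total_elements, shard_size):
    """Return the name of the shard that should hold `field`.

    total_elements: expected number of fields overall (an estimate).
    shard_size: target number of fields per shard; keep it at or
    below hash-max-ziplist-entries so each shard stays compact.
    """
    field = str(field)
    if field.isascii() and field.isdigit():
        # Numeric fields: consecutive IDs land in the same shard.
        shard_id = int(field, 10) // shard_size
    else:
        # Other fields: hash them across about twice as many shards
        # as strictly needed, so shards stay under the size limit.
        shards = max(1, 2 * total_elements // shard_size)
        shard_id = binascii.crc32(field.encode("utf-8")) % shards
    return f"{base}:{shard_id}"


if __name__ == "__main__":
    print(shard_key("test", "125", 1000, 100))  # test:1
    print(shard_key("test", "alice", 1000, 100))
```

A sharded `hset` would then call `shard_key` first and issue the ordinary command against the returned key, and `hget` would repeat the same calculation to find the field again. The values of `total_elements` and `shard_size` must stay fixed once data has been written, or existing fields can no longer be located.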

3.2. Sharded set

How can a sharded set be built so that it uses less memory and performs better? The main idea is to convert the data stored in the set into values that can be parsed as decimal integers, without changing what the set is used for. As mentioned earlier, intset storage is used when every member of the set can be parsed as a decimal integer, which not only saves memory but also improves response times.

Example:

A large website may need to record its unique visitors for each day. Each user's unique identifier can be converted into a decimal number and then stored in a sharded set.

# ShardSet.class.php

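The conversion step can be sketched as follows, assuming the unique identifier is a UUID string. This Python outline is hypothetical and is not the original ShardSet class: `uuid_to_int` keeps only the first 15 hexadecimal digits (60 bits) so the result fits in a signed 64-bit integer, and `set_shard_key` is a simplified CRC32-based shard picker written just for this example.

```python
import binascii


def uuid_to_int(uid):
    """Convert a UUID string to a decimal integer that fits in 60 bits.
    Truncating to 15 hex digits trades a very small collision risk
    for a value that an integer set can hold."""
    return int(uid.replace("-", "")[:15], 16)


def set_shard_key(base, member_id, total_elements, shard_size):
    """Pick the shard for an integer member by hashing it with CRC32.
    Hashing spreads members evenly even if the IDs are clustered."""
    shards = max(1, 2 * total_elements // shard_size)
    shard_id = binascii.crc32(str(member_id).encode("utf-8")) % shards
    return f"{base}:{shard_id}"


def sharded_sadd_target(base, uid, total_elements, shard_size):
    """Return (shard key, integer member) for a would-be SADD call."""
    member_id = uuid_to_int(uid)
    key = set_shard_key(base, member_id, total_elements, shard_size)
    return key, member_id


if __name__ == "__main__":
    uid = "ffffffff-ffff-ffff-ffff-ffffffffffff"
    print(uuid_to_int(uid))  # 1152921504606846975
    print(sharded_sadd_target("unique:2025-07-19", uid, 1000000, 512))
```

The returned pair would be passed to an ordinary SADD. Keeping `shard_size` at or below `set-max-intset-entries` is what lets each shard remain an intset.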

4. Packing information into stored bytes

Combined with the sharding technique described above, a sharded string structure can store information for a large number of users with consecutive IDs.

Fixed-length strings are used: each ID is allocated n bytes to store its information.

The following uses storing each user's country and province as an example:

If a user's record is China and Guangdong Province, stored as text in the utf8 character set, it consumes at least 5 × 3 = 15 bytes (five Chinese characters at 3 bytes each). If the site has a large number of users, this approach takes up a lot of resources. With the method described next, each user needs only two bytes to store the same information.
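
The byte counts can be checked directly in Python. The Chinese spellings below are an assumption about what the original example stored, and the two index values are placeholders rather than values from the article.

```python
# Assumed original strings: "China" and "Guangdong Province" in Chinese.
country = "中国"
province = "广东省"

# Stored as UTF-8 text, each of these characters takes 3 bytes.
raw = (country + province).encode("utf-8")
print(len(raw))     # 15

# Stored as two table indexes, one byte each (placeholder values).
packed = bytes([0, 0])
print(len(packed))  # 2
```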

Steps:

1. First, build an "information table" for the countries and for the provinces of each country.

2. Once the information tables exist, every country and every province has a corresponding index number.

3. These two indexes are what gets stored for each user, but they need some further processing first.

4. Treat each index as a character code and convert it to the corresponding single-byte character (code range 0 to 255).

5. Use the sharding technique described above with fixed-length sharded strings to work out where the user's data is stored (a string in Redis cannot exceed 512 MB).

6. Write and read the information (setrange, getrange).
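
The steps above can be sketched in Python as pure functions. Everything here is illustrative and not the original PackBytes class: the table contents are made-up sample data, and `USERS_PER_SHARD`, the `location` key prefix, and the function names are assumptions. The sketch only computes what would be written and where; the actual write and read would use the SETRANGE and GETRANGE commands.

```python
# Hypothetical "information tables": list position = index number.
COUNTRIES = ["China", "Japan"]
PROVINCES = {
    "China": ["Guangdong", "Beijing"],
    "Japan": ["Tokyo", "Osaka"],
}

BYTES_PER_USER = 2          # one byte for country, one for province
USERS_PER_SHARD = 2 ** 20   # assumed; 2 MB per string, far below 512 MB


def encode_location(country, province):
    """Steps 2-4: look up both indexes and pack them into two bytes.
    bytes() raises ValueError if an index is outside 0..255."""
    country_idx = COUNTRIES.index(country)
    province_idx = PROVINCES[country].index(province)
    return bytes([country_idx, province_idx])


def decode_location(code):
    """Reverse of encode_location for a 2-byte value."""
    country = COUNTRIES[code[0]]
    return country, PROVINCES[country][code[1]]


def locate(user_id, base="location"):
    """Step 5: find the shard key and byte offset for a user ID."""
    shard_id, position = divmod(user_id, USERS_PER_SHARD)
    return f"{base}:{shard_id}", position * BYTES_PER_USER


if __name__ == "__main__":
    key, offset = locate(13)
    code = encode_location("China", "Guangdong")
    print(key, offset, code)      # location:0 26 b'\x00\x00'
    # Step 6, against a real server, would be:
    #   SETRANGE key offset code
    #   GETRANGE key offset offset+1
    print(decode_location(code))  # ('China', 'Guangdong')
```

Because every user occupies the same number of bytes, the offset is a simple multiplication and no per-user key or field name needs to be stored.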

Implementation code:

# PackBytes.class.php

Test:

1. The data processed by dealData is the "information table".

2. SaveCode()

UserID  Country  Province
0       China    Guangdong
13      Japan    Guisunzi District
15      Japan    Wangba District

3. GetMessage()

Reference book:

"Redis in Action" by Josiah L. Carlson

Chinese edition translated by Huang Jianhong

(The above reflects my own understanding. If anything is missing or wrong, please point it out.)
