What are the methods of deduplication in Redis

2025-07-19 Update. From: SLTechnology News&Howtos, Shulou (Shulou.com), 05/31 report.

This article explains the methods Redis offers for deduplication (unique counting). The methods introduced here are simple, fast and practical.

Unique counting is a very common feature in website systems; for example, a website needs to count its unique visitors (UV) every day. Counting problems are common, but they can be hard to solve for two reasons. First, the volume can be large: a big site has millions of visitors each day, so the amount of data is considerable. Second, you usually want to count along more than one dimension: besides daily UV you also want weekly or monthly UV, which makes the calculation more complex.

In a relational database, the usual way to get a unique count is a select count(distinct ...) query. It is very simple, but the statement becomes very slow when the data volume is large. Another problem with relational databases is that insert performance is comparatively low.
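As a point of comparison, the relational approach can be sketched with Python's built-in sqlite3 module. The table name visits, its columns and the sample rows are assumptions made for illustration; none of them come from the article.

```python
import sqlite3

# Hypothetical schema: one row per page view (not from the article).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE visits (day TEXT, user_id INTEGER)")
conn.executemany(
    "INSERT INTO visits (day, user_id) VALUES (?, ?)",
    [("2024-01-01", 1), ("2024-01-01", 2), ("2024-01-01", 1), ("2024-01-02", 2)],
)


def daily_uv(day: str) -> int:
    """Unique visitors for one day, counted with COUNT(DISTINCT ...)."""
    row = conn.execute(
        "SELECT COUNT(DISTINCT user_id) FROM visits WHERE day = ?", (day,)
    ).fetchone()
    return row[0]


def range_uv(first_day: str, last_day: str) -> int:
    """Unique visitors over a date range; the whole range is rescanned."""
    row = conn.execute(
        "SELECT COUNT(DISTINCT user_id) FROM visits WHERE day BETWEEN ? AND ?",
        (first_day, last_day),
    ).fetchone()
    return row[0]


print(daily_uv("2024-01-01"), range_uv("2024-01-01", "2024-01-02"))
```

Every call scans and deduplicates the matching rows again, which is the cost the article refers to when the table grows large.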

Redis handles this kind of counting problem with ease: it is faster and uses fewer resources than a relational database, and it even provides three different methods.

1. Based on set

A Redis set stores a collection of unique elements. With it you can quickly determine whether an element exists in the set, quickly get the number of elements in the set, and merge several sets into a new set. The commands involved are as follows:

SISMEMBER key member    # Determine whether member exists in the set
SADD key member         # Add member to the set
SCARD key               # Get the number of elements in the set

The set-based method is simple and effective, counts exactly, is widely applicable, and is easy to understand. Its disadvantage is that it consumes a lot of resources (though far fewer than a relational database). If the number of elements is large (for example, counts in the hundreds of millions), memory consumption becomes prohibitive.
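A minimal sketch of set-based UV counting might look like the following. The helper functions only call methods named sadd, scard and sunionstore, which the redis-py client also exposes, so a real client could be passed in instead. FakeSetClient is a hypothetical in-memory stand-in so that the example runs without a server, and the key pattern uv:<day> is likewise an assumption.

```python
from collections import defaultdict


class FakeSetClient:
    """Hypothetical in-memory stand-in for the few set commands used below."""

    def __init__(self):
        self._sets = defaultdict(set)

    def sadd(self, name, *values):
        before = len(self._sets[name])
        self._sets[name].update(values)
        return len(self._sets[name]) - before  # number of newly added members

    def sismember(self, name, value):
        return value in self._sets[name]

    def scard(self, name):
        return len(self._sets[name])

    def sunionstore(self, dest, keys):
        merged = set().union(*(self._sets[k] for k in keys))
        self._sets[dest] = merged
        return len(merged)


def record_visit(client, day, user_id):
    """Add the visitor to that day's set; True means first visit of the day."""
    return client.sadd(f"uv:{day}", user_id) == 1


def daily_uv(client, day):
    """Exact number of unique visitors recorded for the day."""
    return client.scard(f"uv:{day}")


def merged_uv(client, dest_key, days):
    """Merge several daily sets into one set and return its size."""
    return client.sunionstore(dest_key, [f"uv:{d}" for d in days])


client = FakeSetClient()
first = record_visit(client, "2024-01-01", "alice")
repeat = record_visit(client, "2024-01-01", "alice")  # duplicate is ignored
record_visit(client, "2024-01-01", "bob")
record_visit(client, "2024-01-02", "bob")
record_visit(client, "2024-01-02", "carol")
week_total = merged_uv(client, "uv:week", ["2024-01-01", "2024-01-02"])
print(daily_uv(client, "2024-01-01"), week_total)
```

Because every member is stored in full, memory grows with the number of distinct visitors, which is the drawback described above.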

2. Based on bit

Redis bitmaps can count with far less memory than a set: whether an element is present is stored as a single bit, 1 or 0. For example, for the unique visitor count of a website, user_id can be used as the bit offset, and the bit is set to 1 to indicate a visit. With 1 MB of space, the daily visit status of more than 8 million users can be stored. The commands involved are as follows:

SETBIT key offset value                  # Set the bit at offset
GETBIT key offset                        # Get the bit at offset
BITCOUNT key [start end]                 # Count the bits set to 1
BITOP operation destkey key [key ...]    # Merge bitmaps

The bit-based method consumes much less space than the set method, but it requires that elements can be mapped simply to bit offsets, so its range of application is much narrower. In addition, the space consumed depends on the maximum offset, not on the count: if the maximum offset is large, memory consumption is still considerable.
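The behaviour described above can be modelled in pure Python. TinyBitmap is a hypothetical class written for this illustration, not Redis code; it mimics SETBIT, GETBIT, BITCOUNT and BITOP OR on a byte array, and the user IDs are made up. It is meant to show two points: duplicates never increase the count, and the space used follows the largest offset rather than the number of users.

```python
class TinyBitmap:
    """Pure-Python model of a Redis string used as a bitmap (illustrative)."""

    def __init__(self):
        self.data = bytearray()

    def setbit(self, offset: int, value: int) -> int:
        byte_index, bit = divmod(offset, 8)
        if byte_index >= len(self.data):
            # Grow with zero bytes up to the byte that holds the offset.
            self.data.extend(b"\x00" * (byte_index + 1 - len(self.data)))
        mask = 0x80 >> bit  # offset 0 is the most significant bit of byte 0
        old = 1 if self.data[byte_index] & mask else 0
        if value:
            self.data[byte_index] |= mask
        else:
            self.data[byte_index] &= ~mask & 0xFF
        return old  # previous bit value

    def getbit(self, offset: int) -> int:
        byte_index, bit = divmod(offset, 8)
        if byte_index >= len(self.data):
            return 0
        return 1 if self.data[byte_index] & (0x80 >> bit) else 0

    def bitcount(self) -> int:
        return bin(int.from_bytes(self.data, "big")).count("1")


def bitop_or(a: TinyBitmap, b: TinyBitmap) -> TinyBitmap:
    """Model of BITOP OR: a bit is set if it is set in either bitmap."""
    longer, shorter = (a, b) if len(a.data) >= len(b.data) else (b, a)
    out = TinyBitmap()
    out.data = bytearray(longer.data)
    for i, byte in enumerate(shorter.data):
        out.data[i] |= byte
    return out


day1, day2 = TinyBitmap(), TinyBitmap()
for user_id in (3, 10, 3):          # user 3 visits twice
    day1.setbit(user_id, 1)
for user_id in (10, 8_000_000):     # one user with a very large ID
    day2.setbit(user_id, 1)

both_days = bitop_or(day1, day2)
users_per_mb = 1024 * 1024 * 8      # bits available in 1 MB
print(day1.bitcount(), len(day1.data), day2.bitcount(), len(day2.data))
```

Here day2 holds only two visitors but occupies about 1 MB, because one of the offsets is 8,000,000.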

3. Based on HyperLogLog

Exact unique counting over large data volumes is difficult, but if an approximate count is enough, computer science offers many efficient algorithms. HyperLogLog is a very well-known one: it can count hundreds of millions of unique elements with only about 12 KB of memory, with the error kept at about 1%. The commands involved are as follows:

PFADD key element [element ...]    # Add elements
PFCOUNT key [key ...]              # Get the approximate count

This counting method is really magical. It draws on statistical ideas such as uniform distribution, random probability and the Bernoulli distribution, which I have not completely understood. If you are interested, you can study the relevant articles in depth.
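To make the idea a little less magical, here is a simplified HyperLogLog sketch in pure Python. It is not Redis's implementation: Redis packs 16384 six-bit registers into 12288 bytes (the roughly 12 KB mentioned above) and applies its own corrections, while this sketch spends one byte per register, assumes SHA-1 as the hash function, and uses only the basic estimator with a small-range correction. The class name and the user-N identifiers are hypothetical.

```python
import hashlib
import math


class TinyHyperLogLog:
    """Simplified HyperLogLog (illustrative, not Redis's implementation)."""

    def __init__(self, p: int = 14):
        self.p = p
        self.m = 1 << p                    # 16384 registers when p = 14
        self.registers = bytearray(self.m)

    def add(self, item: str) -> None:
        digest = hashlib.sha1(item.encode()).digest()
        h = int.from_bytes(digest[:8], "big")        # 64-bit hash value
        width = 64 - self.p
        index = h >> width                           # first p bits pick a register
        rest = h & ((1 << width) - 1)                # remaining bits
        rank = width - rest.bit_length() + 1         # position of the leftmost 1
        if rank > self.registers[index]:
            self.registers[index] = rank             # keep the maximum rank seen

    def count(self) -> int:
        m = self.m
        alpha = 0.7213 / (1 + 1.079 / m)
        raw = alpha * m * m / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * m and zeros:
            return round(m * math.log(m / zeros))    # small-range correction
        return round(raw)


hll = TinyHyperLogLog()
for i in range(100_000):
    hll.add(f"user-{i}")
estimate = hll.count()

for i in range(1_000):            # re-adding known users changes nothing
    hll.add(f"user-{i}")
estimate_after_duplicates = hll.count()

expected_error = 1.04 / math.sqrt(hll.m)   # theoretical standard error
print(estimate, f"{expected_error:.4f}")
```

Each register remembers only the longest run of leading zero bits seen among the hashes routed to it, so memory stays fixed no matter how many elements are added.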

The three unique counting methods provided by Redis each have their own advantages and disadvantages, and together they can meet counting requirements under different circumstances.

4. Based on Bloom filter

A Bloom filter stores data in a bitmap (bit array), which represents a set very compactly and lets you quickly determine whether an element may already exist in the set. A Bloom filter is not 100% accurate, but the error rate can be reduced by adjusting its parameters: the number of hash functions used and the size of the bit array. With suitable parameters the error rate can be brought close to zero, which satisfies most scenarios.
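How the parameters affect the error rate can be estimated with the standard approximation for a Bloom filter with m bits, n inserted elements and k hash functions: the false-positive probability is about (1 - e^(-kn/m))^k, and it is smallest near k = (m/n) ln 2. The function names and the sample sizes below are assumptions chosen for illustration.

```python
import math


def false_positive_rate(m: int, n: int, k: int) -> float:
    """Approximate false-positive probability: (1 - e^(-k*n/m))^k."""
    return (1.0 - math.exp(-k * n / m)) ** k


def optimal_k(m: int, n: int) -> int:
    """Number of hash functions near the optimum (m/n) * ln 2, at least 1."""
    return max(1, round(m / n * math.log(2)))


n = 1_000_000                       # assumed number of stored elements
rates = {}
for bits_per_element in (8, 16, 32):
    m = bits_per_element * n
    k = optimal_k(m, n)
    rates[bits_per_element] = false_positive_rate(m, n, k)
    print(f"{bits_per_element} bits/element, k={k}: {rates[bits_per_element]:.2e}")
```

Under this approximation, each extra bit per element lowers the false-positive rate roughly geometrically, which is why the rate can be pushed close to zero but never exactly to zero.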

Given a set S = {x1, x2, … xn}, a Bloom filter uses k independent hash functions, each of which maps every element of the set into the range {1,…,m}. For any element, each number it is mapped to serves as an index into the bit array, and the bit at that index is set to 1. For example, if element x1 is mapped to the number 8 by a hash function, bit 8 of the bit array is set to 1. In the example from the original figure, the set S has only two elements x and y, each mapped by three hash functions. The mapped positions are (0, 3, 6) and (4, 7, 10) respectively, and the corresponding bits are set to 1.

To determine whether another element is in this set, map it with the same three hash functions and check whether any of the corresponding positions holds a 0. If one does, the element is definitely not in the set; otherwise it may be in the set.
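The add and lookup steps can be sketched in pure Python as follows. TinyBloomFilter is a hypothetical class for illustration and is unrelated to any Redis plugin. Instead of k truly independent hash functions, it derives the k positions from two halves of one SHA-256 digest (double hashing), which is a common substitute and an assumption of this sketch; the sizes and user-N identifiers are also made up.

```python
import hashlib


class TinyBloomFilter:
    """Minimal Bloom filter over a byte array (illustrative sketch)."""

    def __init__(self, m: int, k: int):
        self.m = m                          # number of bits
        self.k = k                          # number of positions per element
        self.bits = bytearray((m + 7) // 8)

    def _positions(self, item: str):
        digest = hashlib.sha256(item.encode()).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1   # odd step size
        # Double hashing: position i is (h1 + i * h2) mod m.
        return [(h1 + i * h2) % self.m for i in range(self.k)]

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)      # set the bit to 1

    def might_contain(self, item: str) -> bool:
        # Any 0 bit proves absence; all 1 bits only mean "possibly present".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


bf = TinyBloomFilter(m=16_000, k=3)
for i in range(1_000):
    bf.add(f"user-{i}")

all_members_found = all(bf.might_contain(f"user-{i}") for i in range(1_000))
false_positives = sum(
    bf.might_contain(f"stranger-{i}") for i in range(10_000)
)
print(all_members_found, false_positives)
```

Elements that were added are always reported as possibly present, while a small fraction of elements that were never added are reported that way too, which matches the "must not exist" versus "may exist" distinction above.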

Redis requires a plugin to use Bloom filters: https://blog.csdn.net/u013030276/article/details/88350641.
