Network Security Internet Technology Development Database Servers Mobile Phone Android Software Apple Software Computer Software News IT Information

In addition to Weibo, there is also WeChat

Please pay attention

WeChat public account

Shulou

Example Analysis of Redis Database Distribution

2025-02-24 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Development >

Share

Shulou(Shulou.com)06/01 Report--

This article shares with you the content of a sample analysis of Redis database distribution. The editor thinks it is very practical, so share it with you as a reference and follow the editor to have a look.

Question: 100-200 million data needs to be cached, how to design it?

1 hash remainder partition

200 million records are 200 million k _ quotient v. Suppose three machines form a cluster, and every user reads and writes the hash value based on the number of public: hash (key)% N machines, and uses it to determine which node the data is mapped to. When you take the data, you only need one according to the formula on the corresponding machine, and you can get the value with key.

Advantages: simple and crude, direct and effective, only need to estimate the data and plan the nodes, such as 3, 8, 10, to ensure data support for a period of time. Use the Hash algorithm to make a fixed part of the requests fall on the same server, so that each server regularly processes part of the requests (and maintains the information of these requests), which plays the role of load balancing + divide and conquer.

Disadvantages: it is troublesome to expand or reduce the capacity of the originally planned nodes. Regardless of expansion or contraction, each data change leads to a change in the node, and the mapping relationship needs to be recalculated. There is no problem when the number of servers is fixed. If elastic expansion or failure shutdown is required, the original modeling formula will change: Hash (key) / 3 will become Hash (key) /?. At this point, the result of the remainder operation of the address will change greatly, and the server obtained according to the formula will become uncontrollable. When a certain redis machine goes down, due to the change in the number of machines, it will cause hash to reshuffle all the remaining data.

2 consistent hash algorithm partition

A consistent Hash solution is proposed to minimize the impact on client-to-server mapping when the number of servers changes.

2.1 consistent hash ring

The consistent hash algorithm must have a hash function and generate hash values according to the algorithm. All possible hash values of the algorithm will form a full set, which can be a hash space [0,2 ^ 32-1], which is a linear space, but in the algorithm, we connect it end to end (0 = 2 ^ 32) through appropriate logic control, so that it logically forms a ring space.

It is also in accordance with the method of using modeling, and the node modeling method described in the previous notes is to model the number of nodes (servers). The consistent Hash algorithm takes the module for 2 ^ 32. To put it simply, the consistent Hash algorithm organizes the entire hash value space into a virtual ring, such as assuming that the value space of a hash function H is 0-2 ^ 32-1 (that is, the hash value is a 32-bit unsigned shaping). The whole hash ring is shown in the following figure: the whole space is organized clockwise, and the point directly above the circle represents the first point to the right of point 0. And so on, 2, 3, 4,. Until 2 ^ 32-1, that is to say, the first point on the left of 0 represents 2 ^ 32-1, 0 and 2 ^ 32-1 coincide in the direction of zero. We call this circle of 2 ^ 32 points Hash ring.

2.2 Node mapping

Map each IP node in the cluster to a location on the ring.

Hash each server using Hash, and you can select the server's IP or hostname as the keyword for hashing, so that each machine can determine its location on the hash ring. If the four nodes NodeA, B, C, D are calculated by the hash function of the IP address (hash (ip)), the location in the ring space after using the IP address hash is as follows:

2.3 drop key rules

When we need to store a kv key-value pair, we first calculate the hash value of key, hash (key), and use the same function Hash to calculate the hash value of the key and determine the location of this data on the ring. From this position, we "walk" clockwise along the ring. The first server encountered is the server to which it should be located, and the key-value pair is stored on the node.

If we have four data objects, Object A, Object B, Object C and Object D, after hashing, the positions in the ring space are as follows: according to the consistent Hash algorithm, data A will be defined on Node A, B on Node B, C on Node C, and D on Node D.

2.4 advantages and disadvantages

Advantages: fault tolerance and scalability

Fault tolerance:

Assuming that Node C goes down, you can see that objects A, B, and D are not affected, and only C objects are relocated to Node D. In general, in the consistent Hash algorithm, if a server is not available, the data affected is only the data between this server and the previous server in its ring space (that is, the first server encountered when walking counterclockwise). The rest will not be affected. To put it simply, if C is dead, only the data between B and C is affected, and these data will be transferred to D for storage.

Disadvantages: data skew (few nodes are not suitable)

When there are too few service nodes, the consistency Hash algorithm is easy to cause data skew due to uneven distribution of nodes (most of the cached objects are cached on a certain server).

For example, there are only two servers in the system:

3 Hash slot calculation

In order to solve the skew problem of consistent hash algorithm

To solve the problem of uniform distribution, another layer is added between the data and the node, which is called slot, which is used to manage the relationship between the data and the node. now it is equivalent to putting slots on the nodes and data in the slots.

Slots solve the problem of granularity, which is equivalent to making the granularity larger, so that the data can be moved easily.

The hash solves the mapping problem and uses the hash value of key to calculate the slot to facilitate data allocation.

A cluster can only have 16384 slots, numbered 0-16383 (0-2 ^ 14-1). These slots are allocated to all primary nodes in the cluster, and the allocation policy is not required. You can specify which numbered slots are assigned to which primary node. The cluster records the correspondence between nodes and slots. After solving the relationship between the node and the slot, we need to hash the key, then take the remainder of 16384, and then fall into the corresponding slot when the remainder is a few key. Slot = CRC16 (key)% 16384. Move data in slots, because the number of slots is fixed, so it is easier to deal with, so the problem of data movement is solved.

There are 16384 hash slots built into the Redis cluster, and redis maps hash slots to different nodes according to the number of nodes. When you need to place a key-value in the Redis cluster, redis first uses the crc16 algorithm to calculate a result on the key, and then calculates the remainder of the result to 16384, so that each key corresponds to a hash slot numbered between 0 and 16383, that is, it is mapped to a node. In the following code, An and B of key are on Node2, and C of key is on Node3

Thank you for reading! This is the end of this article on "sample Analysis of distributed Redis Database". I hope the above content can be of some help to you, so that you can learn more knowledge. if you think the article is good, you can share it for more people to see!

Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.

Views: 0

*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.

Share To

Development

Wechat

© 2024 shulou.com SLNews company. All rights reserved.

12
Report