In addition to Weibo, there is also WeChat
Please pay attention
WeChat public account
Shulou
2025-01-16 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Internet Technology >
Share
Shulou(Shulou.com)06/01 Report--
In this issue, the editor will bring you about the principle and application of consistent Hash. The article is rich in content and analyzes and narrates it from a professional point of view. I hope you can get something after reading this article.
Before we talk about consistency Hash, let's discuss a problem.
Question: now that there are hundreds of millions of users who generate tens of millions of orders every day, how to divide the orders into pieces and tables?
Xiao A: we can slice according to the Mantissa of the mobile phone number, and the mobile phone number with the same Mantissa is written into the same piece / the same table.
Boss: I hope to query all the order information of this member through the member ID. If it is divided into parts / tables according to the mobile number, the premise is that the user's mobile number remains the same, and when querying the order list, you need to query the user's mobile number in advance. It is not reasonable to use the Mantissa of the mobile number.
Little B: according to the boss's way of thinking, we need to find a unique constant attribute for sharding / subtable.
Boss: mystery smile ~
Xiao B: (full of confidence) what the members keep unchanged on our side is the member ID (int). We can split / divide the table through the Mantissa of the member ID.
Xiao C: as long as we can use the member ID Mantissa for slicing / subtable, then we can use the modular way to do the slicing / subtable, through the modular way to achieve a good balance. The schematic diagram is as follows:
Boss: mm-hmm, Xiao B and Xiao C say the plan is excellent without considering the cold and hot of the members, but often our members have hot and cold and zombie members, and by taking the model, there is often an unusually high data of a certain shard, and some of the shard data is unusually low, resulting in a balance tilt. The schematic diagram is as follows:
Boss: when a certain fragment / table reaches its limit, we need to add a slice / table, and we find that we can't add a slice / table normally. Because once adding a piece / or table will lead to most of the data confusion, according to the original modeling method is unable to obtain data normally. The schematic diagram is as follows
Before adding the shard / table, the orders of the members are respectively stored on the ID%N BMagne C. When adding the slice / table, when taking the model to get the order data of the members according to (member ID%N), it is found that the order data cannot be obtained, because at this time, the data of the three members are distributed on the DMagne EPersonA. The specific diagram is as follows:
Boss: so there will be some defects in the way of taking the module; all right, let's use the consistent hash principle to solve the sharding / subtable problem.
First of all, what is consistent hashing algorithm? Consistent hashing algorithm (Consistent Hashing Algorithm) is a distributed algorithm, which is often used for load balancing. Memcached client also chooses this algorithm to solve the problem of evenly distributing key-value among many Memcached server. It can replace the traditional mode operation and solve the problem that the mode operation can not deal with the addition and deletion of Memcached Server (adding and deleting server will result in the same key, and the real data storage server cannot be allocated during the get operation, and the hit rate will drop sharply).
Taking the above problem as an example, if we have 10 pieces, we use the Hash algorithm to calculate a hash value for each piece, and these Hash points will be virtually distributed on the Hash circle. The theoretical view is as follows:
According to the clockwise direction, the arc between each point belongs to the capacity of each starting point, and then according to the same Hash calculation method, each member ID is calculated by Hash to get each Hash value, and then fall into the slice / table according to the interval to ensure the uniform distribution of data.
If we need to add a new piece / table (B1) between B and C at this time, it will not cause almost all the data confusion according to the modular form, but only affect the data between (B1 ~ C). In this way, it is more convenient for us to clean it out, and there will not be a large number of data.
Paralyzed.
But if we just calculate the hash value from the slice / table, the distribution of these points is not so uniform, such as the following situation, resulting in interval tilt. As shown in the picture
At this point, virtual nodes are born, so let's take a look at the role of virtual nodes in consistent Hash. When we add several points to the Hash ring, the distance between each point will be nearly equal. According to this idea, we can add several new ones.
Slice / table, but the cost is limited. We participate in the calculation by copying multiple copies of A, B, C ({A1-An}, {B1-Bn}, {C1-Cn}), and distribute the data in a clockwise direction, as shown below:
At this point, A = [A _ 1 ~ C _ 1) & [A _ 1 ~ 2 ~ C _ 2) & [A _ 3 ~ C) & [B _ 3 ~ ~ C _ 3) & [B _ 3 ~ C _ 3) & [B _ 3 ~ 4 ~ C _ 4) & [B _ 1 ~ (A)]; C = [C _ 1 ~ 2) & [C ~ 2 ~ (2)) & [C _ 3 ~ (3)) & [C _ 4 ~ (Magi) A _ 3). It can be seen from the chart that the denser the distribution points are, the better the balance is.
Using System
Using System.Collections.Generic
Using System.Data.HashFunction
Using System.Data.HashFunction.xxHash
Using System.Linq
Namespace HashTest
{
Public class ConsistentHash
{
/ / /
/ / number of virtual nodes
/ / /
Private static readonly int VirturalNodeNum = 10
/ / /
/ / Server IP
/ / /
Private static readonly string [] Nodes = {"192.168.1.1", "192.168.1.2", "192.168.1.3"}
/ / /
/ / grouping according to consistent Hash
/ / /
Private static readonly IDictionary ConsistentHashNodes = new Dictionary ()
Private static uint [] _ nodeKeys = null
Public static void ComputeNode ()
{
Foreach (var node in Nodes)
{
AddNode (node)
}
}
Private static void AddNode (string node)
{
For (int I = 0; I
< VirturalNodeNum; i++) { var key = node + ":" + i; var hashValue = ComputeHash(key); if (!ConsistentHashNodes.ContainsKey(hashValue)) { ConsistentHashNodes.Add(hashValue, node); } } _nodeKeys = ConsistentHashNodes.Keys.ToArray(); } private static uint ComputeHash(string virturalNode) { var hashFunction = xxHashFactory.Instance.Create(); var hashValue = hashFunction.ComputeHash(virturalNode); return BitConverter.ToUInt32(hashValue.Hash, 0); } public static string Get(string item) { var hashValue = ComputeHash(item); var index = GetClockwiseNearestNode(hashValue); return ConsistentHashNodes[_nodeKeys[index]]; } private static int GetClockwiseNearestNode(uint hash) { int begin = 0; int end = _nodeKeys.Length - 1; if (_nodeKeys[end] < hash || _nodeKeys[0] >Hash)
{
Return 0
}
While ((end-begin) > 1)
{
Var mid = (end + begin) / 2
If (_ NodeKeys [mid] > = hash) end = mid
Else begin = mid
}
Return end
}
}
}
These are the consistent Hash principles and applications shared by the editor. If you happen to have similar doubts, you might as well refer to the above analysis to understand. If you want to know more about it, you are welcome to follow the industry information channel.
Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.
Views: 0
*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.
Continue with the installation of the previous hadoop.First, install zookooper1. Decompress zookoope
"Every 5-10 years, there's a rare product, a really special, very unusual product that's the most un
© 2024 shulou.com SLNews company. All rights reserved.