How to realize load balancing in web distributed system

This article introduces how to achieve load balancing in a web distributed system. Many people run into this problem in real projects, so let's walk through how to handle these situations. I hope you read it carefully and come away with something useful!

What is "load balancing"?

As shown in the figure, "load balancing" is the process of converging traffic through an independent, unified entry point and then redistributing it. Like the "distributed system" itself, it is a form of "divide and conquer".

If you use navigation software while driving, you will notice that there is an upper limit on the number of recommended routes, such as 3 or 5. In essence, the navigation software is also playing a "load balancing" role of sorts: since you can only choose among the top 3 unobstructed routes, heavily congested routes are never recommended to you, and the traffic pressure is shared with relatively idle routes.

The same is true in software systems: to prevent uneven traffic distribution from overloading individual nodes (a saturated CPU, for example), an independent, unified entry point is introduced to do similar "navigation" work. The difference between "load balancing" in a software system and navigation is that navigation only suggests, leaving the final choice to the user, while the load balancer makes the decision itself.

Behind the balancing is a strategy at work, and behind the strategy are algorithms or logic. For example, the algorithms in navigation belong to the category of "path planning", which is subdivided into "static path planning" and "dynamic path planning", and there are various concrete algorithms under each branch, such as Dijkstra, A*, and so on. Likewise, load balancing in software systems is backed by many algorithms or pieces of logic, and coincidentally they also come in static and dynamic varieties.

Second, the commonly used "load balancing" strategies, illustrated.

Here is a list of the five strategies most commonly seen in day-to-day work.

01 Round robin (polling)

This is the simplest and most commonly used strategy: requests are distributed evenly, one to each server in turn. The approximate code is as follows.

int globalIndex = 0;  // note: this is a global (shared) variable, not a local one
try {
    return servers[globalIndex];
} finally {
    globalIndex++;
    if (globalIndex == 3) globalIndex = 0;  // wrap around after the last of the 3 servers
}

02 Weighted round robin

On top of round robin, the concept of a weight is added. Weight is a generalized concept that can be expressed in any way; in essence it is the idea that the more capable should do more work. For example, you can configure different weights according to the performance differences between hosts. The approximate code is as follows.

int matchedIndex = -1;
int total = 0;
for (int i = 0; i < servers.Length; i++) {
    servers[i].cur_weight += servers[i].weight;   // ① on each pass, increase by the weight (step = weight value)
    total += servers[i].weight;                   // ② accumulate each node's weight into the total
    if (matchedIndex == -1 || servers[matchedIndex].cur_weight < servers[i].cur_weight) {
        matchedIndex = i;                         // ③ if the current node's incremented value exceeds the candidate's, it becomes the candidate
    }
}
servers[matchedIndex].cur_weight -= total;        // ④ subtract the total from the selected node so it starts lower at the next election
return servers[matchedIndex];

The process this code goes through is shown in the table in the figure. The number in "()" is the self-increment value, i.e. cur_weight in the code.

It is worth noting that weighted round robin itself can be implemented in different ways. Although the final ratio is the same 2:1:2, the order in which requests are dispatched can differ. For example, compared with the case above, an order like "5-4, 3, 2-1" (each node handling a consecutive run of requests) ends up with the same proportion, but the effect is different: it is more likely to cause concurrency problems and server congestion, and the problem grows worse as the weights increase. For example, at 10:5:3 the result is "18-17-16-15-14-13-12-11-10-9, 8-7-6-5-4, 3-2-1".
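To make the difference in dispatch order concrete, below is a minimal, self-contained sketch in the same C#-style code as the snippets above; the Server class, server names and the 2:1:2 weights are illustrative assumptions, not from the original article. It simply wraps the selection logic from section 02 in a method and prints which node each of five consecutive requests goes to.

using System;

class Server
{
    public string Name;
    public int Weight;
    public int CurWeight;   // the "self-increment" value from the table, starts at 0
}

class SmoothWrrDemo
{
    // Same selection logic as in section 02, wrapped in a method.
    static Server Pick(Server[] servers)
    {
        int matchedIndex = -1;
        int total = 0;
        for (int i = 0; i < servers.Length; i++)
        {
            servers[i].CurWeight += servers[i].Weight;    // ① step up by the weight
            total += servers[i].Weight;                   // ② accumulate the total weight
            if (matchedIndex == -1 || servers[matchedIndex].CurWeight < servers[i].CurWeight)
                matchedIndex = i;                         // ③ keep the current maximum
        }
        servers[matchedIndex].CurWeight -= total;         // ④ penalize the winner for the next round
        return servers[matchedIndex];
    }

    static void Main()
    {
        var servers = new[]
        {
            new Server { Name = "A", Weight = 2 },
            new Server { Name = "B", Weight = 1 },
            new Server { Name = "C", Weight = 2 },
        };
        // Prints an interleaved order (A C B A C with this tie-breaking) rather than
        // a run of consecutive requests per node, which is the "smoother" behaviour discussed above.
        for (int n = 0; n < 5; n++)
            Console.Write(Pick(servers).Name + " ");
        Console.WriteLine();
    }
}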

03 Least connections

This is a form of dynamic load balancing that follows the real-time load of each node: keep track of the number of active connections on each node and return the one with the fewest. The approximate code is as follows.

var matchedServer = servers.OrderBy(e => e.active_conns).First();
matchedServer.active_conns += 1;
return matchedServer;
// Note: active_conns also needs to be decremented by 1 when the connection is closed.

04 Fastest response

This is also a dynamic load balancing strategy. Its essence is to allocate requests according to how each node has responded over the recent past: the faster the response, the more requests it receives. There are many ways to implement it; the figure above can be understood as recording the average time spent on requests over the most recent period and combining it with the earlier "weighted round robin", making it equivalent to weighted round robin at 2:1:3.

A digression: generally speaking, the network latency between nodes inside the same machine room is essentially identical, so differences in response time mainly reflect the processing capacity of the service. If this strategy is applied to requests handled across regions (for example Zhejiang -> Shanghai or Zhejiang -> Beijing), in most cases a periodic "ping" is used to obtain the latency instead, because working at OSI's L3 forwarding layer makes the data cleaner and more accurate.
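As a rough illustration of the idea, here is a minimal sketch, not from the original article, that turns recent average response times into weights which could then be fed into the weighted round robin of section 02; the Node class, the sample timings and the scale factor are all illustrative assumptions.

using System;
using System.Linq;

class Node
{
    public string Name;
    public double AvgResponseMs;   // average response time over the most recent window
    public int Weight;             // derived weight: faster node => larger weight
}

class FastestResponseDemo
{
    static void Main()
    {
        var nodes = new[]
        {
            new Node { Name = "A", AvgResponseMs = 30 },
            new Node { Name = "B", AvgResponseMs = 60 },
            new Node { Name = "C", AvgResponseMs = 20 },
        };

        // Derive weights roughly inversely proportional to the average response time,
        // then hand them to the weighted round robin from section 02.
        double baseMs = nodes.Min(n => n.AvgResponseMs);
        foreach (var n in nodes)
            n.Weight = (int)Math.Round(baseMs / n.AvgResponseMs * 3);   // the scale factor 3 is arbitrary

        foreach (var n in nodes)
            Console.WriteLine($"{n.Name}: avg {n.AvgResponseMs} ms -> weight {n.Weight}");
        // With 30/60/20 ms this yields weights of roughly 2:1:3, matching the example above.
    }
}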

05 Hash method

Hash-based load balancing differs from the previous strategies in that its result is determined by the client: some piece of identifying information carried by the client is run through a fixed hash function to spread requests across the nodes.

The hash function in the figure above uses the simplest and crudest approach, the "remainder (modulo) method".

A digression: besides the remainder method, there are other hash functions, such as "base conversion", "folding", the "mid-square method", and so on.

In addition, the parameter being hashed can be anything, as long as it is ultimately converted into an integer that can take part in the calculation. The most common usage is to take the source IP address as the parameter, so that requests from the same client land on the same server as much as possible.
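Below is a minimal sketch, not from the original article, of the remainder method applied to the source IP address; the server list and method names are illustrative assumptions.

using System;
using System.Linq;
using System.Net;

class IpHashDemo
{
    static string[] servers = { "10.0.0.1", "10.0.0.2", "10.0.0.3" };

    static string Pick(string clientIp)
    {
        // Convert the dotted IPv4 address into an integer, then take the remainder
        // by the number of servers so the same client always maps to the same node.
        byte[] bytes = IPAddress.Parse(clientIp).GetAddressBytes();
        uint asInt = BitConverter.ToUInt32(bytes.Reverse().ToArray(), 0);
        return servers[(int)(asInt % (uint)servers.Length)];
    }

    static void Main()
    {
        Console.WriteLine(Pick("192.168.1.10"));   // same client IP -> same server every time
        Console.WriteLine(Pick("192.168.1.11"));
    }
}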

Third, the advantages, disadvantages and applicable scenarios of common "load balancing" strategies.

We know that nothing is perfect, and load balancing strategies are no exception. The commonly used strategies listed above each have their own advantages, disadvantages, and applicable scenarios, which I have briefly sorted out as follows.

These load balancing algorithms are commonly used precisely because they are simple. Better results necessarily require higher complexity: for example, combining several simple strategies, making a comprehensive evaluation based on data sampled across more dimensions, or even building on prediction algorithms after data mining.

Fourth, use "health detection" to ensure high availability

No matter which strategy is used, machine failures and program failures are inevitable. Therefore, for load balancing to work well, it needs to be combined with some "health detection" mechanism: periodically check whether a server can still be connected to and whether it responds more slowly than expected. If a node is found to be in an "unavailable" state, it is temporarily removed from the candidate list to improve availability. There are three commonly used kinds of "health detection".
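As a rough illustration, here is a minimal sketch, not from the original article, of how a health flag might feed back into the candidate list: nodes marked unhealthy are skipped until a later check marks them healthy again. The Backend class and the simple flag are illustrative assumptions, with the flag meant to be updated by the probes described below.

using System;
using System.Collections.Generic;
using System.Linq;

class Backend
{
    public string Address;
    public bool Healthy = true;   // updated by a periodic probe (see the probes below)
}

class HealthAwarePool
{
    static int rrIndex = 0;

    static Backend Pick(List<Backend> all)
    {
        // Only nodes currently marked healthy take part in the round robin.
        var candidates = all.Where(b => b.Healthy).ToList();
        if (candidates.Count == 0) throw new InvalidOperationException("no healthy backend");
        return candidates[rrIndex++ % candidates.Count];
    }

    static void Main()
    {
        var pool = new List<Backend>
        {
            new Backend { Address = "10.0.0.1" },
            new Backend { Address = "10.0.0.2", Healthy = false },   // temporarily removed
            new Backend { Address = "10.0.0.3" },
        };
        for (int i = 0; i < 4; i++)
            Console.WriteLine(Pick(pool).Address);   // only .1 and .3 are used
    }
}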

01 HTTP probe

Use a GET/POST request against a fixed URL on the server and judge whether the returned content matches expectations. Generally, the HTTP status code and the content of the response are used to make the judgment.
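A minimal sketch of such an HTTP probe, not from the original article; the /health path and the expected "ok" keyword in the body are assumptions for illustration.

using System;
using System.Net.Http;
using System.Threading.Tasks;

class HttpProbe
{
    static readonly HttpClient client = new HttpClient { Timeout = TimeSpan.FromSeconds(2) };

    static async Task<bool> IsHealthy(string baseUrl)
    {
        try
        {
            HttpResponseMessage resp = await client.GetAsync(baseUrl + "/health");
            if (!resp.IsSuccessStatusCode) return false;      // judge by HTTP status code
            string body = await resp.Content.ReadAsStringAsync();
            return body.Contains("ok");                       // judge by response content
        }
        catch
        {
            return false;   // timeout or connection error counts as unhealthy
        }
    }

    static async Task Main()
    {
        Console.WriteLine(await IsHealthy("http://10.0.0.1:8080"));   // illustrative address
    }
}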

02 TCP probe

Probe the specified IP + port based on TCP's three-way handshake mechanism. For best practice you can learn from Aliyun's SLB mechanism, as shown in the following figure.

It is worth noting that, in order to release the connection as soon as possible, an RST is sent immediately after the three-way handshake completes to tear the TCP connection down.
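A minimal sketch of a TCP probe along these lines, not from the original article: it tries to complete the three-way handshake against IP + port within a timeout and then closes. The zero-timeout linger option is one common way to make the close send an RST rather than a normal FIN, roughly mirroring the behaviour described above.

using System;
using System.Net.Sockets;
using System.Threading.Tasks;

class TcpProbe
{
    static bool IsReachable(string ip, int port, int timeoutMs = 2000)
    {
        try
        {
            using (var client = new TcpClient())
            {
                // Closing with a zero-second linger makes the OS send an RST instead of a graceful FIN.
                client.LingerState = new LingerOption(true, 0);
                Task connect = client.ConnectAsync(ip, port);
                // Healthy only if the three-way handshake completes within the timeout.
                return connect.Wait(timeoutMs) && client.Connected;
            }
        }
        catch
        {
            return false;   // refused or unreachable counts as unhealthy
        }
    }

    static void Main()
    {
        Console.WriteLine(IsReachable("10.0.0.1", 80));   // illustrative address
    }
}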

03 UDP probe

Some applications may use the UDP protocol. Under this protocol, the specified IP + port can be probed with a message. For best practice you can again learn from Aliyun's SLB mechanism, as shown in the following figure.

The result is judged as follows: if the server returns nothing, it is considered healthy by default; otherwise an ICMP error message is returned.
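A minimal sketch of a UDP probe following that rule, not from the original article: a small datagram is sent, and "no error within the timeout" counts as healthy, while an ICMP error surfacing as a socket error counts as unhealthy. The payload and the exact error mapping are illustrative and platform-dependent.

using System;
using System.Net;
using System.Net.Sockets;
using System.Text;

class UdpProbe
{
    static bool IsHealthy(string ip, int port, int timeoutMs = 2000)
    {
        using (var udp = new UdpClient())
        {
            udp.Client.ReceiveTimeout = timeoutMs;
            byte[] payload = Encoding.ASCII.GetBytes("ping");
            udp.Send(payload, payload.Length, ip, port);
            try
            {
                var remote = new IPEndPoint(IPAddress.Any, 0);
                udp.Receive(ref remote);     // any reply also counts as healthy
                return true;
            }
            catch (SocketException ex)
            {
                // ConnectionReset is how an ICMP "port unreachable" typically surfaces here;
                // a plain timeout means no news, which is treated as good news by default.
                return ex.SocketErrorCode != SocketError.ConnectionReset;
            }
        }
    }

    static void Main()
    {
        Console.WriteLine(IsHealthy("10.0.0.1", 53));   // illustrative address and port
    }
}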

That concludes "how to realize load balancing in a web distributed system". Thank you for reading. If you want to learn more about the industry, you can keep following the site, where the editor will continue to publish practical articles for you!
