Load balancing Strategy of Nginx and removal of Common Fault nodes

2025-01-16 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)06/03 Report--

This article covers the load balancing strategies of Nginx and the removal of commonly failing nodes, and hopefully will be of some help in practical application. Load balancing touches on many topics; the theory is light and there is plenty of material online, so today we will answer from accumulated industry experience.

Abstract: this article introduces the load balancing strategies of Nginx, the principle of consistent hash allocation, and the common configuration for removing and recovering faulty nodes.

The previous article in this Nginx series (1), "Reverse proxy and configuration of Nginx", introduced one of Nginx's core functions in detail, the reverse proxy. This article focuses on the second core function: load balancing.

To appreciate load balancing, let's first look at what it can achieve:

- Binds multiple cloud server (CVM) nodes together to provide a unified service entry point.
- Provides failover: in the event of an accident it adds a layer of insurance and reduces losses.
- Reduces the complexity of online operations and maintenance, enabling smooth rollouts.

Both operations and development teams appreciate these benefits.

Let's officially get to the point.

I. The load balancing strategies of Nginx

Load balancing means distributing requests "evenly" across multiple business node servers. What counts as "even" here depends on the actual scenario and the needs of the business.

For Nginx, when a request arrives, Nginx, acting as the reverse proxy server, has full authority to decide where it goes: according to its rules, it assigns the request to one of the nodes it knows about. Through this distribution, the number of requests each node has to handle stays roughly even, which is what achieves load balancing.

Nginx supports many load balancing strategies. The key ones are as follows:

- round robin (polling, the default)
- random
- weight
- fair (by response time, third-party plugin)
- url_hash (hash of the URL)
- ip_hash (hash of the client IP)
- least_conn (least number of connections)

So many strategies are hard to memorize and choose between; grouping the common ones into categories makes selection easier.

The first category, best practice: round robin (polling) and random

"Best practice" here really means the most common, most widely used default configuration, which to some extent is of course also the best configuration. When you do not know which strategy to use, start with this category.

Round robin needs no further explanation. As for random: with a large number of requests, probability theory says it is statistically equivalent to round robin.
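This statistical equivalence is easy to check with a short simulation. This is an illustrative Python sketch, not Nginx code; the backend names are made up:

```python
import random
from collections import Counter

backends = ["backend1", "backend2", "backend3", "backend4"]
N = 100_000

# Strict round robin: the i-th request goes to backend i mod 4.
rr_counts = Counter(backends[i % len(backends)] for i in range(N))

# Uniform random choice, seeded so the sketch is reproducible.
rng = random.Random(0)
rand_counts = Counter(rng.choice(backends) for _ in range(N))

# Round robin is exactly even; random is even to within sampling noise.
print(sorted(rr_counts.values()))
print(sorted(rand_counts.values()))
```

Over 100,000 requests each backend receives very close to one quarter of the traffic under both policies, which is why the two belong to the same category.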

Polling configuration reference:

```nginx
# the default configuration uses the round-robin policy
upstream server_group {
    server backend1.example.com;
    server backend2.example.com;
}
```

Random configuration reference:

```nginx
upstream server_group {
    random;
    server backend1.example.com;
    server backend2.example.com;
    server backend3.example.com;
    server backend4.example.com;
}
```

The second category, performance first: weight, fair (by response time, third-party plugin), least_conn (least number of connections)

Letting the more capable machines among the business nodes receive more requests is also a sound allocation strategy.

What counts as a more capable machine? There are several dimensions to consider:

- By experience or hardware: machines are divided into high-weight and low-weight ones.
- By the response time of each node's requests: Nginx decides whether to allocate more or fewer requests accordingly.
- By the number of connections each node holds: in general, the node holding the most connections is handling the most tasks and is the busiest, so requests can be assigned to other machines instead.

Reference for the configuration of weights:

```nginx
upstream server_group {
    server backend1.example.com weight=5;
    server backend2.example.com;  # weight defaults to 1 when not configured
}
```

Response time (fair) configuration reference (the nginx-upstream-fair module must be added when compiling Nginx):

```nginx
upstream server_group {
    fair;
    server backend1.example.com;
    server backend2.example.com;
    server backend3.example.com;
    server backend4.example.com;
}
```

Minimum number of connections (least_conn) configuration reference:

```nginx
upstream server_group {
    least_conn;
    server backend1.example.com;
    server backend2.example.com;
}
```

The third category, stability first: ip_hash and url_hash

Many requests are stateful: whichever business node served the last request should also serve this one. The common session is a typical example of such stateful business.

For this, Nginx provides allocation rules keyed on the hash of the client IP or the hash of the URL. In essence, the idea is to find an immutable element in the user's request, extract it, and use it as the allocation tag.

ip_hash configuration reference:

```nginx
upstream server_group {
    ip_hash;
    server backend1.example.com;
    server backend2.example.com;
}
```

url_hash configuration reference:

```nginx
upstream server_group {
    hash $request_uri consistent;
    server backend1.example.com;
    server backend2.example.com;
    server backend3.example.com;
    server backend4.example.com;
}
```

II. Nginx supports consistent hash allocation

Nginx supports consistent hash allocation; it is enabled with the consistent parameter in the configuration, as in the hash directive above.

What is consistent hashing, and why introduce this mechanism? In a production environment, business nodes are frequently added or removed, sometimes involuntarily, and any such change can disturb the hash allocation. Consistent hashing was invented to minimize that impact.

Consistent hash addresses two issues:

- Plain hashing can distribute requests very unevenly.
- A node change affects not only the requests assigned to that node, but also causes requests on other nodes to be redistributed.

1) How to solve the problem of uneven distribution

Replicate each original node into N virtual nodes, and give each virtual node its own name.

For example, suppose there were originally 5 nodes and the distribution was often uneven. If each node is now replicated into N virtual nodes, there are 5 × N points on the ring, which greatly reduces the unevenness. Next, let's look at how the distribution is actually done.

2) How to solve the problem of node changes

The basic idea of consistent hash:

- Define a numeric space of [0, 2^32 - 1], equivalent to a line segment running from 0 to 2^32 - 1.
- Map the nodes onto the segment: every node, including each virtual node, is run through the hash algorithm to get a value, which places it at a position in this interval.

As shown in the following picture.

- Compute the hash of the data: run the key string from the request through the hash algorithm to get a value, which locates a position on the segment. If the computed value would exceed 2^32 - 1, it is treated as 0; under this rule the segment effectively joins end to end, forming a ring, which is why it is also called a hash ring.
- Find the data's home node on the segment: move right along the segment to the nearest node, and take that node as the data's home node.
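The mechanics above can be sketched in a few dozen lines of Python. This is an illustration of the algorithm only, not Nginx's internal implementation; the class name, the choice of MD5, and the virtual-node labels are all assumptions of the sketch:

```python
import bisect
import hashlib

class HashRing:
    """Minimal consistent-hash ring with virtual nodes (illustrative only)."""

    def __init__(self, nodes, vnodes=100):
        self.vnodes = vnodes
        self.ring = {}          # hash position on the ring -> physical node
        self.sorted_keys = []   # sorted hash positions
        for node in nodes:
            self.add_node(node)

    def _hash(self, key):
        # Map a string onto the [0, 2^32 - 1] space described above.
        # (Hash collisions between virtual nodes are ignored in this sketch.)
        return int(hashlib.md5(key.encode()).hexdigest(), 16) % (2 ** 32)

    def add_node(self, node):
        # Each physical node contributes `vnodes` points on the ring.
        for i in range(self.vnodes):
            pos = self._hash(f"{node}#vn{i}")
            self.ring[pos] = node
            bisect.insort(self.sorted_keys, pos)

    def remove_node(self, node):
        for i in range(self.vnodes):
            pos = self._hash(f"{node}#vn{i}")
            del self.ring[pos]
            self.sorted_keys.remove(pos)

    def get_node(self, key):
        # Walk right along the segment to the nearest virtual node,
        # wrapping past the end back to position 0 (the "ring" closure).
        pos = self._hash(key)
        idx = bisect.bisect_right(self.sorted_keys, pos)
        if idx == len(self.sorted_keys):
            idx = 0
        return self.ring[self.sorted_keys[idx]]

ring = HashRing(["NodeA", "NodeB", "NodeC"])
print(ring.get_node("some-request-key"))
```

Removing a node from this ring only remaps the keys that node owned; every other key keeps its home node, which is exactly the property discussed next.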

Let's take a look at the impact of node changes on consistent Hash.

- Node removal: if NodeA suddenly fails, the data originally allocated to the other nodes does not change; only the data that was allocated to NodeA has to find the next nearest point. This minimizes hash redistribution and is the biggest advantage of consistent hashing.
- Node addition: when request volume grows and a node must be added to spread the load, the new node NodeE and its group of virtual nodes are placed on the hash ring according to their hash values. Under the consistent hash rules, most data still maps to the same node as before; only the keys whose computed values now fall closest to NodeE change hands, and their number is limited, so the impact of adding a node is small.

III. Removal and recovery of faulty nodes

Take a look at the classic configuration before explaining it in detail.

```nginx
upstream server_group {
    server backend1.example.com;
    server backend2.example.com max_fails=3 fail_timeout=30s;
    server backup1.example.com backup;
}
```

max_fails=number

This parameter sets how many failed requests to a backend node it takes before the node is suspended, after which no new requests are sent to it. The default value is 1. It must be used together with fail_timeout.

A digression: there are many ways to define a "failure". Since we mainly deal with HTTP proxying, we pay the most attention to proxy_next_upstream.

proxy_next_upstream defines the conditions under which a request is passed on to the next node when something goes wrong on a service node; in other words, it defines what counts as a business node failure.
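For example, a proxy configuration might define failure like this. The directive and its values are standard Nginx, but which conditions to retry on is a per-service judgment call, so treat this as a sketch:

```nginx
location / {
    proxy_pass http://server_group;
    # Treat network errors, timeouts, and these 5xx replies from a node
    # as a failure: the request is retried on the next node, and the
    # failure counts toward that node's max_fails.
    proxy_next_upstream error timeout http_502 http_503 http_504;
}
```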

fail_timeout=time

This determines how long the node stays suspended once Nginx judges it unavailable. If not configured, the default is 10s.

Taking the two parameters together: once Nginx sees 3 failed requests to this node, it removes the node for 30s before sending requests to it again.

backup

Similar to the default branch of a switch statement: when the primary nodes are down, requests are routed to the backup node. It is the reinforcement of last resort.

That covers the load balancing strategies of Nginx and the removal of common fault nodes; hopefully it will help in practical application.
