How to analyze Nginx load balancing strategy



1. Preface

As website loads keep growing, load balancing (load balance) is no longer an unfamiliar topic. Load balancing distributes traffic across different service units, keeping the service highly available and responses fast, so that users get a good experience.

The first public version of nginx was released in 2004, and version 1.0 followed in 2011. It is characterized by high stability, rich functionality, and low resource consumption. Judging by server market share, nginx already has the momentum to compete with Apache. One feature that must be mentioned is its load balancing capability, which has become the main reason many companies choose it.

This article introduces nginx's built-in and extended load balancing strategies from the source code perspective, then compares the strategies under production-style workloads to provide a reference for nginx users.

2. Source code analysis

The load balancing strategies of nginx fall into two categories: built-in strategies and extended strategies.

The built-in policies are weighted round robin and ip hash; they are compiled into the nginx core by default and are enabled simply by setting parameters in the nginx configuration. There are many extension strategies, such as fair, generic hash, and consistent hash, which are not compiled into the nginx core by default.
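For reference, a minimal configuration for the two built-in policies might look like the sketch below. The weight parameter and the ip_hash directive are standard nginx syntax; the back-end addresses are placeholders:

    upstream backend {
        # weighted round robin is the default policy;
        # weight defaults to 1 when omitted
        server 10.0.0.1 weight=5;
        server 10.0.0.2 weight=1;

        # uncomment to switch to the built-in ip hash policy
        # ip_hash;
    }

    server {
        listen 80;
        location / {
            proxy_pass http://backend;
        }
    }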

Since the load balancing code has not changed fundamentally across nginx version upgrades, the analysis below uses the nginx 1.0.15 stable release and examines each strategy from the source code perspective.

2.1. Weighted round robin

The principle of round robin is very simple. Let's first walk through its basic flow; the flowchart below shows how a single request is processed:

There are two points to note in the figure:

First, if weighted round robin algorithms are divided into depth-first and breadth-first variants, nginx uses the depth-first variant: it keeps assigning requests to the highest-weight machine until that machine's current weight drops below the others', and only then starts assigning requests to the machine with the next highest weight.

Second, when all back-end machines are down, nginx immediately resets every machine's flag bits to the initial state, to avoid leaving all machines marked as timed out and thereby blocking the entire front end.

Next, let's look at the source code. The directory structure of nginx is very clear; weighted round robin lives in nginx-1.0.15/src/http/ngx_http_upstream_round_robin.[c|h]. I have annotated the important and hard-to-understand parts of the source. First, look at the key declarations in ngx_http_upstream_round_robin.h:
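The original listing did not survive, so here is a trimmed sketch of the fields that matter for this discussion; the full struct in the 1.0.x sources has a few more members, and the ngx_* types come from the nginx core headers:

    typedef struct {
        struct sockaddr  *sockaddr;        /* address of the back-end server */
        socklen_t         socklen;
        ngx_str_t         name;

        ngx_int_t         current_weight;  /* working weight, changes per request */
        ngx_int_t         weight;          /* configured weight, the restore value */

        ngx_uint_t        fails;           /* failures within the current window */
        time_t            accessed;
        ngx_uint_t        max_fails;
        time_t            fail_timeout;

        ngx_uint_t        down;            /* marked down in the configuration */
    } ngx_http_upstream_rr_peer_t;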

Their purpose can be roughly guessed from the variable names. Note the difference between current_weight and weight: the former is the sorting weight, which changes dynamically as requests are processed, while the latter is the configured value, used to restore the initial state.

Next, let's look at how the round robin state is created. The code is shown below:
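The listing is missing here as well; the following is a condensed sketch of the part of ngx_http_upstream_init_round_robin_peer that decides where the tried bitmap lives:

    /* condensed from ngx_http_upstream_init_round_robin_peer (1.0.x) */
    if (rrp->peers->number <= 8 * sizeof(uintptr_t)) {
        /* few servers: a single machine word holds the whole bitmap */
        rrp->tried = &rrp->data;
        rrp->data = 0;

    } else {
        /* many servers: allocate the bitmap from the request pool */
        n = (rrp->peers->number + (8 * sizeof(uintptr_t) - 1))
            / (8 * sizeof(uintptr_t));

        rrp->tried = ngx_pcalloc(r->pool, n * sizeof(uintptr_t));
        if (rrp->tried == NULL) {
            return NGX_ERROR;
        }
    }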

One variable here, tried, needs explanation: tried records whether each server has already been attempted for the current request. It is a bitmap. If there are no more than 32 servers, all server states can be recorded in a single int; if there are more, memory must be requested from the memory pool to store the bitmap.

For the use of this bitmap array, refer to the following code:
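The original snippet is gone, but the pattern used in the get-peer functions looks like this (condensed):

    /* i is the index of the candidate server */
    n = i / (8 * sizeof(uintptr_t));
    m = (uintptr_t) 1 << i % (8 * sizeof(uintptr_t));

    if (rrp->tried[n] & m) {
        continue;                /* this server was already attempted */
    }

    /* ... try the server; once chosen, remember the attempt ... */
    rrp->tried[n] |= m;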

Finally, there is the actual policy code: the logic is simple, and the implementation runs to only about 30 lines. Look at the code:
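The 30-line listing did not survive the transfer, so the standalone sketch below reproduces the behaviour described above: depth-first selection plus the reset-to-initial-state step. It is an illustration, not the verbatim nginx source, and the names peer_t and pick_peer are mine:

    #include <stddef.h>

    typedef struct {
        int weight;          /* configured weight: the restore value */
        int current_weight;  /* working weight, consumed as requests arrive */
    } peer_t;

    size_t pick_peer(peer_t *peers, size_t n)
    {
        size_t i, best;
        int all_zero = 1;

        for (i = 0; i < n; i++) {
            if (peers[i].current_weight > 0) {
                all_zero = 0;
                break;
            }
        }

        if (all_zero) {
            /* every weight is used up: restore the initial state */
            for (i = 0; i < n; i++) {
                peers[i].current_weight = peers[i].weight;
            }
        }

        /* depth-first: keep hitting the peer with the largest current_weight */
        best = 0;
        for (i = 1; i < n; i++) {
            if (peers[i].current_weight > peers[best].current_weight) {
                best = i;
            }
        }

        peers[best].current_weight--;
        return best;
    }

With weights {5, 1, 1} the selection sequence is A A A A A B C and then repeats, which is exactly the depth-first behaviour noted earlier.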

2.2. IP hash

IP hash is the other load balancing strategy built into nginx. Its flow is much like round robin; only the specific algorithm and strategy differ, as shown in the following figure:

For the core implementation of the ip hash algorithm, see the following code:
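The listing was lost here too; the sketch below captures the essence of the 1.0.x ip hash selection logic: start from seed 89, hash the first three octets of the client IPv4 address, and give up after 20 tries. Function and parameter names are mine:

    #include <stddef.h>

    #define IP_HASH_SEED  89   /* the seed the nginx module starts from */
    #define IP_HASH_TRIES 20   /* after 20 misses nginx falls back to round robin */

    /* addr: client IPv4 address (only the first 3 octets are hashed);
       down[i] != 0 means server i is unavailable.
       Returns the chosen index, or (size_t) -1 to request round robin fallback. */
    size_t ip_hash_pick(const unsigned char addr[4], const unsigned char *down,
                        size_t nservers)
    {
        unsigned int hash = IP_HASH_SEED;
        size_t tries, i, p;

        for (tries = 0; tries < IP_HASH_TRIES; tries++) {
            for (i = 0; i < 3; i++) {
                hash = (hash * 113 + addr[i]) % 6271;
            }

            p = hash % nservers;
            if (!down[p]) {
                return p;
            }
        }

        return (size_t) -1;   /* degenerate to the round robin policy */
    }

Note that only the first three octets take part in the hash, so all clients within one /24 network map to the same back end by design.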

As you can see, the hash value depends on both the client IP and the number of back-end machines. Testing shows that the algorithm above can produce at most 1045 distinct values consecutively, which is a hard limit of the algorithm. nginx adds a protection mechanism: if hashing still has not found an available machine after 20 tries, the algorithm degenerates to round robin.

In essence, therefore, the ip hash algorithm is a disguised round robin algorithm. If two IPs happen to produce the same initial hash value, requests from both will always land on the same server, which plants a deep hidden danger for balance.

2.3. Fair

The fair policy is an extension policy and is not compiled into the nginx core by default. It estimates load from the response time of each back-end server and routes requests to the machine with the lightest load.

This strategy is strongly adaptive, but the real network environment is often not that simple, so it should be used with caution.
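For completeness, enabling fair looks roughly like this. The fair directive comes from the third-party nginx-upstream-fair module, so nginx must be compiled with that module added; it is not available in a stock build:

    upstream backend {
        fair;                 # requires --add-module=.../nginx-upstream-fair
        server 10.0.0.1;
        server 10.0.0.2;
    }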

2.4. Generic hash and consistent hash

Generic hash and consistent hash are also extension strategies. Generic hash hashes a key built from nginx's built-in variables, while consistent hash uses a consistent hashing ring and is compatible with memcache.
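In the 1.0.x era both came as third-party upstream modules; later nginx (1.7.2+) gained a built-in hash directive that covers both cases. A sketch using the modern syntax:

    upstream backend {
        # drop "consistent" for plain generic hashing;
        # keep it for the ketama consistent hash ring
        hash $request_uri consistent;
        server 10.0.0.1;
        server 10.0.0.2;
    }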

3. Comparative testing

Now that we understand the load balancing strategies above, let's run some tests on them.

The main purpose is to compare the balance, consistency, and disaster tolerance of each strategy, analyze the differences, and, based on the data, identify the scenarios each is suited to.

In order to test nginx's load balancing strategies comprehensively and objectively, we use two testing tools and test under different scenarios, so as to reduce the influence of the environment on the results.

First of all, I will introduce the testing tools, the testing network topology and the basic testing process.

3.1. Testing tools

3.1.1 easyABC

easyABC is a performance testing tool developed internally at Baidu. It is implemented on the epoll model, is simple and easy to use, can simulate GET/POST requests, and in extreme cases can deliver pressure in the tens of thousands of qps; it is widely used within the team.

Because the object under test is a reverse proxy server, a stub server needs to be set up behind it. Here nginx serves as the stub web server, providing the most basic static file service.

3.1.2 polygraph

Polygraph is a free performance testing tool that is good at testing caching services, proxies, switches, and similar systems. It has a standard configuration language, PGL (Polygraph Language), which gives the software great flexibility. Its working principle is shown in the following figure:

Polygraph provides a client side and a server side, with the nginx under test placed between them. The three communicate over HTTP; only ip+port needs to be configured.

The client side can be configured with the number of virtual robots and the rate at which each robot sends requests; it then issues random static file requests to the proxy server. The server side generates random static files in response, based on the requested URL.

One of the main reasons for choosing this tool is that it can generate random URLs to serve as keys for nginx's various hash strategies.

In addition, polygraph provides log analysis tools with rich functionality; interested readers can consult the appendix materials.

3.2. Test environment

The tests run on five physical machines: the object under test is deployed on an 8-core machine, and the other four 4-core machines run easyABC, the stub webserver, and polygraph respectively. As shown in the following figure:

3.3. Test scheme

Let me first introduce the key test metrics:

Balance: whether requests are distributed evenly to the back end

Consistency: whether requests with the same key land on the same machine

Disaster tolerance: whether the system keeps working properly when some back-end machines go down

Guided by the metrics above, we use easyABC and polygraph to run the following four test scenarios:

Scenario 1: all server_* provide service normally

Scenario 2: server_4 goes down; the others are normal

Scenario 3: server_3 and server_4 go down; the others are normal

Scenario 4: all server_* return to normal service

The four scenarios are carried out in chronological order, each building on the previous one; the object under test requires no intervention of any kind, so as to simulate the real situation as closely as possible.

In addition, considering the characteristics of the testing tools, the test pressure is about 17,000 qps under easyABC and about 4,000 qps under polygraph. All tests confirmed that the object under test worked normally and produced no logs above notice level (alert/error/warn); the qps of each server_* was recorded in every scenario for the policy analysis.

3.4. Results

Comparing the results from the two testing tools, we find them consistent, so the influence of the testing tool can be ruled out. Table 1 and figure 1 show the load of the round robin strategy under the two tools.

The chart shows that the round robin strategy satisfies both balance and disaster tolerance.

Table 2 and figure 2 show the load of the fair strategy under the two tools. The fair strategy is heavily affected by the environment: even after removing the testing tools' interference, the results still show very large jitter.

Intuitively, this does not satisfy balance at all. But from another point of view, it is precisely this adaptability that lets fair make the most of a complex network environment. Therefore, before applying it in industrial production, it should be tested in the specific target environment.

The following charts cover the hash strategies; since they differ only in the hash key or the specific algorithm, they are compared together. Actual testing revealed that generic hash and consistent hash share a problem: when a back-end machine goes down, the traffic that was landing on that machine is lost. IP hash does not have this problem.

As the ip hash source code analysis above showed, when ip hash fails it degenerates into the round robin policy, so no traffic is lost. In this sense, ip hash can be seen as an upgraded version of round robin.

Figure 5 shows the ip hash policy. IP hash is built into nginx and can be seen as a special case of the previous two strategies, with the source IP as the key.

Since the testing tools are not good at simulating requests from a massive number of distinct IPs, real production traffic was captured and analyzed instead. As shown in the following figure:

Figure 5 IP Hash strategy

The first segment in the figure uses the round robin policy, the middle segment uses the ip hash policy, and the final segment returns to round robin. Clearly, the balance of ip hash is badly broken.

The reason is not hard to find: in the real network environment there are large numbers of shared egress nodes, such as university exit routers and enterprise exit routers. The traffic from such a node is often hundreds of times that of an ordinary user, and since ip hash splits traffic by IP, the result above follows naturally.

4. Summary

Through comparative testing we have verified each of nginx's load balancing strategies. The table below compares them in terms of balance, consistency, disaster tolerance, and applicable scenarios:

We have analyzed nginx's load balancing strategies from both the source code and real test data, and suggested suitable application scenarios for each. The analysis makes one thing clear: no strategy is a silver bullet, and which one to choose in a given scenario depends to a large extent on how well the user understands the strategies.
