How the Load Balancers LVS, Nginx and HAProxy Work

2025-01-16 Update From: SLTechnology News&Howtos

Shulou (Shulou.com) 05/31 Report--

This article explains in detail how the load balancers LVS, Nginx and HAProxy work. Interested readers can use it as a reference; I hope you will find it helpful.

At present, most Internet systems use server clusters: the same service is deployed on multiple servers, which then provide service as a whole. Such clusters may be Web application server clusters, database server clusters, distributed cache server clusters, and so on.

In practice, a load balancer always sits in front of the Web server cluster. Its task is to act as the entry point for Web traffic: it selects the most suitable Web server and forwards the client's request to it, achieving forwarding that is transparent to the client.

Cloud computing and distributed architectures, which have become very popular in recent years, essentially treat back-end servers as computing and storage resources that a management server encapsulates into a service offered to the outside. The client does not need to care which machine actually serves it; it appears to be facing a single server of almost unlimited capacity, while in reality it is the back-end cluster that provides the service.

LVS, Nginx and HAProxy are currently the three most widely used software load balancers.

In general, different load balancing technologies are adopted at different stages as a website grows, and the specific choice must be analyzed against the actual requirements. For a small or medium-sized Web application, say with fewer than 10 million page views per day, Nginx alone is entirely sufficient; with many machines, DNS round-robin can also be used, whereas LVS consumes more machines; for large websites or important services with many servers, LVS is worth considering.

A fairly popular and reasonable architecture scheme at present is: the Web front end uses Nginx/HAProxy + Keepalived as the load balancer, and the back end uses a MySQL database with one master and multiple slaves, read/write separation, and an LVS + Keepalived architecture.

I. LVS

LVS is the abbreviation of Linux Virtual Server. LVS is now part of the standard Linux kernel: since kernel 2.4, its functional modules have been fully built in, so its features can be used directly without patching the kernel.

LVS dates back to 1998 and has long been a mature project.

1. The architecture of LVS

The server cluster system set up by LVS consists of three parts:


The front-end load balancing layer, represented by Load Balancer

The server cluster layer in the middle, represented by Server Array

The bottommost data sharing storage layer, represented by Shared Storage

2. LVS load balancing mechanism

Unlike HAProxy and other layer-7 software load balancers, LVS is not designed to inspect HTTP packets, so it cannot do URL parsing or the other tasks a layer-7 load balancer can perform.

LVS is a layer-4 load balancer: it is built on the transport layer, the fourth layer of the OSI model, home of the familiar TCP and UDP protocols, and accordingly LVS supports TCP/UDP load balancing. Because it works at layer 4, LVS is very efficient compared with higher-level load balancing solutions such as DNS round-robin resolution, application-layer load scheduling, or client-side scheduling.

So-called layer-4 load balancing forwards mainly on the basis of the destination address and port in the packet. Layer-7 load balancing, also known as "content switching", forwards mainly on the basis of the genuinely meaningful application-layer content of the message.

LVS forwards mainly by modifying IP addresses (NAT mode, divided into source address modification, SNAT, and destination address modification, DNAT) or by modifying the destination MAC address (DR mode).

3. NAT mode: network address translation

NAT (Network Address Translation) is a technique for mapping addresses between an external network and an internal network.

In NAT mode, both inbound and outbound network packets must pass through LVS, so LVS has to act as the gateway for the real servers (RS).

When a packet arrives at LVS, LVS performs destination address translation (DNAT), changing the destination IP to that of an RS. To the RS, the packet appears to have been sent directly by the client. When the RS finishes processing and returns its response, the source IP is the RS IP and the destination IP is the client's IP. At this point the RS's packet passes back through its gateway (LVS), and LVS performs source address translation (SNAT), changing the packet's source address to the VIP; to the client, the packet then looks as if it were returned directly by LVS.
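As a concrete sketch, the NAT flow described above is usually set up on the director with the `ipvsadm` tool. All addresses below (VIP 192.168.0.100, real servers 10.0.0.11/12) are hypothetical examples, and the commands require root plus the ip_vs kernel module:

```shell
# Create a virtual TCP service on the VIP, port 80, round-robin scheduling
ipvsadm -A -t 192.168.0.100:80 -s rr

# Add two real servers in masquerading (NAT) mode (-m)
ipvsadm -a -t 192.168.0.100:80 -r 10.0.0.11:80 -m
ipvsadm -a -t 192.168.0.100:80 -r 10.0.0.12:80 -m
```

For the SNAT step on the return path to work, each real server must use the LVS director as its default gateway.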

4. DR mode: direct routing

In DR mode, the LVS director and the RS cluster must be bound to the same VIP (each RS achieves this by binding the VIP to its loopback interface). Unlike NAT mode, the request is received by LVS but the response is returned to the user directly by the real server (Real Server, RS), without passing back through LVS.

In detail, when a request arrives, LVS only needs to change the destination MAC address of the network frame to the MAC of a particular RS, and the packet is then forwarded to that RS for processing. Note that the source IP and destination IP remain unchanged; LVS modifies only the MAC address. When the RS receives the packet forwarded by LVS, its link layer finds that the MAC is its own, and the network layer above finds that the IP is also its own, so the packet is legitimately accepted; the RS is unaware of LVS's existence. When the RS returns its response, it sends it directly back to the source IP (that is, the user's IP) without going through LVS.

In DR mode, IP addresses are not modified during distribution; only the MAC address is changed. Since the RS's actual IP address matches the destination IP of the request, no address translation by the load balancer is needed, and the response packet can be returned directly to the user's browser. This prevents the load balancer's network card bandwidth from becoming a bottleneck. DR mode therefore performs very well and is currently the most widely used load balancing method on large websites.
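A minimal sketch of a DR setup, again with hypothetical addresses (VIP 192.168.0.100) and requiring root:

```shell
# On the director: create the service and add real servers in DR mode (-g)
ipvsadm -A -t 192.168.0.100:80 -s wrr
ipvsadm -a -t 192.168.0.100:80 -r 192.168.0.11:80 -g -w 2
ipvsadm -a -t 192.168.0.100:80 -r 192.168.0.12:80 -g -w 1

# On each real server: bind the VIP to loopback and suppress ARP for it,
# so that only the director answers ARP requests for the VIP
ip addr add 192.168.0.100/32 dev lo
sysctl -w net.ipv4.conf.all.arp_ignore=1
sysctl -w net.ipv4.conf.all.arp_announce=2
```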

5. Advantages of LVS

Strong load capacity: working at the transport layer purely for distribution, with no traffic terminating on it, LVS is the best-performing of the load balancing software and has comparatively low memory and CPU consumption.

Few configuration options: this is both a disadvantage and an advantage, because with little to configure, little hands-on intervention is needed, which greatly reduces the probability of human error.

Stable operation: it has strong load capacity and complete dual-machine hot-standby schemes, such as LVS + Keepalived.

No traffic: LVS only distributes requests, and the traffic does not flow out from LVS itself, which ensures that the balancer's IO performance is not affected by heavy traffic.

Wide applicability: because it works at the transport layer, LVS can load-balance almost any application, including HTTP, databases, online chat rooms, and so on.

6. Shortcomings of LVS

The software itself does not support regular-expression processing and cannot separate dynamic from static content, something many websites now strongly need; this is where Nginx and HAProxy + Keepalived have the advantage.

If the website application is relatively large, an LVS/DR + Keepalived deployment is complex to implement; by comparison, Nginx/HAProxy + Keepalived is much simpler.

II. Nginx

Nginx is a powerful Web server that handles highly concurrent HTTP requests and performs load balancing as a reverse proxy. Its advantages include high performance, light weight, low memory consumption, and strong load balancing capability.

1. The architecture design of Nginx

Traditional process- or thread-based models (Apache uses this model) handle concurrent connections by creating a separate process or thread for each connection, which then blocks during network or input/output operations. This consumes significant memory and CPU, because each new process or thread needs a fresh runtime environment, including heap and stack allocation and a new execution context, which of course also incurs extra CPU overhead. Ultimately, server performance deteriorates due to excessive context switching.

The architecture of Nginx, by contrast, is modular, event-driven, asynchronous, single-threaded, and non-blocking.

Nginx makes heavy use of multiplexing and event notification. After startup it runs in the background as a system daemon, consisting of one master process and n (n >= 1) worker processes. All processes are single-threaded (that is, each has only one main thread), and inter-process communication mainly uses shared memory.

The master process receives signals from the outside world, sends signals to the worker processes, and monitors their working status. The worker processes are the real handlers of external requests; each worker is independent of the others and competes equally for client requests. A request is handled entirely within one worker process, and since a worker has only one main thread, it can process only one request at a time. (The principle is very similar to Netty's.)
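This process model corresponds to a few standard nginx.conf directives; the values below are illustrative examples only, not recommendations:

```nginx
worker_processes  4;             # number of worker processes, often one per CPU core

events {
    use epoll;                   # event notification mechanism on Linux
    worker_connections  10240;   # max concurrent connections per worker
}
```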

2. Nginx load balancing

Nginx load balancing works mainly at layer 7, the application layer of the seven-layer network model, and supports HTTP and HTTPS.

Nginx load-balances by acting as a reverse proxy. A reverse proxy (Reverse Proxy) is a proxy server that accepts connection requests from the Internet, forwards them to servers on the internal network, and returns the results obtained from those servers to the client on the Internet that requested the connection; externally, the proxy server appears to be the server itself.

Nginx supports many load distribution strategies. Its upstream module currently supports the following:

Round-robin (default): each request is assigned to a different backend server in turn, in chronological order. If a backend server goes down, it is automatically removed.

Weight: specifies the polling probability; the weight is proportional to the share of traffic and is used when back-end server performance is uneven.

Ip_hash: each request is allocated according to a hash of the client IP, so each visitor consistently reaches the same back-end server, which can solve the session problem.

Fair (third party): requests are allocated according to the back-end servers' response times, with shorter response times given priority.

Url_hash (third party): requests are allocated according to a hash of the requested URL, so each URL is directed to the same backend server; this is most effective when the backend servers cache content.
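A minimal nginx.conf sketch combining the reverse-proxy setup with the strategies above; the server addresses and the upstream name `backend` are hypothetical:

```nginx
upstream backend {
    # ip_hash;                        # uncomment for session affinity by client IP
    server 10.0.0.11:8080 weight=3;   # weighted round-robin
    server 10.0.0.12:8080 weight=1;
    server 10.0.0.13:8080 backup;     # used only when the others are down
}

server {
    listen 80;
    location / {
        proxy_pass http://backend;               # reverse-proxy to the pool
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
}
```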

3. Advantages of Nginx

Cross-platform: Nginx can be compiled and run on most Unix-like OSes, and a Windows port also exists.

Extremely simple configuration: very easy to use; the configuration style reads like program code.

Non-blocking, highly concurrent connections: official tests support 50,000 concurrent connections; real production environments reach 20,000 to 30,000.

Event-driven: the communication mechanism uses the epoll model, supporting larger numbers of concurrent connections.

Master/Worker structure: a master process spawns one or more worker processes.

Small memory footprint: handling large numbers of concurrent requests consumes very little memory. With 30,000 concurrent connections, 10 running Nginx processes consume only about 150 MB of memory (15 MB x 10 = 150 MB).

Built-in health checks: if a Web server behind the Nginx proxy goes down, front-end access is not affected.

Bandwidth savings: GZIP compression is supported, and headers can be added so browsers use their local cache.

High stability: as a reverse proxy, the probability of downtime is minimal.

4. Shortcomings of Nginx

Nginx supports only HTTP, HTTPS, TCP, mail and a few other protocols, so its range of application is narrower; this is its disadvantage.

Health checks of backend servers are supported only by port, not by URL. Direct Session persistence is not supported, although ip_hash can work around this.

III. HAProxy

HAProxy supports two proxy modes, TCP (layer 4) and HTTP (layer 7), as well as virtual hosts.

The advantages of HAProxy make up for some of Nginx's shortcomings, such as Session persistence and Cookie-based guidance, and the ability to check a back-end server's status by fetching a specified URL.

HAProxy, like LVS, is purely load balancing software; in terms of efficiency, it has a better load balancing speed than Nginx and outperforms it in concurrent processing. HAProxy supports load-balanced forwarding of the TCP protocol: it can balance MySQL reads, performing health checks and load balancing across the backend MySQL nodes. (LVS + Keepalived can likewise be used to balance a MySQL master/slave setup.)

HAProxy offers many load balancing strategies: roundrobin (round robin), weighted round robin (via server weights), source (source address hashing, for persistence), uri (by request URI), and rdp-cookie (by cookie).
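A minimal haproxy.cfg sketch illustrating one of these strategies together with the URL-based health check and cookie persistence mentioned above; all names and addresses are hypothetical:

```haproxy
frontend http_in
    bind *:80
    mode http
    default_backend web_servers

backend web_servers
    mode http
    balance roundrobin                  # or: leastconn, source, uri, rdp-cookie
    option httpchk GET /health          # layer-7 health check against a URL
    cookie SRV insert indirect nocache  # cookie-based session persistence
    server web1 10.0.0.11:8080 check cookie web1
    server web2 10.0.0.12:8080 check cookie web2
```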

That is how the load balancers LVS, Nginx and HAProxy work. I hope the above content has been of some help and that you have learned something from it. If you found the article useful, please share it so that more people can see it.
