2025-04-06 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/01 Report--
This article describes how virtual servers are implemented in a Linux server cluster system. It surveys the existing approaches in some detail and may serve as a useful reference.
Related methods for implementing virtual services:
In a network service, one end is the client program and the other is the server program, and there may be a proxy program in between. Load balancing across multiple servers can therefore be implemented at several different levels. The existing methods that use clusters to solve network service performance problems fall into the following four categories.
1. Solution based on RR-DNS
NCSA's scalable Web server system is the earliest prototype system based on RR-DNS (Round-Robin Domain Name System). Its structure and workflow are shown in the following figure:
Figure 1: RR-DNS-based scalable Web server
A set of Web servers shares all HTML documents through the distributed file system AFS (Andrew File System). The servers in this group share the same domain name (such as www.ncsa.uiuc.edu). When users access the site by this domain name, the RR-DNS server resolves the name to the IP addresses of the servers in turn, thereby distributing the access load across the servers.
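The rotation described above can be sketched in a few lines of Python. This is a minimal illustration, not NCSA's implementation: the class name `RoundRobinDNS`, the domain `www.example.org` and the addresses are all hypothetical.

```python
from itertools import cycle


class RoundRobinDNS:
    """Toy authoritative name server that rotates through a fixed address list."""

    def __init__(self, records):
        # records: domain name -> list of server IP addresses (illustrative values)
        self._rotations = {name: cycle(addrs) for name, addrs in records.items()}

    def resolve(self, name):
        # Each query returns the next address in the rotation.
        return next(self._rotations[name])


dns = RoundRobinDNS({"www.example.org": ["10.0.0.1", "10.0.0.2", "10.0.0.3"]})
answers = [dns.resolve("www.example.org") for _ in range(6)]
print(answers)
```

Six consecutive queries walk through the three addresses twice, so the load is even only as long as every request actually reaches this server.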
This approach brings several problems. First, the domain name service is a distributed system organized as a hierarchy. When a user submits a name resolution request to the local domain name server and that server cannot resolve the name directly, it forwards the request to a higher-level domain name server, which in turn forwards it upward until the RR-DNS server resolves the name to the IP address of one of the servers. There are thus several domain name servers between the user and the RR-DNS server, and each of them caches the resolved name-to-address mapping. As a result, all users behind the same group of domain name servers access the same Web server, which causes a serious load imbalance between Web servers. To keep the mapping from being cached for too long, RR-DNS sets a TTL (Time To Live) value on the mapping between the domain name and the IP address. Once the TTL expires, a domain name server evicts the mapping from its cache; on the next user request it forwards the query to a higher-level domain name server and obtains a new mapping. This raises the question of how to set the TTL. If the value is too large, many requests are mapped to the same Web server during the TTL period, which again leads to serious load imbalance. If the value is too small, for example 0, local domain name servers query the RR-DNS server frequently, which increases name resolution traffic and turns the RR-DNS server into a new bottleneck in the system.
Second, the user's machine caches the name-to-address mapping regardless of the TTL value, so all of that user's requests go to the same Web server. User access is bursty and access patterns differ: some people leave after a single visit, while others browse for hours. The load across servers therefore remains skewed and cannot be controlled. Assuming an average of 20 requests per session, the most heavily loaded server receives over 30% more requests than the per-server average. In other words, even with a TTL of 0, the burstiness of user access still causes a serious load imbalance.
Third, the reliability and maintainability of the system are poor. If a server fails, users whose domain name resolved to that server see the service interrupted, and pressing the "Reload" button does not help. Nor can the system administrator take a server out of service for maintenance at any time, for example to upgrade the operating system or application software. Doing so requires modifying the IP address list in the RR-DNS server to remove that server's address, and then waiting several days or longer until every domain name server has evicted the mapping to this server and every client mapped to it has stopped using the site.
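The TTL trade-off can be made concrete with a toy caching resolver. This is a rough sketch under stated assumptions: `CachingResolver`, `make_round_robin`, `run`, the two addresses and the one-request-per-second clock are all invented for illustration and do not model a real DNS server.

```python
from itertools import cycle


def make_round_robin(addresses):
    # Stand-in for the RR-DNS server: returns the next address on every query.
    rotation = cycle(addresses)
    return lambda name: next(rotation)


class CachingResolver:
    """Toy local name server that caches an upstream answer for ttl seconds."""

    def __init__(self, upstream, ttl):
        self.upstream = upstream      # callable: name -> IP address
        self.ttl = ttl
        self.cache = {}               # name -> (address, expiry time)
        self.upstream_queries = 0

    def resolve(self, name, now):
        entry = self.cache.get(name)
        if entry is not None and now < entry[1]:
            return entry[0]           # cached: every client gets the same server
        self.upstream_queries += 1    # cache miss: ask the RR-DNS server again
        address = self.upstream(name)
        self.cache[name] = (address, now + self.ttl)
        return address


def run(ttl, requests=10):
    resolver = CachingResolver(make_round_robin(["10.0.0.1", "10.0.0.2"]), ttl)
    answers = [resolver.resolve("www.example.org", now=t) for t in range(requests)]
    return answers, resolver.upstream_queries


long_answers, long_queries = run(ttl=100)   # large TTL: one server takes everything
zero_answers, zero_queries = run(ttl=0)     # TTL 0: every request hits RR-DNS
print(long_queries, set(long_answers))
print(zero_queries, zero_answers.count("10.0.0.1"))
```

With the large TTL all ten requests land on one server after a single upstream query; with a TTL of 0 the two servers get five requests each, at the price of ten upstream queries.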
2. Client-based solution
The client-based solution requires every client program to have some knowledge of the server cluster and to send its requests to the different servers in a load-balanced way. For example, when a Netscape Navigator browser visits the Netscape home page, it randomly picks a number N from a pool of more than a hundred servers and sends the request to wwwN.netscape.com. This is not a good solution, however. Netscape merely uses its own Navigator to sidestep the problems of RR-DNS resolution; with other browsers such as IE, RR-DNS resolution is still unavoidable.
Smart Client [3] is another client-based solution, developed at Berkeley. The service provides a Java applet that runs in the client's browser. The applet queries each server to collect information such as server load, and then sends the client's requests to a suitable server based on that information. High availability is also implemented in the applet: when a server does not respond, the applet forwards the request to another server. This method is not transparent to the client, the applet's queries to every server add extra network traffic, and it is not generally applicable.
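The Navigator behaviour amounts to a random choice made on the client. A minimal sketch follows, assuming a pool of exactly 100 servers (the text only says "more than a hundred") and a hypothetical helper name `pick_server`:

```python
import random


def pick_server(pool_size, rng=random):
    # Choose N uniformly from 1..pool_size and build the wwwN host name.
    n = rng.randint(1, pool_size)
    return f"www{n}.netscape.com"


# pool_size=100 is an assumed value for illustration only.
host = pick_server(100, random.Random(0))
print(host)
```

Because the choice is baked into the client, the server side has no way to steer load or to route around a failed host.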
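The selection logic attributed to the applet can be sketched as follows. This is Python rather than the Java the applet would use, and the function `choose_server`, the `probe` callback and the host names are assumptions made for illustration; the source does not describe Smart Client's actual interface.

```python
def choose_server(servers, probe):
    """Return the responsive server with the lowest reported load.

    probe(server) returns a load figure, or raises OSError when the server
    does not answer. Both conventions are assumptions of this sketch.
    """
    loads = {}
    for server in servers:
        try:
            loads[server] = probe(server)   # one extra round trip per server
        except OSError:
            continue                        # unresponsive: fail over to the others
    if not loads:
        raise RuntimeError("no server responded")
    return min(loads, key=loads.get)


# Fake load reports; None marks a server that is down.
reported = {"a.example.org": 0.7, "b.example.org": None, "c.example.org": 0.2}


def fake_probe(server):
    load = reported[server]
    if load is None:
        raise OSError("no response")
    return load


chosen = choose_server(list(reported), fake_probe)
print(chosen)
```

The loop makes the cost visible: every client probes every server before it sends a single real request.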
3. Solution based on application-layer load balancing scheduling
Multiple servers are connected into a cluster system through a high-speed interconnection network, with an application-layer load scheduler at the front end. When a user's request arrives at the scheduler, it is passed up to the load balancing application, which analyzes the request, selects a server according to the load of each server, rewrites the request, sends it to the selected server, and then returns the result to the user.
Typical examples of application-layer load balancing scheduling are the Zeus load scheduler [4], pWeb [5], Reverse-Proxy [6] and SWEB [7]. The Zeus load scheduler is a commercial product of the Zeus company. It is a modified version of the Zeus Web server program and uses a single-process, event-driven server structure. pWeb is a parallel Web scheduler based on the Apache 1.1 server program. When an HTTP request arrives, pWeb selects a server, rewrites the request, sends the rewritten request to that server, and then forwards the result to the client. Reverse-Proxy uses the Proxy and Rewrite modules in Apache 1.3.1 to implement a scalable Web server. It differs from pWeb in that it first looks in the Proxy cache; only if no copy is cached does it select a server, send the request to it, and forward the result the server returns to the client. SWEB uses HTTP redirect response codes. The client's request is sent to one Web server, which either processes the request itself, depending on its own load, or redirects the client to another Web server, thereby achieving a scalable Web server.
Multi-server solutions based on application-layer load balancing scheduling also have problems. First, the processing overhead is so high that the scalability of the system is limited. From the moment a request arrives at the load scheduler until processing finishes, the scheduler performs four context switches and memory copies between kernel space and user space. Two TCP connections are required, one from the user to the scheduler and another from the scheduler to the real server, and the request must be analyzed and rewritten. These steps consume considerable CPU, memory and network resources and take a long time. System performance does not scale anywhere near linearly: typically, once the server group grows to 3 or 4 machines, the scheduler itself may become the new bottleneck. The scalability of application-layer load balancing scheduling is therefore extremely limited. Second, an application-layer load scheduler has to be written separately for each application. The systems above are all based on the HTTP protocol; for FTP, Mail, POP3 and other applications, the scheduler has to be rewritten.
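The analyze, select and rewrite steps can be sketched for a single HTTP request. This is a simplified illustration, not the code of any product named in this section: the function `schedule`, the load table and the backend addresses are hypothetical, and rewriting only the `Host` header is an assumption about what "rewrites the request" involves.

```python
def schedule(raw_request, backend_loads):
    """Pick the least-loaded backend and rewrite the Host header for it."""
    backend = min(backend_loads, key=backend_loads.get)
    lines = raw_request.split("\r\n")
    rewritten = [
        f"Host: {backend}" if line.lower().startswith("host:") else line
        for line in lines
    ]
    return backend, "\r\n".join(rewritten)


request = "GET /index.html HTTP/1.0\r\nHost: www.example.org\r\n\r\n"
# Hypothetical load figures, e.g. active connections per backend.
backend, rewritten_request = schedule(request, {"10.0.0.1:8080": 5, "10.0.0.2:8080": 2})
print(backend)
```

Even in this toy form the scheduler must read and parse the whole request in user space before it can forward anything, which is where the overhead of this approach comes from.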
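SWEB's redirect-based approach can be illustrated with a small decision function. The source does not say how SWEB decides when to redirect, so the load threshold, the peer load table and the name `handle_request` are all assumptions of this sketch; status code 302 is the standard HTTP temporary redirect.

```python
def handle_request(path, my_load, peer_loads, threshold=0.8):
    """Serve locally, or redirect the client to a less loaded peer.

    Returns (status code, body or redirect location).
    """
    if my_load > threshold and peer_loads:
        target = min(peer_loads, key=peer_loads.get)
        if peer_loads[target] < my_load:
            # Tell the client to retry against the less loaded peer.
            return 302, f"http://{target}{path}"
    return 200, f"content of {path}"


light = handle_request("/a.html", 0.3, {"web2.example.org": 0.1})
heavy = handle_request("/a.html", 0.95, {"web2.example.org": 0.1, "web3.example.org": 0.6})
print(light, heavy)
```

The redirect costs the client an extra round trip, but the response then flows directly from the chosen server rather than through a central scheduler.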
4. Solution based on IP-layer load balancing scheduling
When a user accesses the service through a virtual IP address (VIP), the request packet arrives at the load scheduler. The scheduler performs load balancing scheduling, selects one of a group of real servers, rewrites the packet's destination address from the virtual IP address to the address of the selected server, rewrites the destination port to the corresponding port on that server, and finally sends the packet to the selected server. When the real server's response packet passes back through the load scheduler, the scheduler changes its source address and source port to the virtual IP address and the corresponding port, and then sends the packet to the user. Berkeley's MagicRouter [8], Cisco's LocalDirector, Alteon's ACEDirector and F5's Big/IP all use this network address translation method. MagicRouter applies the fast packet interposing technique on Linux 1.3. It lets the user-level load balancing process access the network device at close to kernel-space speed, which reduces the cost of context switching, though not completely. It was only a research prototype and never became a usable system. Cisco's LocalDirector, Alteon's ACEDirector and F5's Big/IP are very expensive commercial systems. They support only some TCP/UDP protocols, and some of them have problems handling ICMP.
IBM's TCP Router [9] uses a modified network address translation method to implement a scalable Web server on the SP/2 system. TCP Router modifies the destination address of the request packet and forwards it to the selected server. The server then sets the source address of the response packet to the TCP Router address instead of its own address. The advantage of this method is that the response packet can be returned directly to the client; the disadvantage is that the operating system kernel of every server has to be modified. IBM's NetDispatcher [10] is the successor to TCP Router. It forwards the packet to the server, and the server has the router's address configured on a non-ARP device. This approach is similar to VS/DR in LVS clusters and is highly scalable, but a set of IBM SP/2 and NetDispatcher costs millions of dollars. Overall, IBM's technology is pretty good.
In ONE-IP [11] from Bell Labs, each server has its own IP address and is also configured with the same VIP address through IP aliasing. Requests are distributed by routing or by broadcast. After receiving a request, a server processes it according to the VIP address and returns the result with the VIP as the source address. This method also avoids rewriting response packets, but configuring the same VIP address on every server through IP aliasing causes address conflicts on some operating systems and makes the network fail on others. Distributing requests by broadcast also requires modifying the server operating system's source code to filter packets, so that only one server processes each broadcast request.
Microsoft's Windows NT Load Balancing Service (WLBS) [12] came from Microsoft's acquisition of Valence Research at the end of 1998. It uses the same local filtering method as ONE-IP. WLBS runs as a filter between the network card driver and the TCP/IP protocol stack and picks up packets whose destination address is the VIP. Its filtering algorithm checks the source IP address and port number of each packet to ensure that only one server passes the packet to the upper layers for processing. However, whenever a node joins or fails, all servers have to negotiate a new filtering algorithm, which interrupts all connections that carry session state. WLBS also requires all servers to have the same configuration, such as network card speed and processing power.
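The two address rewrites can be sketched with a toy packet type. This is an illustration of the network address translation idea only, not the code of any product named here: `Packet`, `NatScheduler`, the addresses and the round-robin server selection are assumptions, since the text says only that the scheduler "performs load balancing scheduling".

```python
from dataclasses import dataclass, replace
from itertools import cycle


@dataclass(frozen=True)
class Packet:
    src: tuple   # (ip, port)
    dst: tuple   # (ip, port)


class NatScheduler:
    def __init__(self, vip, real_servers):
        self.vip = vip                           # (virtual ip, port)
        self._next_server = cycle(real_servers)  # round-robin is an assumed policy
        self.connections = {}                    # client (ip, port) -> real server

    def inbound(self, pkt):
        """Rewrite the destination of a request from the VIP to a real server."""
        if pkt.dst != self.vip:
            return pkt
        server = self.connections.get(pkt.src)
        if server is None:                       # new connection: schedule it
            server = next(self._next_server)
            self.connections[pkt.src] = server
        return replace(pkt, dst=server)

    def outbound(self, pkt):
        """Rewrite the source of a response from the real server to the VIP."""
        if self.connections.get(pkt.dst) == pkt.src:
            return replace(pkt, src=self.vip)
        return pkt


lb = NatScheduler(vip=("192.0.2.10", 80),
                  real_servers=[("10.0.0.1", 8080), ("10.0.0.2", 8080)])
request = Packet(src=("198.51.100.7", 40000), dst=("192.0.2.10", 80))
forwarded = lb.inbound(request)
reply = Packet(src=forwarded.dst, dst=forwarded.src)
returned = lb.outbound(reply)
print(forwarded, returned)
```

The connection table keeps later packets of the same connection on the same real server, and the client only ever sees the virtual address.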
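Local filtering can be sketched as a hash that every node evaluates independently on the same packet. The source does not specify the actual WLBS algorithm; the CRC32 hash, the modulo partitioning and the names `owner` and `accepts` are assumptions chosen to show the idea.

```python
import zlib


def owner(src_ip, src_port, node_count):
    # Deterministic hash of the client endpoint, identical on every node.
    key = f"{src_ip}:{src_port}".encode()
    return zlib.crc32(key) % node_count


def accepts(node_index, src_ip, src_port, node_count):
    # A node passes the packet up its stack only if it owns the hash bucket.
    return owner(src_ip, src_port, node_count) == node_index


clients = [(f"198.51.100.{i % 250 + 1}", 1024 + i) for i in range(1000)]

# With 3 nodes, exactly one node accepts each client's packets.
accept_counts = [sum(accepts(n, ip, port, 3) for n in range(3)) for ip, port in clients]

# When a 4th node joins, many clients map to a different node.
moved = sum(owner(ip, port, 3) != owner(ip, port, 4) for ip, port in clients)
print(set(accept_counts), moved)
```

Every packet is delivered to all nodes and discarded by all but one, and the `moved` count shows why a membership change breaks established sessions under a simple scheme like this.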