Forward Proxy and Reverse Proxy Interview Questions in Distributed Architecture

2025-02-02 Update From: SLTechnology News&Howtos


Shulou(Shulou.com)06/01 Report--

This article mainly explains forward proxy and reverse proxy interview questions in distributed architecture. The material is simple, fast, and practical; let's walk through it together.

Introduction

After a series of questions about RPC, the interviewer was satisfied that I understood distributed-architecture theory and inter-service communication frameworks (RPC).

Then he moved on to networking, but did not ask directly about basics such as the TCP three-way handshake or TCP vs. UDP, since those had been covered in the first round. This round focused only on network concepts relevant to distributed systems, starting with the most basic one: proxies.

1. Interviewer: your resume mentions that you have used Nginx as a proxy. How do you understand "forward proxy" and "reverse proxy"?

Problem analysis: even as a business developer, you often need to configure Nginx or another load balancer. For example, you build a file-upload service, and after going live uploads start failing; the error message shows the request was rejected by the Nginx proxy, because Nginx limits the request-body size. How do you reason about a problem like this?
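
The upload scenario above usually comes down to Nginx's `client_max_body_size` directive, which defaults to 1 MB; larger bodies are rejected with HTTP 413. A minimal illustrative snippet (the directives are real Nginx directives, but the path, port, and size limit here are made-up examples, not a recommended production config):

```nginx
server {
    listen 80;

    location /upload {
        # Default is 1m; bodies larger than this get "413 Request Entity Too Large".
        client_max_body_size 50m;

        # Hypothetical upstream application server behind the proxy.
        proxy_pass http://127.0.0.1:8080;
    }
}
```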

A:

Forward proxy: proxies the client's requests to the server, so the client is anonymous to the server. For example, an office intranet routes all traffic through forward-proxy software: every employee visits Baidu, but Baidu sees only one IP address and cannot tell which employee made each request.

Reverse proxy: proxies on behalf of the servers, so the servers are anonymous to the client. Take www.baidu.com again: every user types the same domain name or IP, but thousands of servers sit behind it, and you never know which one actually handles your request.

Interviewer: then why does the server side use a proxy? What are the advantages?

Problem analysis: the interviewer nodded, agreeing with my understanding of proxies, and then asked what a proxy is actually good for, to probe the principles behind proxying.

A: take Nginx as an example. With buffering enabled, Nginx reads the full Request before forwarding it and the full Response before returning it. It is like a hotel waiter: customers give their orders to the waiter rather than directly to the cooks. One waiter takes orders from 10 customers, and the 20 dishes ordered are then distributed among five cooks. The cooks are like servers: they can concentrate on cooking while Nginx handles the reception work. The number of cooks can also grow or shrink with the number of customers, with tasks spread evenly among them. That is load balancing, and Nginx distributes traffic in exactly this way.

In general:

Offload I/O from the application servers, breaking through the I/O bottleneck and improving server throughput.

Control traffic distribution and manage the server cluster, providing load balancing.

Security and anonymity: by intercepting requests before they reach the back-end servers, the reverse proxy hides their identities and helps absorb security attacks. It also lets multiple servers be reached through a single URL, regardless of the structure of the server network.

Interviewer: which load balancing algorithms do you know?

Problem analysis: once load balancing comes up, the interviewer digs further into the algorithms.

A:

Round-robin algorithm: think of a card dealer dealing cards from left to right, so that everyone ends up with the same number of cards. Round-robin hands user requests to machines 1 through 10 in order, with the aim of putting each machine under the same pressure: 1,000,000 queries are, in theory, 100,000 per machine.
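
The rotation described above fits in a few lines of Python (server names here are made up for illustration):

```python
from itertools import cycle

# Hypothetical pool of 10 back-end servers.
servers = [f"server-{i}" for i in range(1, 11)]

def round_robin(servers):
    """Yield servers in a fixed rotation, one per incoming request."""
    return cycle(servers)

rr = round_robin(servers)
# The first 12 requests wrap around after server-10.
first_twelve = [next(rr) for _ in range(12)]
```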

Weighted round-robin algorithm: how does it differ from plain round-robin? Imagine the 10 servers are not identical: 8 machines are 8-core/32 GB and the remaining two are 4-core/16 GB. Under plain round-robin each machine takes 100,000 requests, and the two weaker boxes cannot cope; from their point of view it is unfair, since the others have better hardware, and they would rather go down and take a rest. Weighted round-robin handles this by assigning each machine a weight. With 10 identical machines, each takes 10% of the traffic; here, the two low-spec machines are weighted down to carry only 5% each. That is fair: those who can do more, do more, and every machine runs near its best without sitting idle or being overloaded.
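
A minimal sketch of the weighted scheme, using the simplest approach of repeating each server in the rotation in proportion to its weight (server names and weights are illustrative; real balancers typically use a smoother interleaving):

```python
def weighted_round_robin(weights):
    """Build a rotation where each server appears once per unit of weight.

    `weights` maps server name -> integer weight, e.g. the 8-core boxes
    get weight 2 and the 4-core boxes weight 1.
    """
    schedule = []
    for server, weight in weights.items():
        schedule.extend([server] * weight)
    return schedule

weights = {"big-1": 2, "big-2": 2, "small-1": 1}
schedule = weighted_round_robin(weights)
# big-1 and big-2 each take 2/5 of the traffic, small-1 takes 1/5.
```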

Random algorithm: similar to round-robin, except that requests are distributed to machines at random. The more requests there are, the more evenly they end up spread across the machines.

Weighted random algorithm: analogous to weighted round-robin, so I won't repeat the details here.

Least-connections algorithm: before dispatching a request, check which of the 10 machines is currently most idle; whichever is handling the fewest connections gets the request. The resulting division of labor is also fairly even.
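
The "pick the most idle machine" rule above is just a minimum over current connection counts. A sketch (the counts here are hypothetical; a real balancer tracks them as connections open and close):

```python
def least_connections(active):
    """Pick the server currently handling the fewest open connections.

    `active` maps server name -> current connection count.
    """
    return min(active, key=active.get)

active = {"server-1": 12, "server-2": 3, "server-3": 7}
picked = least_connections(active)  # server-2 is the most idle
```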

Hash algorithm: the five algorithms above share a problem: multiple requests from the same user may land on a different machine each time. Why does that matter? If the servers cache the user's Session, then every server the user hits must store its own copy; in the worst case the user makes 10 requests and all 10 servers cache the same user's Session, an obvious waste of server resources. This is where the hash algorithm comes in: compute hash(client_ip) % N, where N is the number of servers (readers unfamiliar with hashing can look it up). As long as the user's IP does not change, the remainder does not change, which guarantees that every request from the same user lands on the same machine. The key can also be the user's unique ID instead of the IP.
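
The hash(client_ip) % N rule in Python (using a stable digest rather than Python's salted built-in hash(), so the mapping survives process restarts; the IP is a made-up example):

```python
import hashlib

def pick_server(client_ip, n_servers):
    """Map a client IP to a stable server index via hash(ip) % N."""
    digest = hashlib.md5(client_ip.encode()).hexdigest()
    return int(digest, 16) % n_servers

# The same IP always lands on the same server while N stays fixed.
a = pick_server("10.0.0.7", 10)
b = pick_server("10.0.0.7", 10)
```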

Consistent hashing: in the formula hash(client_ip) % N above, what happens if N changes? If a server loses power, every remainder changes, and which machine each user maps to must be recalculated across the board. For services that hold Session state, that is a disaster. Consistent hashing exists to solve exactly this problem: only the keys that mapped to the failed node are remapped, and everyone else stays put.
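
A minimal consistent-hash ring, without virtual nodes for clarity (server names are illustrative; production rings add many virtual nodes per server to even out the distribution):

```python
import bisect
import hashlib

def h(key):
    """Stable position on a 2^32-slot ring."""
    return int(hashlib.md5(key.encode()).hexdigest(), 16) % (2 ** 32)

class ConsistentHashRing:
    def __init__(self, servers):
        # Sorted (position, server) pairs form the ring.
        self.ring = sorted((h(s), s) for s in servers)

    def lookup(self, client_ip):
        # Walk clockwise to the first server at or past the key's position,
        # wrapping around to the start of the ring if necessary.
        positions = [pos for pos, _ in self.ring]
        i = bisect.bisect(positions, h(client_ip)) % len(self.ring)
        return self.ring[i][1]

ring = ConsistentHashRing(["s1", "s2", "s3", "s4"])
before = ring.lookup("10.0.0.7")
# Removing one server remaps only the keys that pointed at it.
ring_after = ConsistentHashRing([s for s in ["s1", "s2", "s3", "s4"] if s != "s2"])
```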

Deep analysis

Load balancing is a near-mandatory technology when major Internet companies build systems. Beginners learning distributed systems often ask me: what is load balancing, why is it used, and does adding an extra layer slow down service invocation? Keep these questions in mind as you study this chapter.

Let's first take a look at the website architecture without load balancing.

Have you noticed that a single server can be reached directly over the network? So why is load balancing indispensable at every major Internet company? Apart from personal sites, such as a blog running on one server, no Internet company dares to operate that way. What if that one machine goes down? Should users just wait? A company that runs like that is not far from going out of business.

How, then, do we solve the single-machine failure problem? Look at the picture below.

As you can see, in the case of multiple servers, a layer of load balancing is added.

What is load balancing?

Load balancing (Load balancing) is a computer technique used to distribute load across multiple computing resources (computer clusters, network links, CPUs, disk drives, or other resources) in order to optimize resource usage, maximize throughput, minimize response time, and avoid overload. Using multiple components with load balancing, instead of a single component, improves reliability through redundancy. Load-balancing services are usually provided by dedicated software or hardware. Their main role is to distribute large workloads sensibly across multiple operating units, addressing the high-concurrency and high-availability problems of Internet architecture.

-Wikipedia

Simply put, a load balancer distributes traffic: it spreads large numbers of user requests across different servers to share the pressure, and if a machine goes down, the load balancer automatically removes the faulty machine from rotation.

Commonly used load balancing frameworks

Nginx: NGINX | High Performance Load Balancer, Web Server, & Reverse Proxy. Software originally from Russia that serves not only as a load balancer and reverse proxy but also as an excellent web server. It is widely used and therefore a frequent interview topic; if your English is good, the official documentation is worth reading.

LVS: The Linux Virtual Server Project - Linux Server Cluster for Load Balancing. Short for Linux Virtual Server, an open-source server-cluster system for Linux, started in May 1998.

HAProxy: http://www.haproxy.org/, a highly available HTTP/TCP load balancer.

F5: F5 | Multi-Cloud Security and Application Delivery, a hardware load balancer.

All of the above are common load balancers. Which one an enterprise uses is not absolute; it depends mainly on the system's requirements and the engineers' familiarity with each option.

For a framework to survive, it must have its own strengths. There is no single best choice, only the best fit.

For example, F5 is famous for excellent performance and an expensive price tag, from hundreds of thousands to millions, with strong after-sales and technical support. Early in my career I worked on a State Grid project, and F5 even lent us a test machine. Later, after resigning, I interviewed at another company, and the interviewer asked what load balancer my previous team used. I said F5, and the interviewer remarked that the "national team" really was rich. At the time I knew no other solutions and wondered whether naming F5 sounded strange. After the interview I went home and looked into it: there are many options. Alibaba uses LVS as well as Nginx, and Meituan initially used Nginx + LVS before building its own MGW.

Some people will ask: does a load balancer really need so many algorithms? Are they all used in enterprise development? In practice, generally only one is used. If the back-end servers hold no stateful Session and the machines have identical configurations, round-robin or random is enough; engineers choose the most suitable algorithm for the actual situation.

Forward proxy & reverse proxy

Forward proxy

A forward proxy in everyday life: suppose you want to travel to Russia and need a visa from the embassy. The paperwork is troublesome and you have no idea where to start, so you turn to a travel agency; a dedicated agent handles everything, and you only need to provide your documents and wait at home for the visa. You are the client, the Russian embassy is the server, and the agent is the proxy.

A forward proxy acts on behalf of the client; the server does not know who the client is.

Reverse proxy

Before the Internet was widespread, we dialed 10086 for customer service. All 31 provinces across the country have their own call centers, each with hundreds of customer-service staff. We don't care who picks up; we just dial 10086 and are automatically assigned someone. That is a reverse proxy.

A reverse proxy acts on behalf of the server; the client does not know which server it is talking to.

The reverse proxy server can act as a "traffic cop", as shown above, sitting in front of the back-end servers (Baidu's, in this example). It distributes client requests across a set of servers in a way that maximizes speed and capacity utilization while ensuring no single server is overloaded, since overload degrades performance. If a server fails, the load balancer redirects traffic to a healthy one.

At this point, I believe you have a deeper understanding of forward proxy and reverse proxy interview questions in distributed architecture. You might as well try these ideas out in practice, and keep learning!
