Analysis of services and load balancing developed by web

2025-07-13 Update From: SLTechnology News&Howtos shulou

Shulou(Shulou.com)06/02 Report--

This article analyzes services and load balancing in web development, with a focus on service discovery: why it is needed, how it works, and the patterns commonly used to implement it.

Cause of the problem

In the stand-alone era, most traditional software used a monolithic architecture. Everyone commits code to the same code repository, which leads to many problems: the application becomes bloated, hard to understand and modify, limited in extensibility, unable to scale on demand, and so on. How does a monolithic architecture handle collaboration among many developers? Through modularization: split the application by function and define programming interfaces (APIs) between modules, so that each module cares about what the others do but not about how they are implemented.

Over time, stand-alone programs ran into the twin bottlenecks of computing power and storage, and distributed architectures emerged in response. A monolithic application can complete a local function call simply by using the function name (its identifier). In a distributed system, a service (RPC or RESTful API) plays a similar role, but the service name alone is not enough to make a request. The service name only identifies the service capability (the service type); the caller also needs to know where the service is located on the network. Service instances deployed in the cloud are assigned IP addresses dynamically, and scaling, failures, and updates make the problem more complex. Static configuration of service instances cannot keep up with these changes, so finer-grained service governance is required. To solve or at least simplify this problem, service discovery is abstracted and provided as a basic capability, which tries to make requesting a network service as simple and transparent as calling a local function.

A service is, in essence, a function. The term "network service" only arises once services are closely tied to the network: the service provider publishes the service over the network, and the service consumer requests it over the network. Distributed systems break through the computing and storage limits of a single machine, improve system stability, and make massive, highly concurrent, highly available services possible. They also increase software complexity and introduce new problems and challenges, such as software layering, load balancing, microservices, service discovery and governance, and distributed consistency.

Service discovery

Services involve two roles: the service provider (Service Provider) and the service consumer (Service Consumer). To provide service capacity at massive scale, a single service instance is obviously not enough. And with thousands of services, there has to be a place that records the mapping from each service name to its list of service instances. This calls for a new role, the service intermediary, which maintains a service registry (Service Registry). The registry can be thought of as a service dictionary: the key is the service name and the value is the list of service instances. The service registry is the bridge between service providers and service consumers. It holds up-to-date information about service providers, such as their network locations, and is the core of service discovery.

When a service instance starts, it registers (put) its service information in the service registry; when it terminates, it deletes (remove) its own service information from the registry.

When requesting a service, the service consumer first queries (get) the service registry by name for the list of service providers, then selects a service instance from the list and sends the request to that instance.

The greatest truths are the simplest: this is the simplest service discovery model, and it is also the basic principle behind service discovery. So far everything seems fine, but several issues remain unaddressed.
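The put, remove, and get operations described above can be sketched as a small in-memory dictionary. This is only an illustration under stated assumptions: the class name `ServiceRegistry`, the service name `orders`, and the addresses are hypothetical, and a real registry would sit behind a network API.

```python
from collections import defaultdict


class ServiceRegistry:
    """In-memory service dictionary: service name -> list of instance addresses."""

    def __init__(self):
        self._services = defaultdict(list)

    def put(self, name, address):
        # Called by a provider on startup; duplicate registrations are ignored.
        if address not in self._services[name]:
            self._services[name].append(address)

    def remove(self, name, address):
        # Called by a provider on normal shutdown.
        if address in self._services[name]:
            self._services[name].remove(address)

    def get(self, name):
        # Called by a consumer; returns a copy so callers cannot mutate the registry.
        return list(self._services.get(name, []))


registry = ServiceRegistry()
registry.put("orders", "10.0.0.1:8080")
registry.put("orders", "10.0.0.2:8080")
registry.remove("orders", "10.0.0.1:8080")
print(registry.get("orders"))  # ['10.0.0.2:8080']
```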

Problems and solutions

The first problem: if a service does not stop normally but is killed by the system, it has no chance to notify the service registry to delete its information. The registry is then left with a stale entry pointing to a dead service instance, and the service consumer has no way of knowing. The solution is simple: keepalive. The service provider sends keepalive messages to the service intermediary at regular intervals (for example, every 10 seconds). On receiving a keepalive message, the intermediary updates the keepalive timestamp of that service instance. The intermediary also checks the timestamps periodically and removes any instance whose timestamp has expired.
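A minimal sketch of the keepalive bookkeeping on the intermediary side might look like the following. The class name `KeepaliveRegistry`, the 30-second expiry (three missed 10-second keepalives), and the injectable clock are assumptions made for illustration, not something the article prescribes.

```python
import time


class KeepaliveRegistry:
    """Registry that drops instances whose last keepalive is older than ttl seconds."""

    def __init__(self, ttl=30.0, clock=time.monotonic):
        self._ttl = ttl
        self._clock = clock   # injectable so expiry can be exercised without sleeping
        self._last_seen = {}  # (service name, address) -> last keepalive timestamp

    def keepalive(self, name, address):
        # Doubles as registration: the first keepalive creates the entry.
        self._last_seen[(name, address)] = self._clock()

    def sweep(self):
        # Run periodically by the intermediary to remove expired instances.
        now = self._clock()
        expired = [key for key, seen in self._last_seen.items() if now - seen > self._ttl]
        for key in expired:
            del self._last_seen[key]
        return expired

    def get(self, name):
        return sorted(addr for (svc, addr) in self._last_seen if svc == name)


# Usage with a fake clock standing in for the passage of time.
now = [0.0]
registry = KeepaliveRegistry(ttl=30.0, clock=lambda: now[0])
registry.keepalive("orders", "10.0.0.1:8080")
registry.keepalive("orders", "10.0.0.2:8080")
now[0] = 25.0
registry.keepalive("orders", "10.0.0.2:8080")  # only the second instance renews
now[0] = 40.0
registry.sweep()
print(registry.get("orders"))  # ['10.0.0.2:8080']
```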

The second problem: how are service consumers notified when the list of service instances changes? There are two basic methods, polling and pub-sub. With polling, the consumer actively asks the service intermediary whether the service list has changed, and if it has, the intermediary sends the new list to the consumer. With too many consumers, the intermediary comes under pressure from handling polling messages, and it can even become a bottleneck when there are many service types and the service lists are large. With pub-sub, the service intermediary actively notifies the service consumers. This is more timely than polling, but the drawback is that it ties up a dedicated thread or connection.
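The difference between the two notification methods can be sketched in one small class. The version counter used for polling and the callback list used for pub-sub are illustrative assumptions, and in-process callbacks stand in for the long-lived connections a real pub-sub channel would hold.

```python
class NotifyingRegistry:
    """Registry supporting both version-based polling and pub-sub callbacks."""

    def __init__(self):
        self._instances = {}    # service name -> list of addresses
        self._version = {}      # service name -> change counter
        self._subscribers = {}  # service name -> list of callbacks

    def update(self, name, addresses):
        self._instances[name] = list(addresses)
        self._version[name] = self._version.get(name, 0) + 1
        # Pub-sub: push the new list to every subscriber immediately.
        for callback in self._subscribers.get(name, []):
            callback(name, list(addresses))

    def poll(self, name, known_version):
        # Polling: return (version, list) only if something changed, else None.
        current = self._version.get(name, 0)
        if current == known_version:
            return None
        return current, list(self._instances.get(name, []))

    def subscribe(self, name, callback):
        self._subscribers.setdefault(name, []).append(callback)


registry = NotifyingRegistry()
pushed = []
registry.subscribe("orders", lambda name, addrs: pushed.append(addrs))

registry.update("orders", ["10.0.0.1:8080"])
print(pushed)                      # [['10.0.0.1:8080']]
print(registry.poll("orders", 0))  # (1, ['10.0.0.1:8080'])
print(registry.poll("orders", 1))  # None
```

With polling the consumer pays for a request even when nothing has changed (the `None` case); with pub-sub the intermediary pays for keeping every subscriber reachable.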

The third problem: what if the service intermediary itself fails? To eliminate this single point of failure, a cluster is usually used. There are many open-source solutions for service registries, such as etcd, ZooKeeper, and Consul. In essence, they store the registry information in a distributed, consistent data store, which addresses read and write performance and also improves system stability and availability.

The fourth problem: if a service consumer had to query the service intermediary for the instance list before every remote request, wouldn't that be too inefficient, and wouldn't it put heavy pressure on the intermediary? Typically, the client caches the list of service instances, so that repeated requests for the same service name do not repeat the query. This reduces both latency and the load on the service intermediary.
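As a rough sketch, client-side caching can be as simple as remembering the last answer for a short time. The class name `CachingClient`, the 5-second lifetime, and the `fake_lookup` stand-in for the registry query are all assumptions for illustration.

```python
import time


class CachingClient:
    """Caches instance lists per service name for ttl seconds."""

    def __init__(self, lookup, ttl=5.0, clock=time.monotonic):
        self._lookup = lookup  # function: service name -> list of addresses
        self._ttl = ttl
        self._clock = clock
        self._cache = {}       # service name -> (fetched_at, addresses)

    def instances(self, name):
        entry = self._cache.get(name)
        now = self._clock()
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]  # fresh cache hit: no registry round trip
        addresses = list(self._lookup(name))
        self._cache[name] = (now, addresses)
        return addresses


# Usage: count how often the (stand-in) registry is actually queried.
calls = []


def fake_lookup(name):
    calls.append(name)
    return ["10.0.0.1:8080", "10.0.0.2:8080"]


now = [0.0]
client = CachingClient(fake_lookup, ttl=5.0, clock=lambda: now[0])
for _ in range(100):
    client.instances("orders")
print(len(calls))  # 1
now[0] = 6.0
client.instances("orders")
print(len(calls))  # 2
```

The cache lifetime trades freshness for load: a longer lifetime means fewer registry queries but a longer window in which the client may use an outdated list.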

The fifth problem: the keepalive mechanism has a gap. If a service instance becomes unavailable between two keepalive checks, the service consumer is not aware of it and may still send a request to a remote machine that can no longer provide the service, so the request fails. This situation cannot be eliminated entirely, and the system has to tolerate such errors. It can still be mitigated, for example by temporarily blocking an instance after a request to it fails, so that requests are not sent repeatedly to the same dead instance.
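One way to sketch the blocking idea is a per-instance cooldown kept on the client. The class name `FailureBlocker`, the 10-second cooldown, and the fallback to the full list when every instance is blocked are design assumptions of this example rather than something the article specifies.

```python
import time


class FailureBlocker:
    """Skips instances that failed recently, for cooldown seconds."""

    def __init__(self, cooldown=10.0, clock=time.monotonic):
        self._cooldown = cooldown
        self._clock = clock
        self._blocked_until = {}  # address -> time at which it may be retried

    def report_failure(self, address):
        self._blocked_until[address] = self._clock() + self._cooldown

    def usable(self, addresses):
        now = self._clock()
        healthy = [
            a for a in addresses
            if a not in self._blocked_until or self._blocked_until[a] <= now
        ]
        # If everything is blocked, fall back to the full list rather than fail outright.
        return healthy or list(addresses)


now = [0.0]
blocker = FailureBlocker(cooldown=10.0, clock=lambda: now[0])
instances = ["10.0.0.1:8080", "10.0.0.2:8080"]
blocker.report_failure("10.0.0.1:8080")
print(blocker.usable(instances))  # ['10.0.0.2:8080']
now[0] = 11.0
print(blocker.usable(instances))  # ['10.0.0.1:8080', '10.0.0.2:8080']
```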

The sixth problem: how does a service consumer choose one instance among several? And how can multiple requests from the same consumer be routed to a fixed service instance (which is sometimes necessary)? This is a load balancing problem, and there are a variety of strategies, such as round robin (rr), priority, weighted random, and consistent hashing.
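Three of the strategies named above can be sketched in a few lines each. The function and class names, the use of SHA-256 as the ring hash, and the choice of 100 virtual points per instance are assumptions for illustration; production balancers make their own choices here.

```python
import bisect
import hashlib
import itertools
import random


def round_robin(instances):
    """Return a function that cycles through the instances in order."""
    cycle = itertools.cycle(instances)
    return lambda: next(cycle)


def weighted_random(weights, rng=random):
    """Pick an instance with probability proportional to its weight."""
    instances = list(weights)
    return rng.choices(instances, weights=[weights[i] for i in instances], k=1)[0]


class ConsistentHashRing:
    """Maps a request key to a fixed instance on a hash ring."""

    def __init__(self, instances, replicas=100):
        # Each instance is placed on the ring at `replicas` virtual points.
        self._ring = sorted(
            (self._hash(f"{inst}#{i}"), inst)
            for inst in instances
            for i in range(replicas)
        )
        self._points = [point for point, _ in self._ring]

    @staticmethod
    def _hash(text):
        return int(hashlib.sha256(text.encode("utf-8")).hexdigest(), 16)

    def pick(self, key):
        # First virtual point clockwise from the key's hash, wrapping around.
        index = bisect.bisect(self._points, self._hash(key)) % len(self._points)
        return self._ring[index][1]


instances = ["10.0.0.1:8080", "10.0.0.2:8080", "10.0.0.3:8080"]
next_instance = round_robin(instances)
print([next_instance() for _ in range(4)])
# ['10.0.0.1:8080', '10.0.0.2:8080', '10.0.0.3:8080', '10.0.0.1:8080']

ring = ConsistentHashRing(instances)
print(ring.pick("user-42") == ring.pick("user-42"))  # True
```

Consistent hashing is the strategy that answers the "fixed instance" requirement: the same key always lands on the same instance, and removing some other instance from the ring does not move that key.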

Service discovery patterns

There are two main patterns of service discovery: the client-side discovery pattern and the server-side discovery pattern.

Client-side discovery pattern

The client is responsible for querying the list of service instances and deciding which instance to send the request to; that is, the load balancing strategy is implemented on the client. The pattern has two parts: registration and discovery.

Registration: the service instance calls the registration interface of the service intermediary to register itself and renews its registration through keepalive; the service intermediary removes unavailable instances through health checks.

Discovery: when a service consumer requests a service, it queries the list of service instances from the service registry, which is a database of services. To improve performance and reliability, the client usually caches the service list (the cache also allows the client to keep working after the registry has gone down). After getting the instance list, the client selects an instance according to its load balancing policy and sends the request to it.
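The discovery half of this pattern, including the point that the cache lets the client keep working when the registry is down, might be sketched as follows. `DiscoveryClient`, `query_registry`, and the use of `ConnectionError` to signal an unreachable registry are hypothetical names and conventions chosen for the example.

```python
import random


class DiscoveryClient:
    """Client-side discovery: query the registry, keep a local copy, pick an instance."""

    def __init__(self, query_registry, choose=random.choice):
        self._query_registry = query_registry  # function: name -> list of addresses
        self._choose = choose                  # load balancing policy
        self._cached = {}                      # last known good list per service

    def resolve(self, name):
        try:
            instances = self._query_registry(name)
            self._cached[name] = list(instances)
        except ConnectionError:
            # Registry is down: keep working from the last list we saw.
            instances = self._cached.get(name, [])
        if not instances:
            raise LookupError(f"no known instances for {name!r}")
        return self._choose(instances)


registry_up = [True]


def query_registry(name):
    if not registry_up[0]:
        raise ConnectionError("registry unreachable")
    return ["10.0.0.1:8080"]


client = DiscoveryClient(query_registry)
print(client.resolve("orders"))  # 10.0.0.1:8080
registry_up[0] = False
print(client.resolve("orders"))  # 10.0.0.1:8080 (served from the local copy)
```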

Advantages

Direct: the client can flexibly implement its own load balancing policy.

Decentralized: there is no gateway in the path, which avoids a single-point bottleneck and the reliability loss that comes with it.

The service discovery SDK is integrated directly into the client, which gives good language integration, good runtime performance, and convenient troubleshooting.

Disadvantages

The client is coupled to the service registry, and service discovery logic has to be developed for each language and framework used by service clients.

This intrusive integration means that any change to service discovery requires client applications to be recompiled and redeployed; such tight binding violates the principle of independence.

Instances going online and offline affect callers and can cause temporary service unavailability.

Server-side discovery pattern

Discovery: the service consumer sends the service request through the load balancer, which queries the service registry, selects a service instance, and forwards the request to the service instance.

Registration: service registration and deregistration can work the same way as in the client-side discovery pattern, or they can be handled by the service registration and discovery mechanism built into the deployment platform. That is, a containerized deployment platform (Docker/Kubernetes) can actively discover service instances and register and deregister them on their behalf.

Compared with the client-side discovery pattern, a client using the server-side discovery pattern does not keep a local list of service instances and does not do load balancing itself. The load balancer acts both as the service discovery component and as a gateway, which is why it is often called an API gateway server.

Because the load balancer is centralized, it must itself be a cluster, since a single instance cannot support highly concurrent access. Service discovery and load balancing for the load balancer itself usually rely on DNS.

HTTP servers such as Nginx and Nginx Plus can act as the load balancer in the server-side discovery pattern.
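To make the division of labor concrete, here is a minimal sketch of the balancer's role in this pattern. It is not how Nginx is configured or implemented: the `LoadBalancer` class, the status codes, and the in-process callables standing in for network forwarding are assumptions made purely for illustration.

```python
import itertools


class LoadBalancer:
    """Server-side discovery: the consumer only knows the balancer's address."""

    def __init__(self, query_registry, backends):
        self._query_registry = query_registry  # function: name -> list of addresses
        self._backends = backends              # address -> callable handling a request
        self._counter = itertools.count()

    def handle(self, service, request):
        instances = self._query_registry(service)
        if not instances:
            return 503, "no instance available"
        # Round robin across whatever the registry currently reports.
        target = instances[next(self._counter) % len(instances)]
        return 200, self._backends[target](request)


backends = {
    "10.0.0.1:8080": lambda req: f"a handled {req}",
    "10.0.0.2:8080": lambda req: f"b handled {req}",
}
balancer = LoadBalancer(lambda name: sorted(backends), backends)
print(balancer.handle("orders", "GET /1"))  # (200, 'a handled GET /1')
print(balancer.handle("orders", "GET /2"))  # (200, 'b handled GET /2')
```

The consumer never sees an instance address; it only calls `handle` on the balancer, which is what keeps it decoupled from the registry.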

Advantages

Service discovery is transparent to service consumers: the consumer is decoupled from the registry, and updates to the service discovery function go unnoticed by clients.

Service consumers only need to send requests to the load balancer; there is no need to develop a service discovery SDK for each consumer's programming language and framework.

Disadvantages

Because all requests are forwarded by the load balancer, the load balancer may become a new performance bottleneck.

The load balancer (service gateway) is centralized, and a centralized architecture carries a latent stability risk.

Because the load balancer forwards every request, the response time (RT) is higher than when the client connects directly.

Microservices and service discovery

A service mesh (Service Mesh) is a configurable infrastructure layer for microservice applications, designed to handle the large volume of network-based inter-process communication between services.

A service mesh decouples invocation from communication. Without a mesh, the application itself has to be aware of the protocol and the service discovery method. With a mesh, the application simply makes the call, and the mesh manages the application's data flow through the control plane.

Service discovery in a mesh is effectively an upgraded version of the client-side discovery pattern, implemented with sidecars and a pilot. The sidecars form the data plane (Data Plane) and are responsible for discovering the address list of target service instances and forwarding requests. The pilot forms the control plane (Control Plane) and is responsible for managing all service registration information in the service registry.
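The split between the two planes can be illustrated with a toy model. This does not reflect the API of any real mesh implementation: `ControlPlane`, `Sidecar`, the push-on-change behavior, and the `send` callable are hypothetical constructs that only mirror the responsibilities described above.

```python
class ControlPlane:
    """Holds registration data and pushes endpoint lists to sidecars."""

    def __init__(self):
        self._endpoints = {}  # service name -> list of addresses
        self._sidecars = []

    def attach(self, sidecar):
        self._sidecars.append(sidecar)
        sidecar.endpoints = {k: list(v) for k, v in self._endpoints.items()}

    def set_endpoints(self, service, addresses):
        self._endpoints[service] = list(addresses)
        for sidecar in self._sidecars:
            sidecar.endpoints[service] = list(addresses)


class Sidecar:
    """Data-plane proxy: the application hands it a service name and a request."""

    def __init__(self, send):
        self.endpoints = {}  # filled in by the control plane
        self._send = send    # function: (address, request) -> response
        self._next = {}      # per-service round-robin position

    def call(self, service, request):
        addresses = self.endpoints.get(service)
        if not addresses:
            raise LookupError(f"no endpoints for {service!r}")
        position = self._next.get(service, 0)
        self._next[service] = position + 1
        return self._send(addresses[position % len(addresses)], request)


pilot = ControlPlane()
sidecar = Sidecar(send=lambda address, request: f"{address} <- {request}")
pilot.attach(sidecar)
pilot.set_endpoints("orders", ["10.0.0.1:8080", "10.0.0.2:8080"])
print(sidecar.call("orders", "GET /1"))  # 10.0.0.1:8080 <- GET /1
print(sidecar.call("orders", "GET /2"))  # 10.0.0.2:8080 <- GET /2
```

The application code only ever calls `sidecar.call` with a service name, which is the sense in which the mesh takes protocol and discovery concerns out of the application.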

Service registration patterns

One option is for service instances to register themselves, which is the self-registration pattern. The other is for another system component to manage the registration of service instances, which is the third-party registration pattern.

The self-registration pattern, described earlier, is simple and requires no third-party components. Its disadvantage is that registration code must be implemented for each programming language and framework used by the services.

In the third-party registration pattern, service instances do not register or deregister themselves. That is the responsibility of another system component, the service registrar (Service Registrar), which polls the deployment environment or subscribes to events to detect changes in service instances, and registers and deregisters them automatically.

The main advantage of the third-party registration pattern is that it decouples services from the service registry. There is no need to implement registration logic for every language and framework; instance registration is handled centrally by a dedicated service. The disadvantage is that, unless it is built into the deployment environment, the registrar is one more highly available system component that has to be set up and managed.
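The polling variant of a service registrar boils down to a reconciliation pass: compare what is running with what is registered and fix the difference. The `reconcile` function and its callback parameters below are hypothetical, and plain sets stand in for the deployment environment and the registry.

```python
def reconcile(running, registered, register, deregister):
    """One registrar pass: make the registry match what is actually running.

    running and registered are sets of (service name, address) pairs.
    """
    for name, address in sorted(running - registered):
        register(name, address)    # instance appeared without registering itself
    for name, address in sorted(registered - running):
        deregister(name, address)  # instance is gone; clean up on its behalf


registry = {("orders", "10.0.0.1:8080"), ("orders", "10.0.0.2:8080")}
running = {("orders", "10.0.0.2:8080"), ("orders", "10.0.0.3:8080")}

reconcile(
    running,
    set(registry),
    register=lambda name, address: registry.add((name, address)),
    deregister=lambda name, address: registry.discard((name, address)),
)
print(sorted(registry))
# [('orders', '10.0.0.2:8080'), ('orders', '10.0.0.3:8080')]
```

Because the service instances never call the registry themselves, nothing language-specific has to be written for them, which is the decoupling described above.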
