What are the differences between Envoy Proxy and Netflix Hystrix

2025-02-22 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)06/02 Report--

What is the difference between Envoy Proxy and Netflix Hystrix? This article compares how each of them approaches circuit breaking and summarizes where they differ.

When we build a service architecture (service-oriented architecture, microservices, and the like), we make a lot of calls over the network. The network is notoriously unreliable, so we want our services to be resilient enough to keep serving during a failure. An important piece of this puzzle is a smart, application-aware load balancer.

Circuit breakers (Note 3) are an important component in building large, reliable distributed systems, especially cloud-based microservice applications. With a circuit breaker, we can short-circuit calls when a system keeps failing. Circuit breaking itself is one part of an intelligent, application-aware load balancer. Most people already use load balancing in one form or another. Let's look at the differences between Netflix Hystrix (Note 4) and Envoy Proxy (Note 5).

Circuit breaking

The value of breaking the circuit is that it limits the blast radius of a failure. We want to control, reduce, or cut off communication with the failing system so as to lower its load and help it recover. For example, suppose a search service calls a recommendation service to provide personalized search results, and the recommendation service starts returning errors across many different calls. We should probably stop calling it for a while: the more we retry, the more pressure we put on the system and the worse the situation may become. So we decide to stop calling the service. This is similar to the circuit breaker in a house: when a fault occurs, the faulty part is disconnected to protect the rest of the system. The circuit breaker pattern forces our applications to face the fact that network calls can and do fail, which helps prevent cascading failures across the system. The key to the technique is sensing the health of system components and deciding whether traffic should be sent to a component.

Netflix OSS Hystrix

Netflix OSS released a circuit breaker component called Hystrix in 2012. Hystrix is a client-side Java library that provides circuit-breaking capabilities. It offers the following features (Note 7: https://medium.com/netflix-techblog/making-the-netflix-api-more-resilient-a8ec62159c2d).

From Making the Netflix API more resilient (Note 7):

Custom fallback: in some cases the service's client library provides a fallback method to call; in other cases we can use locally available data to generate a response.

Fail silent: the fallback method simply returns a null value, which is useful when the data returned by the service is optional.

Fail fast: when the data from the target service is required and neither of the two approaches above can help, the only option is to return a 5xx response. This hurts the user experience, but it protects the health of the server and lets the system recover more quickly once the failed service comes back.
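The three fallback styles can be sketched in a few lines of plain Python. This is only an illustration of the idea, not the Hystrix API: the function names, the ServiceUnavailable exception, and the failing broken_recommendations dependency are all hypothetical.

```python
from typing import Callable, Optional


class ServiceUnavailable(Exception):
    """Raised to the caller when no fallback is possible (maps to a 5xx)."""


def custom_fallback(call: Callable[[], list], local_data: list) -> list:
    """Custom fallback: answer from locally available data when the call fails."""
    try:
        return call()
    except Exception:
        return local_data


def fail_silent(call: Callable[[], list]) -> Optional[list]:
    """Fail silent: return None when the data is optional for the response."""
    try:
        return call()
    except Exception:
        return None


def fail_fast(call: Callable[[], list]) -> list:
    """Fail fast: the data is required, so surface the failure immediately."""
    try:
        return call()
    except Exception as exc:
        raise ServiceUnavailable("recommendations unavailable") from exc


def broken_recommendations() -> list:
    # Hypothetical dependency that is currently failing.
    raise TimeoutError("recommendation service timed out")
```

Which style fits depends on whether the caller can still build a useful response without the data.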

Circuit breakers can be triggered in several ways (Note 7):

A request to the remote service times out.

The thread pool or task queue for a service is at full capacity.

The client library used to invoke the service throws an exception.

[Figure: Hystrix circuit breaker flow]
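To make the flow concrete, here is a minimal circuit breaker sketch in Python. It is a simplification written for this article, not Hystrix's implementation: it trips on a count of consecutive failures (an assumption made for brevity), and the class name, parameters, and injectable clock are all hypothetical.

```python
import time
from typing import Callable, Optional


class CircuitOpenError(Exception):
    """Raised when the breaker is open and the call is short-circuited."""


class CircuitBreaker:
    def __init__(self, failure_threshold: int = 3, sleep_window_s: float = 5.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.failure_threshold = failure_threshold
        self.sleep_window_s = sleep_window_s
        self.clock = clock
        self.failures = 0
        self.opened_at: Optional[float] = None  # None means the circuit is closed

    def call(self, fn: Callable[[], object]) -> object:
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.sleep_window_s:
                raise CircuitOpenError("circuit open, call skipped")
            # Sleep window elapsed: half-open, let one trial call through.
        try:
            result = fn()
        except Exception:
            # Timeouts, pool rejections and client exceptions all count alike.
            self.failures += 1
            if self.opened_at is not None or self.failures >= self.failure_threshold:
                self.opened_at = self.clock()  # open (or re-open) the circuit
            raise
        self.failures = 0
        self.opened_at = None  # a success closes the circuit
        return result
```

While the circuit is open, callers get CircuitOpenError immediately instead of waiting on a dependency that is likely to fail, which is the point at which a fallback would normally take over.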

Netflix Hystrix allows very fine-grained control over network interactions (Note 9). It lets us configure each dependency according to the type of call being made. For example, if most of our calls to the recommendation engine are read-only, the circuit breaker configuration for those calls can be more relaxed.

Another important point is that the purpose of Hystrix is to break the circuit promptly, so it treats timeouts the same as any other failure. The cause of a timeout may also lie on the client side. When Hystrix evaluates its circuit-breaking threshold, it does not distinguish whether the server or the client is responsible for the fault.

In addition, as noted earlier, circuit breaking is really just one feature of intelligent, application-aware load balancing. In this case, "application-aware" means that the library runs inside your application. In the Netflix ecosystem, Hystrix can be paired with a client-side load balancer such as Netflix OSS Ribbon (Note 10).

The development of Service Mesh

Service architectures keep evolving toward polyglot implementations, and we find it difficult or impractical to force services to use a specific library, framework, or language. With the arrival of the service mesh, circuit breaking gains an infrastructure-level solution that is independent of language and framework. A service mesh can be defined as:

A decentralized application-level network infrastructure between services that provides security, resiliency, monitoring, and routing control capabilities.

A service mesh can use different L7 (application-level) proxies as its data plane, providing retries, timeouts, circuit breaking, and other capabilities. In this article, we look at how Envoy Proxy (Note 10) handles circuit breaking. Envoy Proxy is the default, out-of-the-box proxy for the Istio service mesh (Note 11), so the features described here also apply to Istio.

Envoy Proxy / Istio Service Mesh

Envoy Proxy treats circuit breaking as a subset of load balancing and health checking. Envoy separates its routing concerns (choosing which cluster to communicate with) from its communication concerns (talking to the actual backends). This design sets Envoy apart from other, more coarse-grained load balancers. Envoy can have many different "routes" that try to send traffic to the appropriate backend. These backends are called clusters, and each cluster can have its own load balancing configuration. Each cluster can also have its own passive health check (outlier detection) configuration. Envoy has several settings related to circuit breaking, which we go through one by one.

An outbound cluster is defined here:

"clusters": [
  {
    "name": "httpbin_service",
    "connect_timeout_ms": 5000,
    "type": "static",
    "lb_type": "round_robin",
    "hosts": [
      {"url": "tcp://172.17.0.2:8080"},
      {"url": "tcp://172.17.0.3:8080"}
    ]
  }
]

Here we see a cluster called httpbin_service that uses the round_robin strategy to balance load across two hosts. Next, we add the Envoy circuit breaker configuration (Note 11).

Circuit breaker

"circuit_breakers": {
  "default": {
    "max_connections": 1,
    "max_pending_requests": 1,
    "max_retries": 3
  }
}

Here we are targeting HTTP 1.x workloads. We limit outbound connections to 1 and the maximum number of pending (queued) requests to 1. We also define the maximum number of retries. In a sense, limiting the connection pool and the number of requests is similar to the bulkhead pattern (Note 12) in Netflix Hystrix. If the client application opens more connections than the limit allows (this is a soft limit with some leeway), Envoy short-circuits the excess calls (Note 13) and records them in its statistics.
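The effect of max_connections and max_pending_requests can be modelled with a small admission check. The sketch below is a simplified model written for illustration, not Envoy's implementation; the class and method names are hypothetical, and it enforces the limits strictly whereas Envoy's limits are soft.

```python
class Overflow(Exception):
    """Raised when a request exceeds both the connection and pending limits."""


class ConnectionPoolLimiter:
    def __init__(self, max_connections: int = 1, max_pending_requests: int = 1) -> None:
        self.max_connections = max_connections
        self.max_pending_requests = max_pending_requests
        self.active = 0      # requests currently holding a connection
        self.pending = 0     # requests queued while waiting for a connection
        self.overflowed = 0  # counter, analogous to an entry in the proxy stats

    def admit(self) -> str:
        """Admit a request as active or pending, or short-circuit it."""
        if self.active < self.max_connections:
            self.active += 1
            return "active"
        if self.pending < self.max_pending_requests:
            self.pending += 1
            return "pending"
        self.overflowed += 1
        raise Overflow("limits exceeded, request short-circuited")

    def release(self) -> None:
        """Finish one active request and promote a pending one, if any."""
        if self.active == 0:
            return
        self.active -= 1
        if self.pending > 0:
            self.pending -= 1
            self.active += 1
```

With both limits set to 1, as in the configuration above, a third concurrent request is rejected immediately rather than piling up behind a slow backend.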

Outlier detection

As we saw above, what Envoy calls circuit breaking is really more like connection pool control. To do the equivalent of actually breaking the circuit, Envoy uses something called outlier detection (Note 14). Envoy keeps statistics on how the different endpoints in the load balancing pool behave. If it detects misbehavior, it ejects the endpoint from the load balancing pool. Here is this part of the configuration:

"outlier_detection": {
  "consecutive_5xx": 1,
  "max_ejection_percent": 100,
  "interval_ms": 1000,
  "base_ejection_time_ms": 60000
}

This configuration means that after a single 5xx error from a host, we mark that host as unhealthy and remove it from the load balancing pool. We also set max_ejection_percent to 100, meaning we are willing to eject every endpoint if it comes to that. This setting depends heavily on the environment, so tune it carefully for your own situation. In general we want to keep routing to as many hosts as possible so that we do not risk partial or cascading failures. By default, Envoy sets max_ejection_percent to 10.

We also set the base ejection time to 60000 milliseconds. The actual time a host stays ejected from the load balancing pool is this base value multiplied by the number of times the host has been ejected, so that less stable hosts are isolated for longer.
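The ejection rules can be modelled roughly as follows. This is a simplified sketch for illustration, not Envoy's code: the HostState and OutlierDetector classes are hypothetical, the periodic sweep controlled by interval_ms is left out, and the max_ejection_percent check is an assumption about how the cap could be applied.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class HostState:
    consecutive_5xx: int = 0
    times_ejected: int = 0
    ejected_until_ms: int = 0


@dataclass
class OutlierDetector:
    consecutive_5xx: int = 1
    max_ejection_percent: int = 100
    base_ejection_time_ms: int = 60000
    hosts: Dict[str, HostState] = field(default_factory=dict)

    def is_ejected(self, host: str, now_ms: int) -> bool:
        return now_ms < self.hosts[host].ejected_until_ms

    def record(self, host: str, status: int, now_ms: int) -> None:
        """Record one response status for a host and eject it if warranted."""
        state = self.hosts[host]
        if self.is_ejected(host, now_ms):
            return
        if status < 500:
            state.consecutive_5xx = 0
            return
        state.consecutive_5xx += 1
        if state.consecutive_5xx < self.consecutive_5xx:
            return
        ejected = sum(1 for h in self.hosts if self.is_ejected(h, now_ms))
        # Respect max_ejection_percent: never eject more than this share of hosts.
        if (ejected + 1) * 100 > self.max_ejection_percent * len(self.hosts):
            return
        state.times_ejected += 1
        state.consecutive_5xx = 0
        # Ejection time = base time * number of times this host has been ejected.
        state.ejected_until_ms = now_ms + self.base_ejection_time_ms * state.times_ejected
```

With the values from the configuration above, a host is ejected for 60 seconds after its first 5xx and for 120 seconds after its second ejection.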

Cluster panic

There is one more thing to know about Envoy outlier detection and load balancing. If too many hosts are ejected by this process, the cluster enters panic mode (Note 15), in which the proxy disregards the health status of the load balancing pool and sends traffic to all hosts again. It is a useful mechanism. In distributed systems, it is important to understand that what holds "in theory" may not reflect reality, and that it is sometimes better to lower the requirements a little to keep a failure from spreading. This ratio can be controlled: by default the cluster panics once more than 50% of hosts have been ejected, and the threshold can be raised or disabled altogether. When it is set to 0, the circuit-breaking behavior is similar to that of Netflix Hystrix.

Fine-tuning the circuit breaker policy

One advantage of the library approach is application-aware, fine-tuned circuit breaker policies. The Hystrix documentation demonstrates different commands for query, read, and write calls against the same cluster, and the Hystrix FAQ says:
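The panic decision itself is a small piece of logic. The function below is a sketch under the assumptions stated in the text (a percentage threshold, 50 by default, with 0 disabling panic mode); its name and signature are hypothetical and it is not Envoy's implementation.

```python
from typing import Dict, List


def hosts_for_load_balancing(health: Dict[str, bool],
                             panic_threshold_percent: int = 50) -> List[str]:
    """Return the hosts the balancer may pick from, given per-host health flags."""
    all_hosts = sorted(health)
    if not all_hosts:
        return []
    healthy = [h for h in all_hosts if health[h]]
    healthy_percent = 100 * len(healthy) / len(all_hosts)
    if healthy_percent < panic_threshold_percent:
        # Panic mode: ignore health status and spread traffic over every host.
        return all_hosts
    return healthy
```

With the threshold at 0 the function never panics, so a cluster whose hosts are all ejected receives no traffic at all, which is the behavior closest to an open Hystrix circuit.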

In general, a single network route served by a load-balanced cluster uses many different HystrixCommands to serve many different types of functionality.

Each HystrixCommand needs to set its own throughput limits, timeouts, and fallback policies.

In Envoy, we can get the same fine-grained circuit breaker policy through route matching (Note 16), which, combined with cluster policies (Note 17), lets us specify which operations are sent to which clusters.
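The idea of combining route matching with per-cluster policy can be illustrated as follows. This is not Envoy configuration syntax: the Route type, the cluster names, and the limit values are all hypothetical, chosen only to show reads and writes to the same service landing on clusters with different limits.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Route:
    prefix: str
    method: Optional[str]  # None matches any HTTP method
    cluster: str


def pick_cluster(routes: List[Route], method: str, path: str) -> Optional[str]:
    """Return the cluster of the first route matching the request, or None."""
    for route in routes:
        if path.startswith(route.prefix) and route.method in (None, method):
            return route.cluster
    return None


# Reads get generous limits, everything else gets strict ones (made-up values).
ROUTES = [
    Route("/recommendations", "GET", "recommendations_read"),
    Route("/recommendations", None, "recommendations_write"),
]
CLUSTER_LIMITS = {
    "recommendations_read": {"max_connections": 100, "max_pending_requests": 50},
    "recommendations_write": {"max_connections": 10, "max_pending_requests": 1},
}
```

Because the first matching route wins, the more specific GET route has to be listed before the catch-all route for the same prefix.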

Istio circuit breaker

We can use Istio's higher-level configuration to provide circuit breaking. In Istio, we use a destination policy (Note 17) to configure load balancing and circuit breaker settings. The following is an example of a destination policy with a circuit breaker configured:

metadata:
  name: reviews-cb-policy
  namespace: default
spec:
  destination:
    name: reviews
    labels:
      version: v1
  circuitBreaker:
    simpleCb:
      maxConnections: 100
      httpMaxRequests: 1000
      httpMaxRequestsPerConnection: 10
      httpConsecutiveErrors: 7
      sleepWindow: 15m
      httpDetectionInterval: 5m

What happens after the circuit breaks?

What happens when the circuit-breaking threshold is reached? In Hystrix, the fallback strategy is part of the library and can be composed. Hystrix lets us take follow-up actions such as returning cached values, returning default values, or even invoking other services. We also get clear information about the fault and can make application-specific decisions.

In a service mesh, the cause of the failure is more opaque because there is no library built into the application. This does not mean that our application cannot have fallback behavior (for both transport errors and client errors). Any application, whether or not it uses a specific library or framework, should try to fulfill its promise to its callers. If the intended action cannot be completed, there should be a way to degrade gracefully. Fortunately, this does not require a framework. Most languages have built-in error and exception handling, and fallback strategies can be implemented in the exception handlers.
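As a sketch of a framework-free fallback, the Python below uses only the standard library's urllib and ordinary exception handling. The get_recommendations function, the cached value, and the choice to treat every 5xx as a reason to fall back are assumptions made for this example; the proxy itself is invisible to the code.

```python
import urllib.error
import urllib.request
from typing import Callable


def http_get(url: str, timeout_s: float = 1.0) -> bytes:
    """Plain HTTP GET; a sidecar proxy, if present, is transparent to this call."""
    with urllib.request.urlopen(url, timeout=timeout_s) as response:
        return response.read()


def get_recommendations(url: str, cached: bytes,
                        fetch: Callable[[str], bytes] = http_get) -> bytes:
    """Return fresh recommendations, or the cached copy when the call fails."""
    try:
        return fetch(url)
    except urllib.error.HTTPError as err:
        if err.code >= 500:
            # The upstream (or the proxy in front of it) rejected the call.
            return cached
        raise  # a 4xx is the caller's own mistake, so do not mask it
    except OSError:
        # Transport-level failure: refused connection, timeout, DNS error.
        return cached
```

The fetch parameter exists only so the function can be exercised without a network; in an application the default http_get would be used.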

Recap

Circuit breaking is one of the functions of load balancing.

Hystrix provides only circuit breaking; load balancing has to be handled by Ribbon (or another client-side load balancing library).

Hystrix provides fallback capabilities as part of its client library, which is very powerful.

Envoy's load balancing implementation includes both outlier detection and circuit breaking.

Envoy's circuit breaker is more like the Hystrix bulkhead, and its outlier detection is more like the Hystrix circuit breaker.

Envoy has many battle-tested features that are enabled by default, such as cluster panic.

A service mesh has no built-in fallback capability, so fallbacks have to be implemented by the application.
