How to solve the server avalanche scenario

Updated 2025-03-26. From: SLTechnology News&Howtos

This article explains how to deal with server avalanche scenarios, a difficulty many people run into in real operations. It covers what an avalanche is, which situations cause one, and the main ways to prevent or contain it.

What is an application service avalanche?

Avalanche problem

In a distributed system the network is inherently unstable, so no service can be 100% available. When the network is unstable, a service provider may be dragged down, its callers block while waiting for it, and the failure eventually cascades along the call chain as an avalanche.

Cache avalanche

When the cache server restarts, or a large number of cache entries expire within a short period, the load falls on the backend system (such as the DB). The database can fail under that pressure, which in turn causes the application servers to avalanche.

Several scenarios that cause the avalanche effect

Traffic surge: for example abnormal traffic, or user retries that increase system load;

Cache refresh: suppose A is the client and B is the server. If every request from system A flows through to system B and the volume exceeds B's carrying capacity, system B crashes;

Program bugs: faulty loop-call logic in the code, memory leaks caused by resources that are never released, and similar issues;

Hardware failure: such as machine downtime, a power failure in the machine room, or a cut optical fiber;

Severe database bottlenecks: such as long transactions and SQL timeouts;

Thread synchronization waiting: systems often call each other synchronously, and core and non-core services share one thread pool and message queue. Suppose a core business thread calls a non-core thread whose work is completed by a third-party system. When the third-party system has a problem, the core thread blocks and stays in a waiting state. Inter-process calls have a timeout limit, so the thread is eventually cut off, which may also cause an avalanche.

Cache avalanche solution

There are several types of cache invalidation:

1. The cache server goes down

2. Part of the cache expires during a peak period

3. A hotspot cache entry expires

Solution:

1. Avoid cache entries expiring all at once: give different keys different expiration times.

2. Add a mutex lock so that database requests are controlled while the cache is rebuilt.

3. Improve the high availability (HA) of the cache, for example with a Redis cluster.
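
The first two solutions can be sketched together. The following is a minimal illustration, not a production cache: `Cache` is a hypothetical in-process stand-in for a cache server, `load_from_db` is a caller-supplied function, and the values of `BASE_TTL` and `JITTER` are arbitrary assumptions. Each key gets a slightly different expiration time, and a per-key mutex lets only one thread go to the database to rebuild an expired entry.

```python
import random
import threading
import time

BASE_TTL = 300  # seconds; illustrative value, not from the article
JITTER = 60     # seconds of random spread; illustrative value


def jittered_ttl(base=BASE_TTL, jitter=JITTER):
    """Return a TTL in [base, base + jitter] so keys do not all expire together."""
    return base + random.uniform(0, jitter)


class Cache:
    """Hypothetical in-process cache with a per-key rebuild lock."""

    def __init__(self):
        self._data = {}    # key -> (value, expires_at)
        self._locks = {}   # key -> threading.Lock
        self._guard = threading.Lock()

    def _lock_for(self, key):
        # One lock object per key, created on first use.
        with self._guard:
            return self._locks.setdefault(key, threading.Lock())

    def _fresh(self, key):
        entry = self._data.get(key)
        if entry is not None and entry[1] > time.monotonic():
            return entry
        return None

    def get(self, key, load_from_db):
        entry = self._fresh(key)
        if entry is not None:
            return entry[0]
        # Cache miss: only one thread per key may query the database.
        with self._lock_for(key):
            entry = self._fresh(key)  # another thread may have rebuilt it
            if entry is not None:
                return entry[0]
            value = load_from_db(key)
            self._data[key] = (value, time.monotonic() + jittered_ttl())
            return value
```

With a shared cache server the same idea needs a lock that all application servers can see, which this in-process sketch does not provide.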

Overall solutions for avalanche

Generally, there are three main approaches to protecting a service from its dependencies:

(1) Fuse (circuit breaker) mode

This mode is modeled on an electrical fuse: if the voltage on a line is too high, the fuse blows to prevent a fire. In our system, if calls to a target service are slow or a large number of them time out, calls to that service are fused. Subsequent requests do not invoke the target service and return immediately, so resources are released quickly. When the target service recovers, calls resume.

Focus on monitoring machine performance indicators:

CPU usage and load

Memory

MySQL long transactions (closely tied to SQL query timeouts; this needs focused monitoring)

SQL timeouts

Number of threads, etc.

In short, besides CPU, memory and thread count, focus on long transactions and SQL timeouts on the database side. Most avalanches on application servers start from a performance bottleneck in the database: requests pile up at the database, which drags down the application server, and the failure then spreads into a large-scale avalanche.

(2) Isolation mode

This pattern is like dividing system requests into isolated islands by type. When a fire burns out one island, the other islands are not affected.

For example, thread pools can be used to isolate resources for different types of requests, so that the types do not affect each other. If the threads for one type of request are exhausted, subsequent requests of that type are returned immediately without invoking further resources. This pattern applies in many scenarios, such as splitting up a service, deploying important services on separate servers, or the company's recent push for multiple centers.
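
A minimal sketch of thread pool isolation, assuming one pool per request type. The names `Bulkhead` and `BulkheadFull` are hypothetical. Each request type gets its own small pool, and when that pool is busy, further requests of the same type are rejected immediately instead of queueing.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class BulkheadFull(Exception):
    """Raised when a request type has no free threads left."""


class Bulkhead:
    """A dedicated, bounded thread pool for one type of request."""

    def __init__(self, name, max_concurrent):
        self.name = name
        self._pool = ThreadPoolExecutor(
            max_workers=max_concurrent, thread_name_prefix=name
        )
        # Tracks free threads so that excess requests fail fast
        # instead of waiting in the executor's internal queue.
        self._slots = threading.BoundedSemaphore(max_concurrent)

    def submit(self, fn, *args, **kwargs):
        if not self._slots.acquire(blocking=False):
            raise BulkheadFull(f"{self.name}: no free threads, request rejected")
        try:
            future = self._pool.submit(fn, *args, **kwargs)
        except BaseException:
            self._slots.release()
            raise
        # Free the slot when the work finishes, whether it succeeded or failed.
        future.add_done_callback(lambda _f: self._slots.release())
        return future
```

For instance, a stuck `Bulkhead("reports", 1)` rejects further report requests while `Bulkhead("orders", 2)` keeps serving orders, because the two pools share no threads.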

(3) Current limiting (rate limiting) mode

The fuse and isolation modes above are fault-tolerance mechanisms that act after an error has occurred, whereas current limiting is a preventive mode. It sets a maximum QPS threshold for each type of request in advance; a request above the threshold is returned directly without invoking subsequent resources. This mode does not solve the service dependency problem, only the overall allocation of system resources, because requests that are not limited may still cause an avalanche.
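
One simple way to enforce a per-type QPS threshold is a fixed one-second window counter, sketched below. The class name `QpsLimiter` and the request type names in the usage note are hypothetical, and a fixed window is only one of several possible counting schemes.

```python
import threading
import time


class QpsLimiter:
    """Fixed one-second window counter with a separate limit per request type."""

    def __init__(self, limits, clock=time.monotonic):
        self._limits = dict(limits)  # request type -> max requests per second
        self._clock = clock          # injectable so the limiter can be tested
        self._windows = {}           # request type -> (second, count)
        self._lock = threading.Lock()

    def allow(self, request_type):
        """Return True if the request may proceed, False if it must be returned."""
        limit = self._limits[request_type]
        current_second = int(self._clock())
        with self._lock:
            second, count = self._windows.get(request_type, (current_second, 0))
            if second != current_second:
                # A new one-second window has started: reset the counter.
                second, count = current_second, 0
            if count >= limit:
                return False
            self._windows[request_type] = (second, count + 1)
            return True
```

A caller would check `limiter.allow("search")` before doing any work and return an error or a fallback response when it is False.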

Fuse design

The fuse design mainly follows the Hystrix approach. Its three most important modules are the fuse request judgment algorithm, the fuse recovery mechanism, and the fuse alarm.

(1) Fuse request judgment algorithm: counts are kept in a lock-free circular queue. By default each fuse maintains 10 buckets, one per second, and each bucket records how many requests succeeded, failed, timed out or were rejected. By default the fuse trips and intercepts requests when, within 10 seconds, there are more than 20 requests and the error rate exceeds 50%.
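
The bucket counting described in (1) can be sketched as a ring of per-second buckets. This is an illustration only, not Hystrix code: the class `RollingWindow` is hypothetical, it uses a plain Python list rather than a lock-free queue, and it assumes that every outcome other than success counts as an error.

```python
import time

OUTCOMES = ("success", "failure", "timeout", "rejected")


class RollingWindow:
    """Ring of time buckets; old buckets are overwritten as time advances."""

    def __init__(self, buckets=10, bucket_seconds=1.0, clock=time.monotonic):
        self._n = buckets
        self._width = bucket_seconds
        self._clock = clock
        self._ring = [self._empty(-1) for _ in range(buckets)]

    @staticmethod
    def _empty(index):
        bucket = {name: 0 for name in OUTCOMES}
        bucket["index"] = index  # which time slice this bucket belongs to
        return bucket

    def _current_index(self):
        index = int(self._clock() // self._width)
        slot = index % self._n
        if self._ring[slot]["index"] != index:
            self._ring[slot] = self._empty(index)  # reuse an expired slot
        return index

    def record(self, outcome):
        if outcome not in OUTCOMES:
            raise ValueError(f"unknown outcome: {outcome}")
        index = self._current_index()
        self._ring[index % self._n][outcome] += 1

    def totals(self):
        oldest = self._current_index() - self._n + 1
        live = [b for b in self._ring if b["index"] >= oldest]
        return {name: sum(b[name] for b in live) for name in OUTCOMES}

    def should_trip(self, min_requests=20, error_ratio=0.5):
        counts = self.totals()
        total = sum(counts.values())
        errors = total - counts["success"]  # assumption: non-success = error
        return total > 0 and total > min_requests and errors / total > error_ratio
```

The request-count condition keeps a handful of failures on a quiet service from tripping the fuse: with 10 successes and 10 failures the window holds only 20 requests, so `should_trip()` stays False until one more request arrives.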

(2) Fuse recovery: for a fused request type, some requests are allowed through every 5 s; if those requests are healthy (judged by response time, RT), calls to the target service resume.
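
The recovery step can be sketched as a three-state fuse: closed, open, and half-open. This is a simplified, single-threaded illustration with several assumptions: the class `CircuitBreaker` is hypothetical, it trips after a number of consecutive failures instead of using the bucket statistics, it lets one probe through per 5 s window, and the `healthy_rt` threshold is an invented parameter because the article does not state the RT limit.

```python
import time


class CircuitOpenError(Exception):
    """Raised when the fuse is open and the call is rejected without running."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, sleep_window=5.0,
                 healthy_rt=1.0, clock=time.monotonic):
        self._failure_threshold = failure_threshold  # assumed trip condition
        self._sleep_window = sleep_window            # seconds before a probe
        self._healthy_rt = healthy_rt                # assumed RT limit, seconds
        self._clock = clock
        self.state = "closed"
        self._failures = 0
        self._opened_at = 0.0

    def call(self, fn, *args, **kwargs):
        if self.state == "open":
            if self._clock() - self._opened_at < self._sleep_window:
                raise CircuitOpenError("fused: failing fast")
            self.state = "half_open"  # sleep window elapsed: allow a probe
        started = self._clock()
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self._on_failure()
            raise
        elapsed = self._clock() - started
        if self.state == "half_open" and elapsed > self._healthy_rt:
            self._trip()              # probe succeeded, but too slowly
        else:
            self.state = "closed"
            self._failures = 0
        return result

    def _on_failure(self):
        self._failures += 1
        if self.state == "half_open" or self._failures >= self._failure_threshold:
            self._trip()

    def _trip(self):
        self.state = "open"
        self._opened_at = self._clock()
        self._failures = 0
```

A failed or slow probe restarts the 5 s wait, so an unhealthy target service sees at most one trial request per window rather than the full traffic.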
