How to withstand the high concurrent traffic of Double 11 is unclear to many beginners. To help solve this problem, this article explains it in detail; readers who need it are welcome to follow along, and hopefully you will gain something.
Service level agreement
The "N nines" we often talk about describe the SLA. SLA stands for Service Level Agreement; it states the level and quality of service a provider, such as a public cloud, commits to.
For example, Aliyun promises that the availability of its cluster service within a service cycle is no less than 99.99%; if availability falls below this standard, the cloud provider must compensate customers for their losses.
Is four nines good enough?
For Internet companies, the SLA is the availability guarantee for a website or API service.
More nines mean the service is available for more of the year and is therefore more reliable. Four nines of availability sounds high, but for real business scenarios it may not be enough.
Let's do a simple calculation. Assume a core link depends on 20 services, every strong dependency has no degradation configured, and each of the 20 services reaches four nines, i.e. 99.99% availability.
Then the availability of the core link is only 0.9999^20 ≈ 99.8%. Out of 1 billion requests, roughly 2,000,000 will fail, and even in this ideal case there will still be about 17 hours of unavailability per year.
And this is an ideal estimate: in a real production environment, service releases, machine downtime, and similar events make things worse. For sensitive businesses such as finance, or for flows that demand high stability such as orders and payments, this is not acceptable.
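For concreteness, here is a minimal sketch of that arithmetic; the class and variable names are ours, purely for illustration:

import java.lang.Math;

public class AvailabilityMath {
    public static void main(String[] args) {
        double perService = 0.9999;   // four nines per dependency
        int dependencies = 20;        // services on the core link
        double chain = Math.pow(perService, dependencies);
        // chain availability: ~0.9980
        System.out.printf("chain availability: %.4f%n", chain);
        // failed requests out of 1 billion: ~2,000,000
        System.out.printf("failed per 1e9 requests: %.0f%n", (1 - chain) * 1e9);
        // annual downtime in hours: ~17.5
        System.out.printf("downtime hours/year: %.1f%n", (1 - chain) * 365 * 24);
    }
}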
Avalanche effect of microservices
Besides the pursuit of availability, an unavoidable problem of microservice architecture is the service avalanche. On a call link, the services form a loosely coupled whole in which trouble in one part ripples through the rest, and an avalanche is a multi-level propagation: first a service provider becomes unavailable; then, because of massive timeouts, its callers become unavailable; and the failure propagates along the whole link until the system is paralyzed.
How to rate-limit and degrade?
As analyzed above, in a large-scale microservice architecture we want to avoid service avalanches, reduce downtime, and improve service availability as much as possible. There are many ways to improve availability, such as caching, pooling, asynchronization, load balancing, queuing, and degradation with circuit breaking.
Caching and queuing increase the capacity of the system. Rate limiting and degradation concern how the system responds when it reaches its bottleneck, and focus on stability.
Caching and asynchronization improve the system's offensive strength, while rate limiting and degradation focus on defense. The concrete techniques can be summed up in four words: rate limiting, degradation, circuit breaking, and isolation.
Rate limiting and degradation
As the name implies, rate limiting sets a maximum QPS threshold for each type of request in advance. If traffic exceeds the threshold, requests are returned directly and downstream resources are no longer called. Rate limiting needs to be combined with load testing to learn the system's high-water mark; it is the most widely used stability mechanism in practice.
Degradation means that when server pressure rises sharply, some services and pages are strategically degraded according to the current business situation and traffic, releasing server resources so that core tasks keep running normally. By configuration mode, degradation divides into active degradation and automatic degradation. Active degradation is configured in advance; automatic degradation fires when the system detects failures such as timeouts or frequent errors. Automatic degradation can use the following strategies:
Timeout degradation
Failure-count degradation
Fault degradation
In system design, degradation is usually combined with a configuration center: the config center pushes degradation switches to each service instance, which is the typical degradation-notification design.
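As a rough sketch of that push-based pattern: a degradation switch flipped by a config push. The onConfigPush hook and the service names here are hypothetical; a real system would wire this to the listener API of Apollo, Nacos, ZooKeeper, or a similar config center:

import java.util.concurrent.atomic.AtomicBoolean;

public class RecommendService {
    // the degradation switch, flipped by config-center pushes
    private static final AtomicBoolean DEGRADED = new AtomicBoolean(false);

    // called by the (hypothetical) config-center listener when a push arrives
    public static void onConfigPush(String value) {
        DEGRADED.set("true".equals(value));
    }

    public String recommend(long userId) {
        if (DEGRADED.get()) {
            // degraded path: serve a cheap static list, freeing resources for core flows
            return "default-hot-items";
        }
        // normal path: full personalized recommendation
        return expensivePersonalizedRecommend(userId);
    }

    private String expensivePersonalizedRecommend(long userId) {
        return "personalized-items-for-" + userId;
    }
}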
Circuit breaking and isolation
If calls to a target service become slow or time out in large numbers, calls to that service are broken: subsequent requests no longer reach the target service and return immediately, releasing resources quickly. A circuit breaker usually needs a recovery strategy so that calls resume once the target service improves. Service isolation is slightly different from the previous three: a system usually provides more than one service, yet at runtime these services may be deployed on the same instance or physical machine.
If service resources are not isolated, a problem in one service drags down the stability of the whole system. The purpose of service isolation is to prevent services from affecting each other.
Generally speaking, isolation focuses on two questions: where to isolate, and which resources to isolate.
Where to isolate: a service call involves a provider and a caller, and the resources in question are the servers of both parties, so isolation can be applied on either side.
What to isolate: broadly, service isolation covers not only server resources but also database sharding, caching, indexes, and so on. Here we focus only on service-level isolation, sketched below.
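A minimal sketch of caller-side isolation, the bulkhead pattern: each downstream service gets its own bounded thread pool, so a slow dependency can only exhaust its own pool. Pool sizes and service names are our assumptions:

import java.util.concurrent.*;

public class BulkheadDemo {
    // one bounded pool per downstream service
    private static final ExecutorService ORDER_POOL =
            new ThreadPoolExecutor(10, 10, 0L, TimeUnit.MILLISECONDS,
                    new ArrayBlockingQueue<>(100), new ThreadPoolExecutor.AbortPolicy());
    private static final ExecutorService COMMENT_POOL =
            new ThreadPoolExecutor(5, 5, 0L, TimeUnit.MILLISECONDS,
                    new ArrayBlockingQueue<>(50), new ThreadPoolExecutor.AbortPolicy());

    public static Future<String> callOrderService(Callable<String> call) {
        // rejected fast (RejectedExecutionException) if the order pool is saturated
        return ORDER_POOL.submit(call);
    }

    public static Future<String> callCommentService(Callable<String> call) {
        // comment-service slowness cannot drain the order pool's threads
        return COMMENT_POOL.submit(call);
    }
}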
The difference between degradation and circuit breaking
Service degradation and circuit breaking are similar in concept. Let me explain my own understanding through two scenarios.
Circuit breaking generally means stopping service entirely. A typical example is a stock market circuit breaker: if the market gets out of control, trading is simply halted and no service is provided, as a way to protect the market.
Degradation usually means having a fallback plan. Traveling from Beijing to Jinan, if rain delays my flight, I can take the high-speed rail; if high-speed rail tickets are sold out, I can still take a bus or drive.
The difference between the two: degradation is generally active and predictable, while circuit breaking is usually passive. When service A is degraded, some service B usually takes its place, whereas circuit breaking usually protects the core link.
In actual development, the step that follows a circuit breaker tripping is usually a degradation.
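A minimal sketch of that pairing: a breaker that opens after consecutive failures, serves a degraded fallback while open, and probes the target again after a cool-down. The class name and thresholds are ours, for illustration only:

import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

public class SimpleCircuitBreaker {
    private final int failureThreshold = 5;       // consecutive failures before opening
    private final long coolDownMillis = 10_000;   // how long to stay open
    private final AtomicInteger consecutiveFailures = new AtomicInteger(0);
    private volatile long openedAt = 0;

    public String call(Supplier<String> remote, Supplier<String> fallback) {
        if (isOpen() && System.currentTimeMillis() - openedAt < coolDownMillis) {
            return fallback.get();                 // breaker open: degrade immediately
        }
        try {
            String result = remote.get();          // closed, or a half-open probe
            consecutiveFailures.set(0);            // a success closes the breaker
            return result;
        } catch (RuntimeException e) {
            if (consecutiveFailures.incrementAndGet() >= failureThreshold) {
                openedAt = System.currentTimeMillis(); // trip (or re-trip) the breaker
            }
            return fallback.get();                 // degrade on failure as well
        }
    }

    private boolean isOpen() {
        return consecutiveFailures.get() >= failureThreshold;
    }
}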
Design of commonly used rate-limiting algorithms
We just covered the concept of rate limiting. So how do we judge that the system has reached the configured threshold? This requires a rate-limiting strategy; different algorithms have different characteristics and different degrees of smoothness.
Counter method
The counter method is the simplest rate-limiting algorithm to implement. Suppose an interface is limited to at most 100 requests per minute: maintain a counter, and increment it each time a new request arrives.
If the counter is below the limit and the request falls within one minute of the window start, the request is allowed through; otherwise it is rejected. Once the time window is exceeded, the counter is reset to zero.
import java.util.concurrent.atomic.AtomicInteger;

public class CounterLimiter {
    // start of the current window
    private static long startTime = System.currentTimeMillis();
    // window length in milliseconds
    private static final long interval = 10000;
    // max requests allowed per window
    private static int limit = 100;
    // request count within the current window
    private static AtomicInteger requestCount = new AtomicInteger(0);

    // try to acquire a permit
    public boolean tryAcquire() {
        long now = System.currentTimeMillis();
        // inside the current window
        if (now < startTime + interval) {
            // check whether the window limit is exceeded
            if (requestCount.get() < limit) {
                requestCount.incrementAndGet();
                return true;
            }
            return false;
        } else {
            // window expired: reset the window and the counter (counting this request)
            startTime = now;
            requestCount = new AtomicInteger(1);
            return true;
        }
    }
}

Counter-based limiting is easy to apply in a distributed environment: keep the count in a single shared store such as Redis, with an automatic expiration time, and you can then measure and limit the traffic of the whole cluster. The drawback of the counter method is that it cannot handle the window-boundary problem; in other words, the limiting is not smooth. Suppose 100 requests arrive just before the counter resets and another 100 just after: in the very short time around the reset, 200 requests are processed, an instantaneous spike that may exceed what the system can bear. Counter-based limiting thus allows bursts of up to 2 * permitsPerSecond; a sliding-window algorithm can smooth this out, which we will not expand on here.
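A rough sketch of that Redis-backed cluster counter, using the Jedis client; the key naming, window size, and single-connection setup are our assumptions (a real implementation would use a connection pool):

import redis.clients.jedis.Jedis;

public class RedisCounterLimiter {
    private final Jedis jedis = new Jedis("localhost", 6379); // assumed Redis address
    private final int limit = 100;        // max requests per window, cluster-wide
    private final int windowSeconds = 60; // window size

    public boolean tryAcquire(String apiName) {
        // one counter key per time window
        String key = "limit:" + apiName + ":" + (System.currentTimeMillis() / 1000 / windowSeconds);
        long count = jedis.incr(key);     // atomic increment shared by all nodes
        if (count == 1) {
            jedis.expire(key, windowSeconds); // first hit sets the automatic expiry
        }
        return count <= limit;
    }
}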
Leaky bucket algorithm
Suppose we have a bucket of fixed capacity that leaks water from the bottom (ignore air pressure and so on; this is not a physics problem), and the leak rate is controllable. We can then use the bucket to control the request rate, i.e., the rate of leaking. We do not care how much water flows in, that is, how many external requests arrive; once the bucket is full, the excess simply overflows.
Swapping the water for real requests, you can see that the leaky bucket limits the request rate at the entrance. With the leaky bucket, the interface is guaranteed to process requests at a constant rate, so the leaky bucket does not suffer from the boundary problem.
Here is a simple implementation; Guava's SmoothWarmingUp class can also be used for finer control of this style of limiting:

public class LeakyLimiter {
    // bucket capacity
    private int capacity;
    // leak rate, per millisecond
    private int ratePerMillSecond;
    // current amount of water
    private double water;
    // time of the last leak
    private long lastLeakTime;

    public LeakyLimiter(int capacity, int ratePerMillSecond) {
        this.capacity = capacity;
        this.ratePerMillSecond = ratePerMillSecond;
        this.water = 0;
        this.lastLeakTime = System.currentTimeMillis();
    }

    // try to acquire a permit
    public boolean tryAcquire() {
        // leak first, updating the remaining water
        refresh();
        // try to add water; reject if the bucket is full
        if (water + 1 > capacity) {
            return false;
        }
        water = water + 1;
        return true;
    }

    private void refresh() {
        // current time
        long currentTime = System.currentTimeMillis();
        if (currentTime > lastLeakTime) {
            // time elapsed since the last leak
            long millisSinceLastLeak = currentTime - lastLeakTime;
            long leaks = millisSinceLastLeak * ratePerMillSecond;
            if (leaks > 0) {
                // if everything has leaked out, reset to empty
                if (water <= leaks) {
                    water = 0;
                } else {
                    water = water - leaks;
                }
                lastLeakTime = currentTime;
            }
        }
    }
}

Token bucket algorithm
The token bucket works in the opposite direction: tokens are added to a bucket of fixed capacity at a fixed rate, and each request must take a token before it is processed. If the bucket is empty, the request is rejected; if the bucket is full, newly added tokens overflow. Because tokens can accumulate while traffic is low, the token bucket allows a certain amount of burst.

public class TokenBucketLimiter {
    // bucket capacity
    private long capacity;
    // tokens added per second
    private long refillCountPerSecond;
    // tokens currently available
    private long availableTokens;
    // time of the last refill
    private long lastRefillTimeStamp;

    public TokenBucketLimiter(long capacity, long refillCountPerSecond) {
        this.capacity = capacity;
        this.refillCountPerSecond = refillCountPerSecond;
        this.availableTokens = capacity;
        this.lastRefillTimeStamp = System.currentTimeMillis();
    }

    // try to acquire a permit
    public boolean tryAcquire() {
        // top up tokens according to the elapsed time
        refill();
        if (availableTokens > 0) {
            --availableTokens;
            return true;
        } else {
            return false;
        }
    }

    private void refill() {
        long now = System.currentTimeMillis();
        if (now > lastRefillTimeStamp) {
            long elapsedTime = now - lastRefillTimeStamp;
            long tokensToBeAdded = (elapsedTime / 1000) * refillCountPerSecond;
            if (tokensToBeAdded > 0) {
                availableTokens = Math.min(capacity, availableTokens + tokensToBeAdded);
                lastRefillTimeStamp = now;
            }
        }
    }
}
The main difference between the two algorithms is that the leaky bucket forcibly smooths the processing rate, while the token bucket not only limits the average rate but also allows a certain degree of burst. In the token bucket algorithm, as long as there are tokens in the bucket, requests may pass in a burst, up to the configured threshold, so it suits traffic with bursty characteristics.
Comparison between leaky bucket and token bucket
The implementations of the leaky bucket and token bucket can be symmetrical, just pointed in opposite directions, and for the same parameters the average limiting effect is the same. The main difference is that the token bucket allows a degree of burst, while the leaky bucket's main purpose is to smooth the inflow rate. Consider a boundary scenario: 100 tokens have accumulated in the token bucket and can all be consumed in an instant; but because tokens are generated at a fixed rate, the token bucket allows an instantaneous burst of permitsPerSecond, never 2 * permitsPerSecond, and the leaky bucket's rate is always smooth.
Using RateLimiter to implement rate limiting
Google's open-source toolkit Guava provides a rate-limiting utility class, RateLimiter, which implements traffic limiting based on the token bucket algorithm and is easy to use. RateLimiter puts tokens into the bucket at a fixed rate, and a thread must obtain a token before it can proceed. For example, if you want your application's QPS not to exceed 1000, then a RateLimiter created with a rate of 1000 puts 1000 tokens into the bucket per second.
The APIs provided by RateLimiter can be used directly: acquire blocks until a permit is available, while tryAcquire, like the tryAcquire of JUC's Semaphore, is non-blocking:
import com.google.common.util.concurrent.RateLimiter;

public class RateLimiterTest {
    public static void main(String[] args) throws InterruptedException {
        // allow 10 permits per second (permitsPerSecond = 10)
        RateLimiter limiter = RateLimiter.create(10);
        for (int i = 1; i <= 20; i++) {
            // acquire blocks until a permit is available and returns the time spent waiting
            double waitTime = limiter.acquire();
            System.out.println("request " + i + " waited " + waitTime + "s");
        }
    }
}
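And a sketch of the non-blocking variant mentioned above: tryAcquire returns immediately, so over-limit requests can be failed fast instead of queued:

import com.google.common.util.concurrent.RateLimiter;

public class RateLimiterTryAcquireTest {
    public static void main(String[] args) {
        RateLimiter limiter = RateLimiter.create(10); // 10 permits per second
        for (int i = 1; i <= 20; i++) {
            if (limiter.tryAcquire()) {
                System.out.println("request " + i + " passed");
            } else {
                System.out.println("request " + i + " rejected"); // fail fast, no blocking
            }
        }
    }
}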