

What are the common Java rate-limiting algorithms?


This article explains the common Java rate-limiting algorithms. The content is simple, clear, and easy to learn; follow along to understand what the common rate-limiting algorithms are and how they differ.

What is rate limiting?

First of all, let's explain what rate limiting is.

Rate limits are common in daily life. Some scenic spots, for example, cap the number of tickets sold each day at, say, 2,000, meaning at most 2,000 visitors can enter per day.

What gets limited is "traffic", and the definition of "traffic" varies by scenario: it can be requests per second, transactions per second, network bandwidth, and so on.

Generally speaking, rate limiting means restricting the number of concurrent requests reaching the system, so that the system can handle some portion of user requests normally and stay stable.

Rate limiting inevitably slows down or rejects some users' requests, which hurts the user experience. So rate limiting has to strike a balance between user experience and system stability; this is the classic trade-off.

By the way, rate limiting is also called flow control.

Why rate limit?

As mentioned above, rate limiting exists to keep the system stable.

In everyday business there are scenarios like flash sales, Singles' Day promotions, or breaking news, where user traffic surges suddenly while the back-end's processing capacity stays fixed. If the burst cannot be handled properly, the back-end services are easily overwhelmed.

Then there is abnormal traffic such as crawlers. For any service we expose, we have to assume the worst of our callers: we never know how a caller will invoke the service. Suppose someone hammers your service 24 hours a day with dozens of threads; without protection the service is finished before doing any real work. And that is before we even get to DDoS attacks.

And many third-party open platforms must guard not only against abnormal traffic but also ensure fair use of resources: some interfaces are offered free of charge, so one caller cannot be allowed to hog them while everyone else waits their turn.

(Figure: description of the rate limits on the Gaode open platform)

Of course, if you pay more, everything is negotiable.

My previous company once built a system before its SaaS version was ready, so the system had to be deployed on the customer's premises.

The boss asked for a rate-limited, feature-degraded edition: not only was the system functionally degraded, with the core interface capped at 20 calls per day, we also had to restrict the configuration and number of servers the system ran on, that is, limit the CPU core count of the deployment machines and the total number of deployed servers, to stop the customer from boosting performance with a bigger cluster.

Of course, all of this had to be dynamically configurable, because, again, if you pay more it's all negotiable. The customer never knew.

My guess is the boss was waiting for the customer to complain that the system felt a bit slow, then roll out a version 2.0: "I had our R&D department work overtime just for you."

In summary, the essence of rate limiting is this: back-end processing capacity is finite, so requests beyond that capacity must be intercepted; or, resources must be shared fairly among clients so that no client starves the others.

Common rate-limiting algorithms

For each rate-limiting algorithm below I give a diagram and the corresponding pseudo-code; some people prefer pictures, others prefer code.

Counter-based rate limiting

The simplest rate-limiting algorithm is plain counting. Suppose the system can handle 100 requests at the same time: keep a counter, add one when a request begins processing, and subtract one after the request is done.

Every time a request arrives, check the counter's value; if it exceeds the threshold, reject the request.

Simple and crude. If the counter's value lives in one machine's memory, this is a single-machine rate limiter; if it lives in central storage such as Redis that every machine in the cluster accesses, it becomes a distributed rate limiter.

The advantage: simple and crude. On a single machine in Java you can use atomic classes such as AtomicInteger; distributed, you can use Redis INCR.

The disadvantage: suppose the allowed threshold is 10,000 and the counter currently reads 0; if 10,000 requests pour in within the very first second, the burst may be more than the system can withstand. A gradual ramp-up to a level and a sudden flood to the same level are not the same thing for a program.

Besides, the usual goal of rate limiting is to cap the number of visits within a specified time interval, which gives rise to an algorithm called the fixed window, covered right after the counter sketch below.

Counter rate limiting: pseudo-code implementation
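Below is a minimal Java sketch of the counting approach just described; the class and method names (CounterLimiter, tryAcquire, release) are illustrative, not from the original article.

```java
import java.util.concurrent.atomic.AtomicInteger;

// A minimal sketch of counter-based limiting: at most `limit` requests
// may be in flight at the same time.
public class CounterLimiter {
    private final int limit;
    private final AtomicInteger inFlight = new AtomicInteger(0);

    public CounterLimiter(int limit) {
        this.limit = limit;
    }

    // Called when a request arrives; returns false if it must be rejected.
    public boolean tryAcquire() {
        if (inFlight.incrementAndGet() > limit) {
            inFlight.decrementAndGet(); // roll back: over the threshold
            return false;
        }
        return true;
    }

    // Called after the request has been fully processed.
    public void release() {
        inFlight.decrementAndGet();
    }
}
```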

Fixed-window rate limiting

Compared with plain counting, the fixed window mainly adds the concept of a time window: each time the window rolls over, the counter resets. The rules are as follows:

- If the number of requests in the window is below the threshold, allow access and increment the counter by 1.
- If the number of requests has reached the threshold, deny access.
- When the time window ends, clear the counter.

Fixed-window rate limiting: pseudo-code implementation
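A minimal Java sketch of the fixed window under the rules above; the naming and the coarse synchronized locking are my illustrative choices, not the article's.

```java
// A minimal sketch of fixed-window limiting: the counter resets each
// time a new window starts.
public class FixedWindowLimiter {
    private final int limit;         // max requests per window
    private final long windowMillis; // window length, e.g. 1000 ms
    private long windowStart = System.currentTimeMillis();
    private int count = 0;

    public FixedWindowLimiter(int limit, long windowMillis) {
        this.limit = limit;
        this.windowMillis = windowMillis;
    }

    public synchronized boolean tryAcquire() {
        long now = System.currentTimeMillis();
        if (now - windowStart >= windowMillis) {
            windowStart = now; // new window: clear the counter
            count = 0;
        }
        if (count < limit) {
            count++;
            return true;
        }
        return false;
    }
}
```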

It looks perfect, but in fact it is flawed.

The fixed-window boundary problem

Suppose the system allows 100 requests per second and the first window is 0-1s. 100 requests pour in at 0.55s; after the 1-second window ends the count resets to zero; then another 100 requests pour in at 1.05s.

Neither window's count exceeded its threshold, yet 200 requests arrived within the 0.1s from 0.55s to 1.05s, which is unacceptable for a system whose threshold is 100 requests per second.

(Figure: the fixed-window boundary problem)

To solve this problem, sliding-window rate limiting was introduced.

Sliding-window rate limiting

Sliding-window limiting solves the fixed window's boundary problem: it guarantees the threshold is not exceeded in any time window of the given length.

Compared with the fixed window, the sliding window needs not only a counter but also the timestamp of every request inside the window, so it uses more memory.

The rules are as follows, assuming that the time window is 1 second:

- Record the time of each request, and count how many requests fall within the 1 second leading up to it; data older than 1 second can be deleted.
- If that count is below the threshold, record this request's time and let it through; otherwise reject it.

Sliding-window rate limiting: pseudo-code implementation
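A minimal Java sketch of the sliding window as described: a deque of timestamps stands in for "recording the time of each request". Names are illustrative.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// A minimal sketch of sliding-window limiting: remember the timestamp of
// every admitted request and count only those inside the last window.
public class SlidingWindowLimiter {
    private final int limit;
    private final long windowMillis;
    private final Deque<Long> timestamps = new ArrayDeque<>();

    public SlidingWindowLimiter(int limit, long windowMillis) {
        this.limit = limit;
        this.windowMillis = windowMillis;
    }

    public synchronized boolean tryAcquire() {
        long now = System.currentTimeMillis();
        // Drop timestamps that have fallen out of the window.
        while (!timestamps.isEmpty() && timestamps.peekFirst() <= now - windowMillis) {
            timestamps.pollFirst();
        }
        if (timestamps.size() < limit) {
            timestamps.addLast(now);
            return true;
        }
        return false;
    }
}
```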

However, neither the sliding window nor the fixed window can fend off traffic concentrated into a very short burst.

Say we limit to 100 requests per second. Ideally one request would arrive every 10ms so that processing stays smooth, but in real scenarios request timing is hard to control, so the whole threshold can be used up within 5ms.

Of course there are variants for this situation, such as stacking multiple limiting rules: not only at most 100 requests per second, but also at most 2 requests per 10ms.

By the way, this sliding window is different from TCP's sliding window: in TCP, the receiver tells the sender how much "cargo" it can accept, and the sender then controls its send rate accordingly.

Next, let's talk about the leaky bucket, which addresses the time-window algorithms' pain point and makes traffic smoother.

Leaky bucket algorithm

Picture a leaky bucket: water keeps dripping in at the top while flowing out at a constant rate through the bottom. If water drips in faster than it flows out, the bucket fills up and eventually overflows.

The rules are as follows:

- When a request arrives, put it into the bucket.
- If the bucket is already full, reject the request.
- The service takes requests out of the bucket at a fixed rate and processes them.

Leaky bucket: pseudo-code implementation
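A minimal Java sketch of a queue-backed leaky bucket matching the rules above; using a ScheduledExecutorService as the fixed-rate "leak" is one possible choice of mine, not something the article prescribes.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// A minimal sketch of a leaky bucket: requests queue up in a bounded
// bucket and a single worker drains it at a fixed rate.
public class LeakyBucketLimiter {
    private final BlockingQueue<Runnable> bucket;

    public LeakyBucketLimiter(int capacity, long leakIntervalMillis) {
        this.bucket = new ArrayBlockingQueue<>(capacity);
        ScheduledExecutorService leaker = Executors.newSingleThreadScheduledExecutor();
        // Take one request out of the bucket at a constant rate.
        leaker.scheduleAtFixedRate(() -> {
            Runnable request = bucket.poll();
            if (request != null) {
                request.run();
            }
        }, 0, leakIntervalMillis, TimeUnit.MILLISECONDS);
    }

    // Returns false when the bucket is full and the request must be rejected.
    public boolean trySubmit(Runnable request) {
        return bucket.offer(request);
    }
}
```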

The water drops correspond to requests. The leaky bucket's character is "lenient in, strict out": however many requests arrive and however fast they come, they flow out at a fixed rate, matching the service processing requests at a fixed rate. However hard the traffic pushes, the bucket keeps its own pace.

Ring a bell? It is much like a message queue's idea of shaving peaks and filling valleys. In fact, leaky buckets are generally implemented with a queue: requests that cannot be handled yet wait in line, and new requests are rejected once the queue is full. And doesn't that in turn look exactly like a thread pool's bounded work queue?

Filtered through such a leaky bucket, requests flow out smoothly, which looks perfect. In reality, its advantage is also its disadvantage.

Facing a sudden burst, the service still processes at its everyday pace, which is not really what we want. Under burst traffic we would like the system to stay stable yet also handle requests faster, improving the user experience, rather than treating the burst exactly like normal traffic. (Notice the irony: the sliding window was faulted because traffic was not smooth enough, and now the leaky bucket is too smooth.)

Token buckets can be more "aggressive" when dealing with burst traffic.

Token bucket algorithm

The token bucket's principle is similar to the leaky bucket's, except the fixed rate points the other way: the leaky bucket lets requests out at a fixed rate, while the token bucket has tokens stuffed into it at a fixed rate, and a request may pass, and be handed to the server for processing, only after it obtains a token.

Of course, the token bucket's size is limited too: once the bucket is full of tokens, tokens still being generated at the fixed rate are discarded.

Rules:

- Tokens are put into the bucket at a fixed rate; once the number of tokens exceeds the bucket's capacity, the excess is discarded.
- When a request arrives, it first asks the bucket for a token: if it gets one, it is processed; otherwise it is rejected.

What does this remind you of? Semaphore. A semaphore controls the number of callers accessing a resource at the same time, which is essentially the same idea as holding a token: one acquires a permit, the other acquires a token. The difference is that a semaphore permit is returned when the caller finishes, whereas tokens are not returned, because tokens are replenished on a schedule.
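To make the analogy concrete, here is a small gate built on java.util.concurrent.Semaphore; the class name SemaphoreGate and the permit count are made up for illustration.

```java
import java.util.concurrent.Semaphore;

public class SemaphoreGate {
    // At most 10 callers may be inside at the same time. Unlike token-bucket
    // tokens, the permit is given back when the caller finishes.
    private final Semaphore permits = new Semaphore(10);

    public boolean handle(Runnable request) {
        if (!permits.tryAcquire()) {
            return false;      // no permit available: reject
        }
        try {
            request.run();     // do the real work
            return true;
        } finally {
            permits.release(); // return the permit
        }
    }
}
```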

Put the token bucket's pseudo-code next to the leaky bucket's and you can see their difference boils down to one doing addition and the other doing subtraction.

Token bucket: pseudo-code implementation
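A minimal Java sketch of a token bucket. Instead of a background thread adding tokens, this version lazily computes how many tokens have accumulated since the last call; that is a common implementation trick and my assumption, not something the article specifies.

```java
// A minimal sketch of a token bucket with lazy refill.
public class TokenBucketLimiter {
    private final long capacity;         // max tokens the bucket can hold
    private final double tokensPerMilli; // refill rate
    private double tokens;
    private long lastRefill = System.currentTimeMillis();

    public TokenBucketLimiter(long capacity, double tokensPerSecond) {
        this.capacity = capacity;
        this.tokensPerMilli = tokensPerSecond / 1000.0;
        this.tokens = capacity; // start full (a "pre-warmed" bucket)
    }

    public synchronized boolean tryAcquire() {
        long now = System.currentTimeMillis();
        // Add the tokens generated since the last call, capped at capacity.
        tokens = Math.min(capacity, tokens + (now - lastRefill) * tokensPerMilli);
        lastRefill = now;
        if (tokens >= 1) {
            tokens -= 1;
            return true;
        }
        return false;
    }
}
```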

As you can see, when a burst hits a token bucket holding 100 tokens, all 100 can be taken immediately, rather than being doled out at a uniform rate as with a leaky bucket. That is why token buckets handle burst traffic better.

Summary of rate-limiting algorithms

The algorithms described above are only the crudest implementations and the most essential ideas; in engineering practice there are many variants.

From the above, leaky buckets and token buckets seem far better than the time-window algorithms, so what are the latter still good for? Throw them away?

No. Although leaky buckets and token buckets shape traffic better and produce a smoother flow than the time-window algorithms, they have their own shortcomings (some of which were mentioned above).

Take the token bucket: if you go online without pre-warming it, the bucket holds no tokens at first, so early requests are rejected simply for lack of tokens. Those requests are wrongly killed, since the system is under hardly any load at that point.

Another example: requests actually arrive at random. Suppose the token bucket gains one token every 20ms and starts empty, and it happens that two requests arrive in the first 20ms and none in the next 20ms. Looking at the whole 40ms, only two requests came and both deserved to pass, yet one was rejected outright. This can wrongly kill many requests, even though the monitoring curve shows smooth traffic with the peak well under control.

Take the leaky bucket: requests sit buffered in the bucket, which does not meet the low-latency requirement of most Internet-facing business.

So leaky buckets and token buckets are actually better suited to blocking rate-limiting scenarios: when no token is available the caller waits, and nothing is wrongly killed; the leaky bucket, with its queued waiting, suits rate-limiting background tasks. Time-window limiting suits latency-sensitive scenarios, where a caller would rather get an immediate "no" than wait around.

Single-machine and distributed rate limiting

In essence, the difference between single-machine and distributed rate limiting lies in where the "threshold" (and the counter behind it) is kept.

The algorithms above can be implemented directly on a single machine, but our services are usually deployed as clusters, so multiple machines need to cooperate to provide rate limiting as a whole.

With the counter or time-window algorithms above, for instance, the counter can be stored in a distributed K-V store such as Tair or Redis.

For example, the per-request timestamps of a sliding window can be stored in a Redis zset: use ZREMRANGEBYSCORE to delete the data outside the time window, then ZCARD to count what remains.
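As a hedged sketch of that zset approach, assuming the Jedis client; note that in production the three commands should be wrapped in a Lua script or MULTI/EXEC so the check-and-add is atomic.

```java
import redis.clients.jedis.Jedis;

// Distributed sliding window on a Redis zset (sketch, not production-ready).
public class RedisSlidingWindowLimiter {
    private final Jedis jedis;
    private final String key; // e.g. "ratelimit:orders-api" (illustrative)
    private final int limit;
    private final long windowMillis;

    public RedisSlidingWindowLimiter(Jedis jedis, String key, int limit, long windowMillis) {
        this.jedis = jedis;
        this.key = key;
        this.limit = limit;
        this.windowMillis = windowMillis;
    }

    public boolean tryAcquire() {
        long now = System.currentTimeMillis();
        // Drop entries that fell out of the window.
        jedis.zremrangeByScore(key, 0, now - windowMillis);
        if (jedis.zcard(key) < limit) {
            // Score and member are both based on the timestamp; a random
            // suffix avoids collisions within the same millisecond.
            jedis.zadd(key, now, now + ":" + Math.random());
            return true;
        }
        return false;
    }
}
```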

Likewise, a token bucket's token count can be kept in Redis.

However, this means every request must make a round trip to Redis to decide whether it may pass, which costs some performance. Hence an optimization point: "batching". Instead of taking one token at a time, take a batch, and fetch another batch only when the current one runs out. This cuts the number of requests to Redis.

Note, though, that batching introduces some rate-limiting error. For example, a machine may take 10 tokens now, leave them unused, and spend them in the next second; at that moment the cluster's total processing may exceed the threshold.
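A hedged sketch of the batching idea, again assuming Jedis and a Redis key that some other job periodically resets to the window's quota (that refill logic is omitted here):

```java
import java.util.concurrent.atomic.AtomicLong;
import redis.clients.jedis.Jedis;

// Each machine takes permits from Redis in chunks of `batchSize` and serves
// requests from its local stock, trading accuracy for fewer round trips.
public class BatchedRedisLimiter {
    private final Jedis jedis;
    private final String key; // Redis counter of remaining permits (illustrative)
    private final long batchSize;
    private final AtomicLong local = new AtomicLong(0);

    public BatchedRedisLimiter(Jedis jedis, String key, long batchSize) {
        this.jedis = jedis;
        this.key = key;
        this.batchSize = batchSize;
    }

    public boolean tryAcquire() {
        if (local.getAndUpdate(n -> n > 0 ? n - 1 : 0) > 0) {
            return true; // served from the local batch
        }
        long remaining = jedis.decrBy(key, batchSize);
        if (remaining < 0) {
            // Conservative: give the whole batch back and reject, even if a
            // partial batch was still available.
            jedis.incrBy(key, batchSize);
            return false;
        }
        local.addAndGet(batchSize - 1); // keep one permit for this request
        return true;
    }
}
```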

The "batch" optimization really is everywhere: MySQL's batched flushing, Kafka's batched message sending, high-performance distributed ID generation, they all embody the idea of batching.

Another idea for distributed limiting is to divide the quota evenly: if the limit used to be 500 on one machine and the cluster now has 5 machines, let each machine keep limiting itself to 500, that is, decide the overall cap at the entry point and then let every machine enforce its own share with a single-machine limiter.

The hard part of rate limiting

As we have seen, every rate limit involves a threshold, and determining that threshold is the hard part.

Set it too high and the servers may not hold up; set it too low and requests get "wrongly killed", resources are underused, and the user experience suffers.

The only approach I can think of: after the limiter goes live, estimate a rough threshold but do not enforce it at first; log instead of actually limiting, analyze the logs to see what the limiter would have done, then adjust the threshold and work out the cluster's total processing capacity and each machine's share (which also makes scaling out and in easier).

Then replay production traffic to test the real limiting behavior, settle on the final threshold, and only then enforce it online.

I also once read an article by Uncle Mouse saying that with auto-scaling it is very hard to adjust the rate-limiting threshold dynamically, so, borrowing ideas from TCP congestion control, a server's health can be judged from the P90 or P99 response time of requests within a time window. This algorithm is implemented in his Ease Gateway product; interested readers can search for it.

Real business scenarios are complex: there are many rate-limiting conditions and resources, and each resource's requirements differ. So in this respect I am just an armchair expert.

Rate-limiting components

Generally speaking, we do not need to implement the rate-limiting algorithms ourselves. Whether for access-layer limiting or fine-grained per-interface limiting, ready-made wheels exist, and their implementations use exactly the algorithms discussed above.

For example, Google Guava provides the rate-limiting utility RateLimiter, which is based on the token bucket and extends the algorithm to support warm-up.
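For reference, basic RateLimiter usage looks like this; the rates and the warm-up period below are arbitrary example values.

```java
import java.util.concurrent.TimeUnit;
import com.google.common.util.concurrent.RateLimiter;

public class GuavaDemo {
    public static void main(String[] args) {
        // 100 permits per second, smoothly spread out.
        RateLimiter limiter = RateLimiter.create(100.0);

        if (limiter.tryAcquire()) { // non-blocking: false if no permit now
            System.out.println("request handled");
        }

        limiter.acquire(); // blocking: waits until a permit frees up

        // Warm-up variant: the permitted rate ramps up over the first
        // 3 seconds so a freshly started service is not hit at full speed.
        RateLimiter warm = RateLimiter.create(100.0, 3, TimeUnit.SECONDS);
        warm.acquire();
    }
}
```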

The uniform-queueing strategy in Ali's open-source rate-limiting framework Sentinel is based on the leaky bucket algorithm.

Nginx's rate-limiting module limit_req_zone uses the leaky bucket algorithm too, as does the resty.limit.req library in OpenResty, among others.

All of these are simple to use; interested readers can look them up on their own, and those curious about the internals can read the source to see how production-grade rate limiting is implemented.

Thank you for reading. That covers the common Java rate-limiting algorithms; after studying this article you should have a deeper understanding of them, and concrete usage still needs to be verified in practice. The editor will keep pushing more articles on related topics; welcome to follow!
