In this issue, the editor looks at how to analyze rate limiters and the rate limiter in Guava. The article approaches the topic from a practical, professional point of view; I hope you get something out of it after reading.
Rate limiting
The term rate limiting is often used in computer networking and is defined as follows:
In computer networks, rate limiting is used to control the rate of traffic sent or received by a network interface controller and is used to prevent DoS attacks.
That is, rate limiting prevents possible DoS attacks by controlling the rate at which network data is sent or received. In actual software services, rate limiting can also be used to protect API services: because computer resources (CPU, memory, disk, network bandwidth, etc.) are limited, the QPS an API service can sustain is also limited. A rate limiter restricts access to the API through a rate limiting algorithm, ensuring that the service is not pushed beyond the load it can bear.
This article covers:
A brief introduction to and comparison of commonly used rate limiting algorithms
An analysis of the implementation of the rate limiter in the Guava package
Common rate limiting algorithms
Citing the Algorithms section of the Wikipedia entry on rate limiting, the commonly used rate limiting algorithms mainly include:
Token bucket
Leaky bucket
Fixed window counter
Sliding window log
Sliding window counter
In fact, the methods above can be roughly divided into counting algorithms, the leaky bucket algorithm, and the token bucket algorithm.
Counting-based rate limiting algorithms
Whether fixed window or sliding window, the core is to count requests; the only difference lies in how the counting time interval is handled.
Implementation principle of fixed window counting
The idea of fixed window counting is relatively simple: only two parameters need to be determined, the counting period T and the maximum number of accesses (calls) N within the period. When a request arrives, check whether the current period has expired; if it has, start a new period and reset the counter; then increment the counter and trigger the rate limiting policy once the count exceeds N.
Fixed window counting is simple to implement: it only needs to record the start time of the current period and the total number of accesses within it, consuming almost no additional storage. A minimal sketch of this idea follows.
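The sketch below uses illustrative class and field names, not taken from any library:

// A minimal fixed window counter sketch; names are illustrative only.
public class FixedWindowLimiter {
    private final long periodMillis;   // counting period T
    private final int maxRequests;     // maximum number of requests N per period
    private long windowStartMillis;    // start time of the current period
    private int count;                 // requests counted in the current period

    public FixedWindowLimiter(long periodMillis, int maxRequests) {
        this.periodMillis = periodMillis;
        this.maxRequests = maxRequests;
        this.windowStartMillis = System.currentTimeMillis();
    }

    public synchronized boolean tryAcquire() {
        long now = System.currentTimeMillis();
        if (now - windowStartMillis >= periodMillis) {
            // period switch: the previous period's count is discarded
            windowStartMillis = now;
            count = 0;
        }
        if (count < maxRequests) {
            count++;
            return true;   // within the limit
        }
        return false;      // limit reached: trigger the rate limiting policy
    }
}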
Algorithm defect
The disadvantage of fixed window counting is also obvious: at a period switch, the total count of the previous period is immediately reset to 0, which can lead to a traffic burst around the switch, since up to 2N requests may be accepted within a span shorter than T that straddles two periods.
Sliding window counting
On top of the data recorded by fixed window counting, sliding window counting adds a count array of length M that records the access count of each small window as the window slides. A minimal sketch is given below.
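The sketch is again illustrative; each slot of the array stores the count of one small window:

// A minimal sliding window counter sketch; names are illustrative only.
public class SlidingWindowLimiter {
    private final long bucketMillis;    // length of one small window: T / M
    private final int maxRequests;      // maximum number of requests N over the whole period T
    private final int[] counts;         // counter for each small window
    private final long[] bucketIndex;   // which time slice each slot currently represents

    public SlidingWindowLimiter(long periodMillis, int windows, int maxRequests) {
        this.bucketMillis = periodMillis / windows;
        this.maxRequests = maxRequests;
        this.counts = new int[windows];
        this.bucketIndex = new long[windows];
    }

    public synchronized boolean tryAcquire() {
        long currentBucket = System.currentTimeMillis() / bucketMillis;
        int slot = (int) (currentBucket % counts.length);
        if (bucketIndex[slot] != currentBucket) {
            // this slot holds an expired window: reuse it for the current window
            bucketIndex[slot] = currentBucket;
            counts[slot] = 0;
        }
        // sum the counts of all windows still inside the sliding period
        int total = 0;
        for (int i = 0; i < counts.length; i++) {
            if (currentBucket - bucketIndex[i] < counts.length) {
                total += counts[i];
            }
        }
        if (total < maxRequests) {
            counts[slot]++;
            return true;
        }
        return false;   // limit reached within the sliding period
    }
}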
Leaky bucket algorithm
The simple description is as follows: set the leaky bucket's outflow rate and its total capacity in advance; when a request arrives, check whether the bucket is full. If it is not full, store the request in the bucket, otherwise discard the request. The program takes requests out of the bucket at the set rate for processing.
According to this description, the parameters to determine are the leaky bucket outflow rate r and the leaky bucket capacity N. A minimal sketch is given below.
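The sketch uses a bounded queue as the bucket and a scheduled task as the constant-rate consumer; all names are illustrative:

import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// A minimal leaky bucket sketch; names and structure are illustrative only.
public class LeakyBucketLimiter {
    private final BlockingQueue<Runnable> bucket;   // leaky bucket with capacity N
    private final ScheduledExecutorService leaker = Executors.newSingleThreadScheduledExecutor();

    public LeakyBucketLimiter(int capacity, long leakIntervalMillis) {
        this.bucket = new ArrayBlockingQueue<>(capacity);
        // take one request out of the bucket at the fixed outflow rate r
        leaker.scheduleAtFixedRate(() -> {
            Runnable request = bucket.poll();
            if (request != null) {
                request.run();
            }
        }, leakIntervalMillis, leakIntervalMillis, TimeUnit.MILLISECONDS);
    }

    // returns false (the request is dropped) when the bucket is already full
    public boolean submit(Runnable request) {
        return bucket.offer(request);
    }
}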
Algorithm characteristics
The token bucket algorithm can control the average request rate by setting the token generation rate, and because the bucket of capacity N caches tokens, it can tolerate a certain amount of burst traffic. A minimal sketch is given below.
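The sketch below refills tokens lazily on each call and is purely illustrative; it is unrelated to Guava's internal classes analyzed later:

// A minimal token bucket sketch; names are illustrative only.
public class TokenBucketLimiter {
    private final double ratePerMillis;   // token generation rate r, in tokens per millisecond
    private final double capacity;        // token bucket capacity N
    private double tokens;                // tokens currently stored
    private long lastRefillMillis;        // last time tokens were added

    public TokenBucketLimiter(double tokensPerSecond, double capacity) {
        this.ratePerMillis = tokensPerSecond / 1000.0;
        this.capacity = capacity;
        this.tokens = capacity;                        // start with a full bucket
        this.lastRefillMillis = System.currentTimeMillis();
    }

    public synchronized boolean tryAcquire(int permits) {
        long now = System.currentTimeMillis();
        // lazily add the tokens generated since the last call, capped at the capacity
        tokens = Math.min(capacity, tokens + (now - lastRefillMillis) * ratePerMillis);
        lastRefillMillis = now;
        if (tokens >= permits) {
            tokens -= permits;
            return true;   // enough tokens: the request passes
        }
        return false;      // not enough tokens: trigger the rate limiting policy
    }
}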
Algorithm comparison
This section compares the four main algorithms discussed above.
Fixed window counting
Parameters: counting period T; maximum number of accesses N within the period
Implementation: a counter accumulates accesses within the period, and the rate limiting policy is triggered once the maximum count is reached
Space complexity: O(1); only the access count within the period and the period start time are recorded
Notes: the number of accesses around a period switch may exceed the limit

Sliding window counting
Parameters: period T; maximum number of accesses N within the period; number of sliding windows M
Implementation: the period is divided into M small windows, the accesses in each small window are counted separately, and expired small windows are dropped as time slides
Space complexity: O(M); the access count of each small window is recorded
Notes: solves the burst problem at period switches of the fixed window algorithm

Leaky bucket algorithm
Parameters: leaky bucket outflow rate r; leaky bucket capacity N
Implementation: an arriving request is put directly into the bucket; if the current capacity has reached N, the rate limiting policy is triggered; the program takes requests out of the bucket at rate r until the bucket is empty
Space complexity: O(1); only the current occupancy of the bucket is recorded
Notes: smooths traffic so that requests reach the serving side at a constant rate

Token bucket algorithm
Parameters: token generation rate r; token bucket capacity N
Implementation: the program adds tokens to the bucket at rate r until the bucket is full; an arriving request asks the bucket for tokens, and if enough tokens are available the request passes, otherwise the rate limiting policy is triggered
Space complexity: O(1); only the number of tokens currently in the bucket is recorded
Notes: can handle a certain amount of burst requests on top of rate limiting

Analysis of the implementation of the rate limiter in the Guava package
The commonly used rate limiting algorithms were briefly introduced above. In Java software development, the rate limiter in the Guava package can be used to limit service traffic. In the class diagram of the Guava rate limiter, the RateLimiter class takes on the responsibilities of both the builder and the product.
Simple use example
In actual coding, the Guava rate limiter can be used as in the following code:
final RateLimiter rateLimiter = RateLimiter.create(2.0); // rate is "2 permits per second"

void submitTasks(List<Runnable> tasks, Executor executor) {
  for (Runnable task : tasks) {
    rateLimiter.acquire(); // may wait
    executor.execute(task);
  }
}
The code above is taken from the documentation of the RateLimiter class in the Guava package. First, the create function builds a rate limiter and specifies that 2 tokens are generated per second; then the acquire function is used to get tokens whenever the service needs to be called.
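Besides the blocking acquire, the RateLimiter class also offers a non-blocking tryAcquire, which is useful when requests over the limit should be rejected instead of delayed; handleRequest and rejectRequest below are hypothetical placeholders:

if (rateLimiter.tryAcquire()) {
  // a permit was obtained without waiting: handle the request
  handleRequest();   // hypothetical handler
} else {
  // no permit available: apply the rate limiting policy (reject, queue, degrade, ...)
  rejectRequest();   // hypothetical handler
}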
Analysis of RateLimiter implementation
As the code example shows, the abstract class RateLimiter, which plays the product role, defines the API exposed to programmers; the core functions are create and acquire, so the analysis starts from these two entry points.
Create function analysis
The create function has two overloads, and a different concrete RateLimiter subclass is created depending on which overload is used. At present the returned implementation subclasses are SmoothBursty and SmoothWarmingUp, which are analyzed in detail below.
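For illustration, the two overloads can be used as follows (the numbers here are arbitrary); the simple overload is backed by SmoothBursty, while the overload with a warmup period is backed by SmoothWarmingUp:

// create(double) returns a limiter backed by SmoothBursty: 5 permits per second
RateLimiter bursty = RateLimiter.create(5.0);

// create(double, long, TimeUnit) returns a limiter backed by SmoothWarmingUp:
// 5 permits per second with a 3-second warmup period
RateLimiter warmingUp = RateLimiter.create(5.0, 3, TimeUnit.SECONDS);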
Acquire function analysis
The acquire function also has two overloads, but the analysis only needs to consider the overload that takes an int parameter; the parameterless overload is simply shorthand for acquire(1).
The acquire(int permits) function does three main things (a simplified sketch follows the list):
Pre-allocate the requested number of permits; this step returns the time to wait, which may be 0.
Sleep for the wait time.
Return the time spent obtaining the permits, in seconds.
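Put together, the flow roughly matches the following paraphrase of the Guava source (the real code statically imports SECONDS):

// Simplified paraphrase of RateLimiter.acquire(int); not the verbatim Guava source.
public double acquire(int permits) {
  long microsToWait = reserve(permits);                 // 1. pre-allocate permits, get the time to wait (may be 0)
  stopwatch.sleepMicrosUninterruptibly(microsToWait);   // 2. sleep for the wait time
  return 1.0 * microsToWait / TimeUnit.SECONDS.toMicros(1L); // 3. return the time spent waiting, in seconds
}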
In carrying out this work, the RateLimiter class fixes the overall flow for obtaining permits and implements the general methods; algorithm-specific behavior is delegated to abstract methods that concrete subclasses override according to their requirements.
Among these abstract methods, reserveEarliestAvailable needs to be implemented by a subclass. The rest of this article takes this function as the core and analyzes how the RateLimiter subclasses implement it.
Subclass implementation analysis
Code analysis
As the class diagram shows, SmoothRateLimiter is the only direct subclass of RateLimiter in the Guava package, so the reserveEarliestAvailable function is used as the entry point to study its concrete implementation. The implementation relies on several attributes of the SmoothRateLimiter class, which are listed here:
storedPermits: the number of tokens currently stored in the token bucket (instance attribute)
maxPermits: the maximum number of tokens the token bucket can hold (instance attribute)
stableIntervalMicros: the stable interval between generating two tokens (instance attribute)
nextFreeTicketMicros: the next time at which a free token will be generated (instance attribute)
The reserveEarliestAvailable source code is as follows:
@Override
final long reserveEarliestAvailable(int requiredPermits, long nowMicros) {
  resync(nowMicros);
  long returnValue = nextFreeTicketMicros;
  double storedPermitsToSpend = min(requiredPermits, this.storedPermits);
  double freshPermits = requiredPermits - storedPermitsToSpend;
  long waitMicros =
      storedPermitsToWaitTime(this.storedPermits, storedPermitsToSpend)
          + (long) (freshPermits * stableIntervalMicros);
  this.nextFreeTicketMicros = LongMath.saturatedAdd(nextFreeTicketMicros, waitMicros);
  this.storedPermits -= storedPermitsToSpend;
  return returnValue;
}
As the name reserveEarliestAvailable suggests, this function returns the earliest time at which the requested tokens become available. Its input parameters are the required number of tokens, requiredPermits, and the current time, nowMicros. On entry it first calls a function named resync:
void resync(long nowMicros) {
  // if nextFreeTicket is in the past, then resync to now
  if (nowMicros > nextFreeTicketMicros) {
    double newPermits = (nowMicros - nextFreeTicketMicros) / coolDownIntervalMicros();
    storedPermits = min(maxPermits, storedPermits + newPermits);
    nextFreeTicketMicros = nowMicros;
  }
}
The logic of resync is relatively simple. It first reads nextFreeTicketMicros; as explained in the table above, this value is the next time a free token will be generated. If the current time is less than or equal to nextFreeTicketMicros, no new token can be generated yet and the function returns directly. If the current time is greater than nextFreeTicketMicros, it does the following:
Calculate the number of tokens newly generated by the current time. This uses a function called coolDownIntervalMicros, which returns the cooldown time required to generate one new token (note: this function is not implemented in the current class; see below).
Update the storedPermits attribute to the minimum of the accumulated tokens and the maximum number of storable tokens.
Set the nextFreeTicketMicros attribute to the current time.
In short, resync calculates the number of newly generated tokens and updates nextFreeTicketMicros; after the call, the value of nextFreeTicketMicros is the maximum of the current time and its original value.
After resync returns, the returnValue variable records the earliest available time after the token update (that is, the nextFreeTicketMicros value updated above).
The storedPermitsToSpend variable records the number of stored tokens to consume, which is the minimum of the number of requested tokens and the number of tokens currently stored.
The freshPermits variable records the number of tokens that still need to be freshly generated, obtained by subtracting storedPermitsToSpend from the number of required tokens. Its value is therefore the shortfall between the required tokens and the stored tokens, and 0 when the required tokens do not exceed the stored tokens.
Next comes the core of the function: calculating the wait time, which consists of two parts, the time needed to consume the stored tokens and the time needed to generate the fresh tokens. The time needed to consume the stored tokens is computed by storedPermitsToWaitTime, which is also an abstract function.
After the wait time is calculated, nextFreeTicketMicros is updated to the earliest available time plus the time that must be waited.
Finally, the stored token count is reduced by the number of tokens consumed.
Characteristics of the implementation logic
From the code analysis above, Guava's token bucket implementation differs slightly from the textbook version: the number of tokens consumed by the current request does not affect the wait time of this request, but it does affect the wait time of the next request.
When a request arrives, the earliest available time returned is the maximum of the current time and the available time computed by the previous request, and the number of tokens required by this request pushes forward the next available time. With this design, if one token is generated per second and the first request needs 10 tokens, the first call to acquire returns immediately, while the next request (no matter how many tokens it needs) has to wait 10 seconds after the first one, and the wait of the third request depends on how many tokens the second one asked for. This is also the meaning of "reserve" in the function name.
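This reservation behavior can be observed with a small example; the numbers below are chosen for illustration and the waits are approximate:

RateLimiter limiter = RateLimiter.create(1.0);   // 1 token per second
double w1 = limiter.acquire(10);   // ~0 s: the first request is served immediately
double w2 = limiter.acquire(1);    // ~10 s: it pays for the tokens reserved by the first request
double w3 = limiter.acquire(1);    // ~1 s: it pays for the single token of the second request
System.out.println(w1 + ", " + w2 + ", " + w3);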
Abstract function analysis
In the above code analysis, there are two abstract functions coolDownIntervalMicros and storedPermitsToWaitTime, which are analyzed now.
The coolDownIntervalMicros function
The coolDownIntervalMicros function is already described in the code:
Returns the number of microseconds during cool down that we have to wait to get a new permit.
Its meaning is essentially the time it takes to generate one token, and it is mainly used to calculate how many tokens can be generated by the current time. According to the UML diagram above, the SmoothRateLimiter class has two subclasses, SmoothBursty and SmoothWarmingUp.
The implementation of the coolDownIntervalMicros function in the SmoothBursty class is as follows:
@Override
double coolDownIntervalMicros() {
  return stableIntervalMicros;
}
As you can see, the implementation is very simple: it just returns the stableIntervalMicros attribute, that is, the stable interval between generating two consecutive tokens.
The implementation of the coolDownIntervalMicros function in the SmoothWarmingUp class is as follows:
@Override
double coolDownIntervalMicros() {
  return warmupPeriodMicros / maxPermits;
}
The maxPermits attribute appeared earlier and represents the maximum capacity of the token bucket. The warmupPeriodMicros attribute is specific to the SmoothWarmingUp class and represents the time it takes for the stored tokens to go from 0 to maxPermits, so warmupPeriodMicros / maxPermits represents the token generation interval while the stored count has not yet reached maxPermits.
The storedPermitsToWaitTime function
The storedPermitsToWaitTime function is already described in the code:
Translates a specified portion of our currently stored permits which we want to spend/acquire, into a throttling time. Conceptually, this evaluates the integral of the underlying function we use, for the range of [(storedPermits - permitsToTake), storedPermits].
Primarily represents the time it takes to consume tokens stored in a token bucket.
The implementation of the storedPermitsToWaitTime function in the SmoothBursty class is as follows:
@Override
long storedPermitsToWaitTime(double storedPermits, double permitsToTake) {
  return 0L;
}
It simply returns 0, indicating that consuming stored tokens takes no time.
The implementation of the storedPermitsToWaitTime function in the SmoothWarmingUp class is as follows:
@Override
long storedPermitsToWaitTime(double storedPermits, double permitsToTake) {
  double availablePermitsAboveThreshold = storedPermits - thresholdPermits;
  long micros = 0;
  // measuring the integral on the right part of the function (the climbing line)
  if (availablePermitsAboveThreshold > 0.0) {
    double permitsAboveThresholdToTake = min(availablePermitsAboveThreshold, permitsToTake);
    // TODO(cpovirk): Figure out a good name for this variable.
    double length =
        permitsToTime(availablePermitsAboveThreshold)
            + permitsToTime(availablePermitsAboveThreshold - permitsAboveThresholdToTake);
    micros = (long) (permitsAboveThresholdToTake * length / 2.0);
    permitsToTake -= permitsAboveThresholdToTake;
  }
  // measuring the integral on the left part of the function (the horizontal line)
  micros += (long) (stableIntervalMicros * permitsToTake);
  return micros;
}
This implementation is more complex. Its core idea is that the cost of consuming the currently stored tokens is treated differently according to the warmup setting. It involves a new variable, thresholdPermits, the token threshold: when the number of stored tokens is above this value, the tokens in the range from thresholdPermits up to storedPermits go through a warmup process (the interval for consuming each token decreases gradually), while tokens in the 0 to thresholdPermits range each cost a fixed time to consume, namely stableIntervalMicros.
The value of thresholdPermits takes into account the warmup time and the token generation rate: thresholdPermits = 0.5 * warmupPeriodMicros / stableIntervalMicros. The threshold is thus half of the number of tokens that could be generated at the stable rate during the warmup period. According to the comments in the source, the time to consume tokens above the threshold can be computed as the trapezoidal area (actually an integral) under the warmup curve, which is not expanded in detail here. A small worked computation follows.
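The numbers below are chosen purely for illustration: one token per second with a 4-second warmup period.

// Illustrative numbers only: 1 token per second, 4-second warmup.
double stableIntervalMicros = 1_000_000.0;   // 1 s per token
double warmupPeriodMicros   = 4_000_000.0;   // 4 s warmup period
double thresholdPermits = 0.5 * warmupPeriodMicros / stableIntervalMicros;   // = 2.0
// Stored tokens 0..2 cost the stable 1 s each; stored tokens above 2 cost extra time,
// computed as the trapezoidal (integral) area under the warmup curve.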
This design ensures that when the interval since the last request is long, more tokens are stored in the bucket; when these stored tokens are then consumed, the first tokens take longer, and the per-token time gradually shortens until it reaches stableIntervalMicros, producing a warmup effect. The sketch below illustrates the observable behavior.
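Parameter values in this demo are arbitrary; the printed waits are approximate:

import com.google.common.util.concurrent.RateLimiter;
import java.util.concurrent.TimeUnit;

public class WarmupDemo {
  public static void main(String[] args) throws InterruptedException {
    // 2 permits/second with a 3-second warmup period (illustrative numbers)
    RateLimiter limiter = RateLimiter.create(2.0, 3, TimeUnit.SECONDS);
    Thread.sleep(5000); // stay idle so the token bucket refills and the limiter goes "cold"
    for (int i = 0; i < 8; i++) {
      // Because of the reservation model the first call returns almost immediately;
      // later calls wait longer at first, then the wait shrinks toward the stable 0.5 s.
      System.out.println("waited " + limiter.acquire() + " s");
    }
  }
}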
Summary of the Guava rate limiter implementation
The analysis above covered the Guava rate limiter implementation, which uses two abstract classes and two concrete subclasses. The top-level abstract class plays the role of the creator, so all the subclasses remain transparent to callers, reducing the number of classes a programmer needs to know about when using the tool.
The rate limiter is based on the token bucket idea, with an additional token bucket variant that includes a warmup phase. A rate-limited thread sleeps through Guava's own SleepingStopwatch utility, which ultimately uses the Thread.sleep(ms, ns) method; note that a thread sleeping this way does not release any locks it holds, which is worth keeping in mind in multithreaded code.
Finally, Guava's rate limiter uses a token reservation approach: the number of tokens required by the current request does not affect the wait time of the current request, but it does affect the wait time of the next request.