2025-01-18 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/03 Report--
This article explains how to optimize batch processing interfaces. The approach described here is simple, fast, and practical, so let's walk through it.
Background
Like batch import, batch processing interfaces are common in our system: batch retrieval of waybills, batch outbound processing, batch printing, and so on. There are about ten such interfaces in total.
These requests often have the following characteristics:
Processing a single item takes a long time, generally 200 ms or more.
The batch is large. For example, the largest page size in our system is 1000 records, so the largest batch a user can select is 1000 items.
The overall time is therefore long: at 200 ms per item and 1000 items, the total comes to 200 s, far too long for a synchronous request.
The individual items cannot be merged and processed together.
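To make those numbers concrete, here is a back-of-the-envelope estimate of sequential versus divided processing. The figure of 10 consumer instances is an assumption for illustration, not from the original text:

```python
# Rough estimate using the figures above: ~200 ms per item, up to 1000 items.
PER_ITEM_SECONDS = 0.2   # ~200 ms per item, from the text
BATCH_SIZE = 1000        # largest batch a user can select

sequential = PER_ITEM_SECONDS * BATCH_SIZE   # one process handles everything: 200 s
CONSUMERS = 10                               # hypothetical number of service instances
parallel = sequential / CONSUMERS            # ideal even split across consumers: 20 s

print(f"sequential: {sequential:.0f}s, across {CONSUMERS} consumers: {parallel:.0f}s")
# → sequential: 200s, across 10 consumers: 20s
```

The ideal split ignores dispatch and coordination overhead, but it shows why spreading the work across machines is worth the extra moving parts.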
So it is worth optimizing the performance of these batch processing interfaces.
But how?
How to optimize
We know that a single machine's capacity is limited. A batch request like this consumes a lot of memory and CPU, and handling everything in one process only stretches the overall processing time. So for batch requests, the best approach is divide and conquer.
What is divide and conquer?
Divide and conquer is useful in many scenarios. The batch import we covered in the previous article, for example, is generally split into four parts:
Receive the request
Distribute the requests
Process the requests
Summarize the results
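The four parts can be sketched end to end as a minimal, single-process Python simulation. In the real design, "distribute" would publish to Kafka and "process" would run on other machines; the function names and the request-number format here are illustrative:

```python
def receive(items):
    # 1. Accept the large request and record it under a request number.
    return {"requestNo": "REQ-1", "items": items}

def distribute(req):
    # 2. Split the large request into one unit of work per item.
    return [(req["requestNo"], item) for item in req["items"]]

def process(unit):
    # 3. Handle a single item (here we pretend every item succeeds).
    request_no, item = unit
    return (request_no, item, True)

def summarize(results):
    # 4. Tally the per-item results into one overall outcome.
    ok = sum(1 for _, _, success in results if success)
    return f"{ok}/{len(results)} succeeded"

print(summarize([process(u) for u in distribute(receive(["a", "b", "c"]))]))
# → 3/3 succeeded
```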
So how do we apply divide and conquer to our batch processing flow?
First, we turn the one large request into many small requests. This splitting happens on the backend; the frontend still issues a single large request.
These small requests are then distributed to multiple machines through some mechanism, for example using Kafka as the dispatcher.
Finally, the completion of each small request is tallied, and once all of them are done, the frontend is notified that the whole request has finished.
The notification can go through the message module. Once the large request has been split into small ones, the backend can return to the frontend immediately, and the final notification is pushed asynchronously after everything completes.
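A minimal sketch of the splitting step, using an in-memory queue to stand in for the Kafka topic. The message shape (a JSON object carrying the request number and one item) is an assumption for illustration, not the author's actual schema:

```python
import json
from queue import Queue

# In-memory stand-in for the Kafka topic; in production this would be
# a Kafka producer publishing to a dedicated batch-items topic.
dispatcher = Queue()

def dispatch_batch(request_no, items):
    """Split one large request into one message per item.

    Each message carries the request number so that consumers can
    report their result back against the right batch."""
    for item in items:
        dispatcher.put(json.dumps({"requestNo": request_no, "item": item}))
    return len(items)

sent = dispatch_batch("REQ-001", ["waybill-1", "waybill-2", "waybill-3"])
print(sent)  # → 3  (three small messages queued for consumers)
```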
All right, let's go straight to my architecture diagram:
On the whole it is fairly involved, so let's walk through each step:
Receive the request: the frontend calls the backend's batch interface.
Record the batch request's metadata: assign a request number and note which user, which operation, how many items, 0 successes, 0 failures, and so on.
Batch-update the status of the affected rows in the database to "being processed by xxx", recording each row's original status. A MySQL batch update is used here, which is very fast.
Send the items to Kafka one by one; Kafka acts as the dispatcher.
Return a response to the frontend indicating that the request has been received and is being processed, so that the list view shows these documents as "being processed by xxx".
Multiple service instances pull messages from Kafka and consume them.
Process each item: check permissions and parameters, run the business logic, and write the result to MySQL.
Record each item's result in Redis, e.g. increment the success count or the failure count by 1.
When all items are detected as processed, i.e. total messages = successful messages + failed messages, send a message to the message service.
The message service pushes a notification to the frontend: "the XXX operation you just performed has completed: X succeeded, X failed".
On receiving this notification, the frontend can refresh the page automatically if the user is still on it, along with other friendly touches.
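The counting and completion-detection steps above can be sketched like this, with a plain dict standing in for the Redis counters. Real code would use atomic Redis operations (such as HINCRBY) so that concurrent consumers don't race; the function and field names here are illustrative:

```python
# Dict standing in for Redis hash counters keyed by request number.
counters = {}
notifications = []  # stand-in for the message service

def notify(request_no, c):
    # In the real system this pushes to the message service, which
    # then notifies the frontend.
    notifications.append(f"{request_no}: {c['success']} succeeded, {c['fail']} failed")

def record_result(request_no, total, ok):
    """Record one item's outcome; fire the notification when total = success + fail."""
    c = counters.setdefault(request_no, {"total": total, "success": 0, "fail": 0})
    c["success" if ok else "fail"] += 1
    done = c["success"] + c["fail"] == c["total"]
    if done:
        notify(request_no, c)
    return done

for ok in [True, True, False]:
    record_result("REQ-001", 3, ok)
print(notifications[0])  # → REQ-001: 2 succeeded, 1 failed
```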
That is the overall flow for processing a batch request. Not bad, right?
In addition, since our system has so many batch interfaces, implementing each one this way would produce a lot of duplicate code.
So we can build a single generic batch interface driven by configuration metadata in the form {action: xx operation, targetStatus: xx processing}; everything except the per-item message handling can then be reused across interfaces.
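One possible shape for that metadata, as a hypothetical Python registry. Operation names like batchPrint and the handler signature are illustrative, not from the original; only the action/targetStatus fields come from the text:

```python
# Each operation registers only its action name, the transient status to
# stamp on rows, and its per-item handler; request logging, status updates,
# dispatch, and counting are shared by the generic endpoint.
BATCH_OPS = {
    "batchOutbound": {
        "action": "outbound",
        "targetStatus": "outbound processing",
        "handler": lambda item: {"item": item, "ok": True},
    },
    "batchPrint": {
        "action": "print",
        "targetStatus": "print processing",
        "handler": lambda item: {"item": item, "ok": True},
    },
}

def handle_batch(op_name, items):
    meta = BATCH_OPS[op_name]
    # Shared steps would go here: record the request, batch-update rows
    # to meta["targetStatus"], publish each item to Kafka, tally results.
    return [meta["handler"](item) for item in items]

results = handle_batch("batchPrint", ["doc-1", "doc-2"])
print(len(results))  # → 2
```

Only the per-item handler differs between interfaces, which is exactly the "middle process of handling messages" the text says cannot be shared.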
All right, now let's review which scenarios this trick applies to.
Application scenarios
Processing a single item takes a long time. If a single item is processed very quickly, this is unnecessary.
The batch is large. If batches are small, this is unnecessary.
The overall time is long, i.e. the two factors above combined. If the overall time is short, this is unnecessary.
The processing cannot be completed with a single batch database update. If it can, this is unnecessary.
Finally, let's take a look at what other improvement measures are available.
Improvement measures
In my opinion, there are two main improvements:
Not every request is actually large. For example, if a request contains fewer than 10 items, it may be faster to process it directly and synchronously.
Not every batch scenario needs this optimization; see the "unnecessary" cases in the scenario analysis above.
At this point, you should have a deeper understanding of how to optimize batch processing interfaces. Why not try it out in practice?