How to use Reactor to do something similar to Flink

2025-07-09 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)06/03 Report--

This article explains how to use Reactor to implement operations similar to those of Flink.

I. Background

Flink has great advantages for stream processing: operators such as windows make aggregation tasks easy. But Flink runs as a separate set of services. To use it, you have to send the data to Kafka, process it with Flink, send the result back to Kafka, and only then do the business processing, which makes the whole flow complicated.

Suppose, for example, that you want window-style batch aggregation similar to Flink's inside business code. Writing it by hand is cumbersome and Flink is too heavy, so the window and buffer operators of reactive programming libraries such as RxJava and Reactor are a convenient fit.

Reactive programming frameworks already provide back pressure and a rich set of operators. Can such a framework handle Flink-like operations? The answer is yes.

II. Implementation process

Flink encapsulates stream processing very well: when using it you rarely need to worry about thread pools, backlog, data loss and similar issues. To achieve similar behavior with Reactor, however, you must understand how Reactor works internally and test it in different scenarios; otherwise problems arise easily.

The core points in the implementation process are listed below:

1. Separating Flux creation from data emission

Introductory Reactor examples supply the data at the moment the Flux is created, for example with Flux.just or Flux.range. Since version 3.4.0 you can create the Flux first and then emit data into it through Sinks. Three variants are easy to confuse:

Sinks.many().multicast(): each subscriber receives only the data emitted after it subscribed; with no subscriber at all, emitted messages are generally not retained (the onBackpressureBuffer flavor holds only a bounded warm-up buffer for the first subscriber)

Sinks.many().unicast(): allows a single subscriber; messages emitted before that subscriber arrives are buffered and delivered when it subscribes

Sinks.many().replay(): retains messages and replays them to every new subscriber, regardless of how many there are (replay().all() keeps everything; other flavors keep a limited history)

This scenario uses Sinks.many().unicast().

Official document: https://projectreactor.io/docs/core/release/reference/#processors
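The buffering behavior of the unicast sink can be illustrated with a minimal sketch. It assumes reactor-core 3.4 or later on the classpath; the class and method names (UnicastSinkDemo, run) are made up for illustration. The values are emitted before anyone subscribes and are still delivered once the subscriber arrives.

```java
import java.util.ArrayList;
import java.util.List;
import reactor.core.publisher.Sinks;

public class UnicastSinkDemo {

    /** Emits three values with no subscriber attached, then subscribes and collects them. */
    public static List<Integer> run() {
        // Create the sink first; no data is attached at creation time.
        Sinks.Many<Integer> sink = Sinks.many().unicast().onBackpressureBuffer();

        // Emit before anyone has subscribed: the unicast sink buffers these.
        sink.tryEmitNext(1);
        sink.tryEmitNext(2);
        sink.tryEmitNext(3);
        sink.tryEmitComplete();

        // The single allowed subscriber receives the buffered values on subscription.
        List<Integer> received = new ArrayList<>();
        sink.asFlux().subscribe(received::add);
        return received;
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints [1, 2, 3]
    }
}
```

A second call to subscribe on the same unicast sink would be rejected, which is the main practical difference from the multicast and replay variants.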

2. Back pressure support

The unicast sink supports two back pressure strategies: onBackpressureBuffer and onBackpressureError. This scenario requires onBackpressureBuffer with an explicitly specified buffer queue, created as follows: Queues.get(queueSize).get()

There are two ways to submit data:

emitNext: takes a failure-handling policy (an EmitFailureHandler) that decides whether a failed emission is retried

tryEmitNext: attempts the emission once and returns immediately with a result indicating success or the reason for failure

In this scenario we do not want to lose data, so we can define a failure policy that retries failed emissions indefinitely. Alternatively, call tryEmitNext and retry in your own code.

Sinks.EmitFailureHandler ALWAYS_RETRY_HANDLER = (signalType, emitResult) -> emitResult.isFailure();

After that, you can call asFlux() on the sink and use the full range of operators.
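The two submission styles can be compared in a short sketch. It assumes reactor-core 3.4 or later; BoundedSinkDemo and its helper methods are hypothetical names. The first helper shows tryEmitNext reporting failure once a bounded queue is full and nobody consumes it; the second shows emitNext with the retry handler while a subscriber drains the sink.

```java
import java.util.concurrent.atomic.AtomicLong;
import reactor.core.publisher.Sinks;
import reactor.util.concurrent.Queues;

public class BoundedSinkDemo {

    // Retry whenever an emission fails. With emitNext this loops on the calling
    // thread until the emission succeeds, so it only makes sense when something
    // is (or will be) draining the sink.
    static final Sinks.EmitFailureHandler ALWAYS_RETRY_HANDLER =
            (signalType, emitResult) -> emitResult.isFailure();

    /** Creates a unicast sink backed by a bounded queue. */
    public static Sinks.Many<Integer> newSink(int queueSize) {
        return Sinks.many().unicast()
                .onBackpressureBuffer(Queues.<Integer>get(queueSize).get());
    }

    /** Counts how many tryEmitNext calls succeed before one is rejected (no subscriber attached). */
    public static int acceptedBeforeFull(int queueSize, int maxAttempts) {
        Sinks.Many<Integer> sink = newSink(queueSize);
        for (int i = 0; i < maxAttempts; i++) {
            if (sink.tryEmitNext(i).isFailure()) {
                return i; // queue is full: the caller decides what to do with the value
            }
        }
        return maxAttempts;
    }

    /** Emits with the retry handler while a subscriber consumes; returns the number received. */
    public static long emitWithRetry(int count) {
        Sinks.Many<Integer> sink = newSink(8);
        AtomicLong received = new AtomicLong();
        sink.asFlux().subscribe(v -> received.incrementAndGet());
        for (int i = 0; i < count; i++) {
            sink.emitNext(i, ALWAYS_RETRY_HANDLER);
        }
        return received.get();
    }

    public static void main(String[] args) {
        System.out.println("accepted before full: " + acceptedBeforeFull(8, 1000));
        System.out.println("received with retry: " + emitWithRetry(100));
    }
}
```

Note that calling emitNext with an always-retry handler on a full sink that never gets a subscriber would spin forever, so the handler should be paired with a consumer that is guaranteed to be attached.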

3. Window function

Reactor supports two types of window aggregation functions:

window family: emits each window as a Flux (the result is a Flux<Flux<T>>)

buffer family: emits each batch as a List (the result is a Flux<List<T>>)

In this scenario, buffer is enough to meet the requirements. bufferTimeout(int maxSize, Duration maxTime) emits a batch as soon as either the maximum number of elements is reached or the maximum waiting time elapses. Flink's keyed operations (keyBy) can be implemented with groupBy and collectMap.
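A minimal sketch of size-or-time batching followed by a keyed aggregation, assuming reactor-core 3.4 or later and Java 9 or later (for Map.entry). BufferDemo, batches, and sumByParity are illustrative names, and grouping by parity merely stands in for a real business key.

```java
import java.time.Duration;
import java.util.List;
import java.util.Map;
import reactor.core.publisher.Flux;
import reactor.core.publisher.Mono;

public class BufferDemo {

    /**
     * Batches 1..10 into lists of at most 4 elements. A batch is also emitted
     * if 1 second passes before it fills up, and on completion of the source.
     */
    public static List<List<Integer>> batches() {
        return Flux.range(1, 10)
                .bufferTimeout(4, Duration.ofSeconds(1))
                .collectList()
                .block();
    }

    /** Keyed aggregation inside one batch: group by parity, then sum each group. */
    public static Mono<Map<String, Integer>> sumByParity(List<Integer> batch) {
        return Flux.fromIterable(batch)
                .groupBy(n -> n % 2 == 0 ? "even" : "odd")
                .flatMap(group -> group
                        .reduce(0, Integer::sum)
                        .map(sum -> Map.entry(group.key(), sum)))
                .collectMap(Map.Entry::getKey, Map.Entry::getValue);
    }

    public static void main(String[] args) {
        for (List<Integer> batch : batches()) {
            System.out.println(batch + " -> " + sumByParity(batch).block());
        }
    }
}
```

Because the source here emits synchronously, the batches are cut by size (4, 4, then the remaining 2 on completion); with a slow source the one-second timeout would cut them instead.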

4. Consumer processing

After buffer, Reactor delivers the batches downstream one at a time. Even if publishOn or subscribeOn is used, the buffer operator emits the next batch only after the downstream subscriber has finished processing the current one and requested more data. If the subscriber takes a long time, the data flow stalls at the buffer step, which is obviously not what we want.

Ideally, consumers run in a thread pool so that batches are processed in parallel by multiple threads, and the buffer operator is blocked only when the thread pool is full. The solution is a custom thread pool whose task submission blocks when the pool is full, which can be implemented with a custom RejectedExecutionHandler:

RejectedExecutionHandler executionHandler = (r, executor) -> {
    try {
        executor.getQueue().put(r);
    } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        throw new RejectedExecutionException("Producer thread interrupted", e);
    }
};

new ThreadPoolExecutor(poolSize, poolSize, 0L, TimeUnit.MILLISECONDS, new SynchronousQueue<>(), executionHandler);

III. Summary

1. Summary of the overall implementation process

Submitting data: data can be submitted with emitNext (retrying until accepted) or with tryEmitNext (returning immediately with a result), and submission from multiple threads is supported. Normally a submission returns very quickly; emitNext with the retry policy blocks the caller while the queue is full.

Processing: Reactor's rich set of operators processes the streaming data.

Multithreaded consumption of buffer output: each batch produced by the buffer operator is submitted synchronously to a separate consumer thread pool, and the submission blocks when the pool is full.

Consumer thread pool: blocking submission ensures that messages are not lost, and the pool's own queue length is 0 (a SynchronousQueue) because a buffer queue already exists upstream.

Back pressure: once the consumer thread pool blocks, back pressure propagates to the buffer operator, then to the buffer queue, and, when that queue is full, to the data submitter.
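The steps above can be wired together roughly as follows. This is a sketch under assumptions rather than a reference implementation: it assumes reactor-core 3.4 or later, uses a single producer thread, and picks arbitrary sizes (queue of 256, batches of 10, 4 workers, 5 ms of simulated work). PipelineDemo, blockingPool, and run are hypothetical names. In this single-producer setup the subscriber callback is expected to run on the emitting thread, so a blocked pool.execute() slows the producer down directly.

```java
import java.time.Duration;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.RejectedExecutionHandler;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import reactor.core.publisher.Sinks;
import reactor.util.concurrent.Queues;

public class PipelineDemo {

    static final Sinks.EmitFailureHandler ALWAYS_RETRY_HANDLER =
            (signalType, emitResult) -> emitResult.isFailure();

    /** Pool with no internal queue; execute() blocks the caller while all workers are busy. */
    static ThreadPoolExecutor blockingPool(int poolSize) {
        RejectedExecutionHandler handler = (r, executor) -> {
            try {
                executor.getQueue().put(r); // wait until a worker takes the task
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new RejectedExecutionException("Producer thread interrupted", e);
            }
        };
        return new ThreadPoolExecutor(poolSize, poolSize, 0L, TimeUnit.MILLISECONDS,
                new SynchronousQueue<>(), handler);
    }

    /** Pushes `total` integers through sink -> bufferTimeout -> pool; returns how many were processed. */
    public static int run(int total) {
        ThreadPoolExecutor pool = blockingPool(4);
        AtomicInteger processed = new AtomicInteger();
        CountDownLatch completed = new CountDownLatch(1);

        Sinks.Many<Integer> sink = Sinks.many().unicast()
                .onBackpressureBuffer(Queues.<Integer>get(256).get());

        sink.asFlux()
                .bufferTimeout(10, Duration.ofMillis(200))
                .subscribe(
                        batch -> pool.execute(() -> {   // blocks here when all workers are busy
                            sleepQuietly(5);            // simulated slow business processing
                            processed.addAndGet(batch.size());
                        }),
                        error -> completed.countDown(),
                        completed::countDown);

        for (int i = 0; i < total; i++) {
            sink.emitNext(i, ALWAYS_RETRY_HANDLER);
        }
        sink.emitComplete(ALWAYS_RETRY_HANDLER);

        try {
            completed.await();                          // all batches handed to the pool
            pool.shutdown();
            pool.awaitTermination(30, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return processed.get();
    }

    static void sleepQuietly(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        System.out.println("processed: " + run(95));
    }
}
```

With several producer threads, or with a publishOn step between the sink and the consumer, the threading changes and the bounded queue plays a larger role, so those variants would need their own testing.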

2. Comparison with Flink

Flink features that this approach reproduces:

A set of operators as rich as Flink's

Back pressure support, with no data loss

Advantages: lightweight, can be used directly in business code

Disadvantages:

The internal execution model is complicated and full of pitfalls, so it is not as foolproof as Flink.

There is no watermark mechanism, so windows are based on processing time only and out-of-order events cannot be assigned back to the correct event-time window.

There is no savepoint mechanism. Back pressure solves part of the problem, but if the process goes down, the data still held in the buffer queue and in the consumer thread pool is lost. One remedy is to add a JVM shutdown hook.

Only a single machine is supported, so the buffer queue cannot be unbounded and the thread pool size has to be chosen with care. There is also no equivalent of Flink's GlobalWindow and similar features.

The impact on the upstream data source must be considered. The upstream of Flink is usually an MQ, which absorbs the backlog automatically when the data volume is large. If the upstream of the solution in this article is an HTTP or RPC call, the effect of blocking the caller cannot be ignored. A compensating scheme is to submit data with the non-blocking tryEmitNext every time and, if the submission fails, to send the data to an MQ as a buffer and consume it from the MQ with unlimited retries.
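Two of the mitigations mentioned in this list, the shutdown hook and the MQ fallback, can be sketched as follows. Everything here is illustrative: MitigationsDemo and its methods are hypothetical, the MQ is represented by a plain Consumer, and reactor-core 3.4 or later is assumed. A shutdown hook only helps on an orderly JVM exit; it does nothing for a forced kill or a power failure.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;
import reactor.core.publisher.Sinks;
import reactor.util.concurrent.Queues;

public class MitigationsDemo {

    /** Non-blocking submit: try once, and hand the value to a fallback (e.g. an MQ producer) on failure. */
    public static <T> boolean submitOrFallback(Sinks.Many<T> sink, T value, Consumer<T> fallback) {
        if (sink.tryEmitNext(value).isFailure()) {
            fallback.accept(value);
            return false;
        }
        return true;
    }

    /** Stops intake and waits for in-flight batches; returns true if the pool finished in time. */
    public static boolean drain(Sinks.Many<?> sink, ExecutorService pool, long timeoutSeconds) {
        // Best effort: this can fail if another thread is emitting at the same instant.
        sink.tryEmitComplete();
        pool.shutdown();
        try {
            return pool.awaitTermination(timeoutSeconds, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    /** Registers drain() as a JVM shutdown hook. */
    public static void registerShutdownHook(Sinks.Many<?> sink, ExecutorService pool, long timeoutSeconds) {
        Runtime.getRuntime().addShutdownHook(new Thread(() -> drain(sink, pool, timeoutSeconds)));
    }

    /** Fills a small sink that nobody consumes and reports {accepted, spilled to fallback}. */
    public static int[] demo() {
        Sinks.Many<Integer> sink = Sinks.many().unicast()
                .onBackpressureBuffer(Queues.<Integer>get(8).get());
        List<Integer> spilled = new ArrayList<>(); // stands in for the MQ
        int accepted = 0;
        for (int i = 0; i < 20; i++) {
            if (submitOrFallback(sink, i, spilled::add)) {
                accepted++;
            }
        }
        return new int[] {accepted, spilled.size()};
    }

    public static void main(String[] args) {
        int[] result = demo();
        System.out.println("accepted=" + result[0] + ", spilled=" + result[1]);
    }
}
```

In a real service the fallback would publish to the MQ, and a separate consumer would feed those messages back into the sink with retries, as the last item above describes.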

