What is the process of Kafka processing requests?

Updated 2025-02-25. From: SLTechnology News&Howtos (Shulou.com), Development. Posted 06/03.

This article looks at how Kafka processes requests: the Reactor pattern, the Broker's network communication model, the source-level structure behind it, and how control requests get priority over data requests.

Reactor pattern

Before we talk about Kafka, let's talk about the Reactor pattern. Almost all high-performance network communication at the lower level relies on it; Netty and Redis, for example, both use the Reactor pattern.

Anyone who has learned network programming will recognize the classic approach: when a new request arrives, it is either handled directly in the current thread or handed to a newly created thread.

In the early days this style of programming was fine. As the Internet grew rapidly, though, single-threaded processing could no longer keep up and could not make full use of the machine's resources.

Handling each request in a new thread, on the other hand, demands too many resources, and creating a thread is itself a heavy operation.
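The two approaches described above can be sketched as a small blocking echo server. This is only an illustration under assumptions: the class name `BlockingEchoServer`, the `echo: ` reply format, and the `roundTrip` helper are hypothetical and do not come from Kafka or from the original article.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

class BlockingEchoServer {
    // Blocking handler: the calling thread is tied up until the client disconnects.
    static void handle(Socket socket) {
        try (Socket s = socket;
             BufferedReader in = new BufferedReader(new InputStreamReader(s.getInputStream()));
             PrintWriter out = new PrintWriter(s.getOutputStream(), true)) {
            String line;
            while ((line = in.readLine()) != null) {
                out.println("echo: " + line);
            }
        } catch (IOException e) {
            // Connection dropped; nothing else to clean up in this sketch.
        }
    }

    // Accept loop: handle inline (single thread) or spawn one thread per connection.
    static void serve(ServerSocket server, boolean threadPerConnection) {
        while (true) {
            try {
                Socket socket = server.accept();
                if (threadPerConnection) {
                    new Thread(() -> handle(socket)).start();
                } else {
                    handle(socket); // no other client is served until this returns
                }
            } catch (IOException e) {
                return; // server socket was closed
            }
        }
    }

    // Hypothetical helper: starts a server on an ephemeral loopback port and does one round trip.
    static String roundTrip(String message, boolean threadPerConnection) {
        try (ServerSocket server = new ServerSocket(0, 50, InetAddress.getLoopbackAddress())) {
            Thread acceptor = new Thread(() -> serve(server, threadPerConnection));
            acceptor.setDaemon(true);
            acceptor.start();
            try (Socket client = new Socket(server.getInetAddress(), server.getLocalPort());
                 PrintWriter out = new PrintWriter(client.getOutputStream(), true);
                 BufferedReader in = new BufferedReader(new InputStreamReader(client.getInputStream()))) {
                out.println(message);
                return in.readLine();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("hello", true));
    }
}
```

In the inline variant a slow client stalls everyone behind it; in the thread-per-connection variant every client costs a thread, which is the resource problem the text describes.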

At this point someone will say: just use a thread pool and be done with it, so why do we need Reactor?

Pooling does ease the resource problem, but the pool is finite, and a thread in the pool still has to sit on one connection waiting for instructions. Today's Internet has long since gone past the C10K scale.

This is why IO multiplexing was introduced: one thread monitors a whole set of connections, waits synchronously for one or more IO events to arrive, and then dispatches each event to the corresponding Handler. This is the Reactor pattern.
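A minimal single-threaded reactor built on `java.nio` might look like the sketch below. It is not Kafka's implementation: `MiniReactor` and its `echo` helper are hypothetical names, and for brevity the read handler writes the reply inline instead of registering for write readiness.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.nio.charset.StandardCharsets;
import java.util.Iterator;

class MiniReactor implements Runnable {
    private final Selector selector;
    private final ServerSocketChannel server;

    MiniReactor() throws IOException {
        selector = Selector.open();
        server = ServerSocketChannel.open();
        server.bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), 0));
        server.configureBlocking(false);
        server.register(selector, SelectionKey.OP_ACCEPT);
    }

    int port() { return server.socket().getLocalPort(); }

    @Override public void run() {
        try {
            while (selector.isOpen()) {
                selector.select(); // one thread waits on every registered connection
                Iterator<SelectionKey> ready = selector.selectedKeys().iterator();
                while (ready.hasNext()) {
                    SelectionKey key = ready.next();
                    ready.remove();
                    if (!key.isValid()) continue;
                    if (key.isAcceptable()) onAccept();       // dispatch to the accept handler
                    else if (key.isReadable()) onRead(key);   // dispatch to the read handler
                }
            }
        } catch (IOException | IllegalStateException e) {
            // Selector or key was closed while shutting down; leave the loop.
        }
    }

    private void onAccept() throws IOException {
        SocketChannel channel = server.accept();
        if (channel == null) return;
        channel.configureBlocking(false);
        channel.register(selector, SelectionKey.OP_READ);
    }

    private void onRead(SelectionKey key) throws IOException {
        SocketChannel channel = (SocketChannel) key.channel();
        ByteBuffer buffer = ByteBuffer.allocate(1024);
        if (channel.read(buffer) < 0) { // client closed the connection
            key.cancel();
            channel.close();
            return;
        }
        buffer.flip();
        while (buffer.hasRemaining()) channel.write(buffer); // sketch only: small replies
    }

    // Hypothetical helper: runs the reactor and echoes one message through it.
    static String echo(String message) {
        try {
            MiniReactor reactor = new MiniReactor();
            Thread loop = new Thread(reactor, "reactor");
            loop.setDaemon(true);
            loop.start();
            byte[] bytes = message.getBytes(StandardCharsets.UTF_8);
            try (SocketChannel client = SocketChannel.open(
                    new InetSocketAddress(InetAddress.getLoopbackAddress(), reactor.port()))) {
                client.write(ByteBuffer.wrap(bytes));
                ByteBuffer reply = ByteBuffer.allocate(bytes.length);
                while (reply.hasRemaining() && client.read(reply) >= 0) { /* keep reading */ }
                return new String(reply.array(), 0, reply.position(), StandardCharsets.UTF_8);
            } finally {
                reactor.selector.close();
                reactor.server.close();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```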

The network communication model evolved as follows: single thread => multithreading => thread pool => Reactor model

The Reactor model Kafka uses, that is, the Kafka Broker network communication model, works as follows.

Put simply, an Acceptor (mainReactor) in the Broker listens for new connections. Once a connection is established, the Acceptor picks a Processor (subReactor) by round-robin to manage it.

Each Processor listens on the connections it manages. When an event arrives, it reads the data, wraps it in a Request, and puts the Request into the shared request queue.

The IO thread pool continuously takes requests off that queue and does the real processing. When processing finishes, the Response is placed in the response queue of the corresponding Processor, which then returns it to the client.
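The flow just described can be simulated in memory without any sockets. The sketch below is an assumption-laden model, not Kafka code: `BrokerPipelineSketch`, `simulate`, the `req-N` payloads, and the queue capacity of 500 are all illustrative choices.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.TreeSet;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;

class BrokerPipelineSketch {
    static final class Request {
        final int processorId;
        final String payload;
        Request(int processorId, String payload) {
            this.processorId = processorId;
            this.payload = payload;
        }
    }

    // Returns, per Processor id, the responses that were routed back to it.
    static Map<Integer, TreeSet<String>> simulate(int processors, int ioThreads, int connections) {
        BlockingQueue<Request> requestQueue = new ArrayBlockingQueue<>(500); // shared by all Processors
        List<BlockingQueue<String>> responseQueues = new ArrayList<>();      // one per Processor
        for (int i = 0; i < processors; i++) responseQueues.add(new LinkedBlockingQueue<>());
        CountDownLatch done = new CountDownLatch(connections);

        ExecutorService ioPool = Executors.newFixedThreadPool(ioThreads);
        for (int i = 0; i < ioThreads; i++) {
            ioPool.execute(() -> {
                try {
                    while (true) {
                        Request request = requestQueue.take();            // IO thread pulls a request
                        String response = "handled:" + request.payload;   // "real" processing
                        responseQueues.get(request.processorId).put(response);
                        done.countDown();
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // pool is shutting down
                }
            });
        }

        try {
            for (int conn = 0; conn < connections; conn++) {
                int processorId = conn % processors;                             // Acceptor: round-robin
                requestQueue.put(new Request(processorId, "req-" + conn));       // Processor: enqueue
            }
            done.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            ioPool.shutdownNow();
        }

        Map<Integer, TreeSet<String>> delivered = new TreeMap<>();
        for (int i = 0; i < processors; i++) {
            delivered.put(i, new TreeSet<>(responseQueues.get(i)));
        }
        return delivered;
    }

    public static void main(String[] args) {
        System.out.println(simulate(3, 8, 4));
    }
}
```

With three Processors and four connections, the fourth connection wraps around to Processor 0, so that Processor ends up with two responses and the others with one each.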

There is only one Acceptor thread per listener. It only hands off new connections and carries very little logic, so it is lightweight and one is enough.

The Processor is called a network thread in Kafka. The network thread pool has three threads by default, controlled by the parameter num.network.threads, and the number can be increased or decreased dynamically to fit the workload.

The IO thread pool, KafkaRequestHandlerPool, does the real processing. Its size is controlled by the parameter num.io.threads, with a default of 8. After an IO thread finishes processing, it puts the Response into the corresponding Processor, and the Processor returns the response to the client.

The relationship between network threads and IO threads is the classic producer-consumer model, both for the shared request queue that carries Requests and for the Responses handed back after IO processing.

What are the benefits? Producers and consumers are decoupled, so either side can be changed or scaled independently. The processing capacity of the two sides can also be balanced: if consumption falls behind, for example, just add more IO threads.

If you read the source code of other middleware, you will find the producer-consumer model everywhere, which is why "write a producer-consumer by hand" shows up so often in interviews.
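For reference, a handwritten producer-consumer of the kind interviews ask for can be sketched with `wait`/`notifyAll`. The `BoundedBuffer` class and its `sumThrough` demo are hypothetical; Kafka itself is not claimed to use this exact code.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class BoundedBuffer<T> {
    private final Deque<T> items = new ArrayDeque<>();
    private final int capacity;

    BoundedBuffer(int capacity) { this.capacity = capacity; }

    // Producer side: blocks while the buffer is full.
    synchronized void put(T item) throws InterruptedException {
        while (items.size() == capacity) wait();
        items.addLast(item);
        notifyAll(); // wake consumers waiting for data
    }

    // Consumer side: blocks while the buffer is empty.
    synchronized T take() throws InterruptedException {
        while (items.isEmpty()) wait();
        T item = items.removeFirst();
        notifyAll(); // wake producers waiting for space
        return item;
    }

    // Demo: one producer pushes 1..count, the caller consumes and sums them.
    static long sumThrough(int count, int capacity) {
        BoundedBuffer<Integer> buffer = new BoundedBuffer<>(capacity);
        Thread producer = new Thread(() -> {
            try {
                for (int i = 1; i <= count; i++) buffer.put(i);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        producer.start();
        long sum = 0;
        try {
            for (int i = 0; i < count; i++) sum += buffer.take();
            producer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return sum;
    }
}
```

The small capacity is what balances the two sides: a fast producer is forced to wait for the consumer instead of piling up unbounded work.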

Source-level analysis of the network communication model

The Kafka network communication component is mainly composed of two parts: SocketServer and KafkaRequestHandlerPool.

SocketServer

You can see that SocketServer manages objects such as Acceptor threads, Processor threads and RequestChannel.

The data-plane and control-plane are analyzed later. First, let's see what RequestChannel is.

RequestChannel

From its key properties and methods you can see that this object mainly manages the Processors and acts as the transit point for Requests and Responses.
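As a rough picture of that role, here is a sketch of a channel object that holds one shared request queue and a response queue per Processor. It is an assumption, not the Kafka class: `RequestChannelSketch`, its method names, and the non-blocking `offer`/`poll` calls are simplifications chosen for illustration.

```java
import java.util.Map;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.LinkedBlockingQueue;

class RequestChannelSketch {
    static final class Request {
        final int processorId; // which Processor must receive the response
        final String body;
        Request(int processorId, String body) {
            this.processorId = processorId;
            this.body = body;
        }
    }

    private final BlockingQueue<Request> requestQueue; // shared, bounded
    private final Map<Integer, BlockingQueue<String>> responseQueues = new ConcurrentHashMap<>();

    RequestChannelSketch(int queueSize) {
        requestQueue = new ArrayBlockingQueue<>(queueSize);
    }

    // Processor management.
    void addProcessor(int processorId) { responseQueues.put(processorId, new LinkedBlockingQueue<>()); }
    void removeProcessor(int processorId) { responseQueues.remove(processorId); }

    // Called by a Processor; returns false when the queue is full (sketch: no blocking).
    boolean sendRequest(Request request) { return requestQueue.offer(request); }

    // Called by an IO thread; returns null when nothing is queued.
    Request receiveRequest() { return requestQueue.poll(); }

    // Called by an IO thread; routes the response back to the owning Processor.
    boolean sendResponse(int processorId, String response) {
        BlockingQueue<String> queue = responseQueues.get(processorId);
        return queue != null && queue.offer(response);
    }

    // Called by a Processor to pick up the next response for its connections.
    String nextResponse(int processorId) {
        BlockingQueue<String> queue = responseQueues.get(processorId);
        return queue == null ? null : queue.poll();
    }
}
```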

Acceptor

Let's take a look at Acceptor next.

It extends AbstractServerThread. Next, let's see what its run method does.

Then let's see what accept(key) does.

It is quite simple: standard selector handling. It takes the ready event, calls serverSocketChannel.accept() to get the socketChannel, and hands the socketChannel to a Processor chosen by round-robin, which handles the IO events from then on.

Processor

Next, let's look at Processor, which is more complex than Acceptor.

Let's take a look at three key members.

Let's take a look at the main processing logic.

In short, the Processor wraps the data from low-level read events into Requests and puts them in the queue, then returns to the client the Responses that the IO threads hand back, and runs the Response callback logic.

KafkaRequestHandlerPool

This is the IO thread pool, whose threads actually process the requests.

Let's take a look at what IO threads have done.

It is quite simple: the core is to keep taking requests from the requestChannel and calling handle to process each one.

The handle method lives in the KafkaApis class. You can think of it as a switch that calls a different handler depending on the apiKey in the request header.
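The switch-style dispatch can be sketched as follows. This is a simplified model rather than the KafkaApis source: the `ApiDispatchSketch` class, the four-entry `ApiKey` enum, and every handler name other than handleListOffsetRequest are assumptions made for the example, and the handlers just return a label.

```java
class ApiDispatchSketch {
    // Illustrative subset of API keys; the real protocol defines many more.
    enum ApiKey { PRODUCE, FETCH, LIST_OFFSETS, METADATA }

    // Picks a handler based on the api key carried in the request header.
    static String handle(ApiKey apiKey) {
        switch (apiKey) {
            case PRODUCE:
                return handleProduceRequest();
            case FETCH:
                return handleFetchRequest();
            case LIST_OFFSETS:
                return handleListOffsetRequest();
            case METADATA:
                return handleMetadataRequest();
            default:
                throw new IllegalArgumentException("Unknown api key: " + apiKey);
        }
    }

    // Placeholder handlers: each returns a label instead of doing real work.
    static String handleProduceRequest() { return "produce handled"; }
    static String handleFetchRequest() { return "fetch handled"; }
    static String handleListOffsetRequest() { return "list offsets handled"; }
    static String handleMetadataRequest() { return "metadata handled"; }

    public static void main(String[] args) {
        System.out.println(handle(ApiKey.LIST_OFFSETS));
    }
}
```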

To close the loop on a single request, let's follow a relatively simple one, LIST_OFFSETS, which is handled by handleListOffsetRequest.

The call chain (marked with red arrows in the original figure) shows that once the request has been processed, the response is handed to the corresponding Processor.

Finally, there is a more detailed overview diagram, which basically adds all the classes analyzed by the source code.

Request processing priority

It is time to explain the data-plane and control-plane mentioned above. They correspond to data requests and control requests respectively.

Why split requests into two types? Couldn't a key in the request simply indicate whether it reads or writes data or updates metadata?

Take a simple example. Suppose we want to delete a topic. We want it deleted immediately, but a producer keeps writing data to it, so our delete request may sit Nth in the queue and has to wait for all the earlier write requests to be processed first. Those earlier writes to the topic are in fact useless and only waste resources.

Or consider a Preferred Leader election while the producer has set acks to all. The old leader is still waiting for the follower to finish writing the data and report back, unaware that the follower has already become the new leader. The request telling it that the leader has changed is stuck behind a pile of data requests, so the old leader waits pointlessly until it times out.

It is to address situations like these that the community split requests into two categories.

So how are control requests given priority? With a priority queue?

No. The community uses two sets of listeners: one for data requests and one for control requests.

In terms of the network communication model described above, Kafka has two sets of it! In effect, Kafka achieves request priority indirectly through two sets of listeners. There are bound to be many data requests and few control requests, so a control request ends up being handled ahead of most data requests.

A roundabout tactic.

The control plane differs from the data plane in that it has only one Processor thread and its request queue length is hard-coded to 20.
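The effect of the separate control plane can be shown with a small tick-based model. Everything here is a simplification under assumptions: `TwoPlaneSketch` is a hypothetical class, each plane is modelled as one handler that processes one request per tick, and only the control queue length of 20 is taken from the text.

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.concurrent.ArrayBlockingQueue;

class TwoPlaneSketch {
    // Counts how many ticks pass before a control request that arrives behind
    // `pendingData` data requests is handled.
    static int ticksUntilControlHandled(int pendingData, boolean separateControlPlane) {
        Queue<String> dataQueue = new ArrayDeque<>();
        Queue<String> controlQueue = new ArrayBlockingQueue<>(20); // length taken from the text

        for (int i = 0; i < pendingData; i++) dataQueue.add("PRODUCE-" + i);
        if (separateControlPlane) {
            controlQueue.add("CONTROL"); // its own listener, its own queue
        } else {
            dataQueue.add("CONTROL");    // shares the queue, so it waits its turn
        }

        int ticks = 0;
        while (true) {
            ticks++;
            // Each plane's handler processes at most one request per tick.
            String fromData = dataQueue.poll();
            String fromControl = controlQueue.poll();
            if ("CONTROL".equals(fromData) || "CONTROL".equals(fromControl)) {
                return ticks;
            }
        }
    }

    public static void main(String[] args) {
        System.out.println("shared queue:   " + ticksUntilControlHandled(1000, false));
        System.out.println("separate plane: " + ticksUntilControlHandled(1000, true));
    }
}
```

No priority queue is involved: the control request is fast simply because its queue is almost always empty.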

At this point you should have a deeper understanding of how Kafka processes requests. The best way to consolidate it is to read the source and try it in practice.
