2025-01-17 Update | SLTechnology News & Howtos > Development
Shulou (Shulou.com), 06/03 report
This article explains "how to understand the performance of Kafka". Many people run into these questions in real-world work, so let's walk through how to handle them. I hope you read carefully and come away with something useful.
Brother Ma: putting Redis and Kafka side by side sounds like "Guan Gong fighting Qin Qiong", a proverbial duel between two generals from different eras, in other words an apples-to-oranges comparison.

65 Brother: Redis and Kafka are completely different middleware. Is there even a comparison?

Brother Ma: yes, which is why this article is not about "distributed cache selection" or "distributed middleware comparison". We focus on how projects in these two different areas optimize performance, looking both at the optimization techniques that excellent projects share and at the techniques specific to each scenario.
Many people have learned a great deal about frameworks and tools, yet when they hit a practical problem they feel their knowledge falls short. That usually means the knowledge was never systematized: no reusable methodology was abstracted out of the concrete implementations.

One of the most valuable things about studying open-source projects is summarizing the methodology behind each project's best implementations, and then applying it in your own practice.
Let's start from sending a message.
Brother Ma: rationality, objectivity, and prudence are programmers' traits and strengths, but sometimes we also need a bit of passion and impulse, which helps us make decisions faster. "The pessimist may be right, but the optimist succeeds." I hope everyone is an optimistic problem solver.
Kafka performance panorama
From a highly abstract point of view, performance problems cannot escape the following three things:

Network

Disk

Complexity
For a distributed queue like Kafka that lives on the network, the network and the disk are the top optimization priorities. For the abstract problems listed above, the solutions are just as abstract and simple:

Concurrency

Compression

Batching

Caching

Algorithms
Now that we know the problems and the ideas, let's look at the roles in Kafka; these roles are exactly the points that can be optimized:
Producer
Broker
Consumer
Yes: the problems, the ideas, and the optimization points have all been listed. We can refine each of the three directions in as much detail as we like, so that every implementation becomes clear at a glance; even without reading Kafka's implementation, we could guess one or two points worth optimizing.

This is the way of thinking: ask a question > list the problem areas > list the optimization methods > list the concrete entry points > trade off and refine the implementation.

Now you can try to think through the optimization points and methods yourself. It doesn't have to be perfect, and don't worry about whether it's any good; just think a little.
65 Brother: no, I'm dumb and lazy. You'd better just tell me directly; freeloading is what I'm good at.
Sequential writes
65 Brother: Redis runs on pure memory, while your Kafka reads and writes disks. How can they even be compared?
Brother Ma: why is writing to disk slow?

We can't just know the conclusion without knowing the reason. To answer that, we have to go back to the operating systems course from school. Do you still have the textbook? Come on, turn to the chapter on disks and let's review how a disk works.
65 Brother: long gone. The book disappeared before the course was half over. If it weren't for the exam, I wouldn't even have graduated.
Brother Ma: then look at the classic diagram:
Completing one disk IO takes three steps: seek, rotation, and data transfer.

The factors that affect disk IO performance occur in exactly these three steps, so the time spent breaks down as follows:
Seek time: Tseek is the time it takes to move the read/write head to the correct track. The shorter the seek time, the faster the IO operation. Current disks have average seek times of roughly 3-15 ms.
Rotational delay: Trotation is the time the platter takes to rotate the sector holding the requested data under the read/write head. It depends on the spindle speed and is usually taken as half the time of one full revolution. For example, a 7,200 rpm disk has an average rotational delay of about (60 × 1000 / 7200) / 2 ≈ 4.17 ms, while a 15,000 rpm disk averages about 2 ms.
Data transfer time: Ttransfer is the time needed to actually move the requested data, equal to the data size divided by the transfer rate. Current interfaces are fast: IDE/ATA reaches about 133 MB/s and SATA II about 300 MB/s. The transfer time is usually far smaller than the previous two components and can be ignored in rough calculations.
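The rotational-delay arithmetic above is easy to sanity-check in a few lines (a quick sketch; the class and method names are invented for illustration):

```java
public class DiskLatency {
    // Average rotational delay = half the time of one full revolution, in ms.
    static double avgRotationalDelayMs(int rpm) {
        return (60.0 * 1000.0 / rpm) / 2.0;
    }

    public static void main(String[] args) {
        System.out.printf("7200 rpm:  %.2f ms%n", avgRotationalDelayMs(7200));  // ~4.17 ms
        System.out.printf("15000 rpm: %.2f ms%n", avgRotationalDelayMs(15000)); // 2.00 ms
    }
}
```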
Therefore, if seek and rotation can be omitted when writing to disk, read/write performance improves dramatically.

Kafka uses sequential file writes to improve disk write performance. Writing files sequentially essentially eliminates most seeks and rotations: the head no longer dances across tracks but cruises in one direction at full speed.
Each partition in Kafka is an ordered, immutable sequence of messages, and new messages are continually appended to the end of the partition. A Partition is only a logical concept: Kafka splits each partition into multiple Segments, each Segment backed by a physical file, and Kafka appends to the segment file, which is a sequential write.
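The append-only idea can be sketched in a few lines (a minimal illustration, not Kafka's actual code; the class, method, and file names are invented). Opening a FileChannel in APPEND mode means every write lands at the current end of the segment file, so the disk sees strictly sequential writes:

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class AppendOnlySegment {
    // Append one record to the end of the active segment file; sequential
    // writes keep the disk head from seeking between records.
    static long append(Path segment, byte[] record) throws IOException {
        try (FileChannel ch = FileChannel.open(segment,
                StandardOpenOption.CREATE, StandardOpenOption.APPEND)) {
            ch.write(ByteBuffer.wrap(record));
            return ch.size(); // new end position of the segment
        }
    }

    public static void main(String[] args) throws IOException {
        Path seg = Files.createTempFile("00000000000000000000", ".log");
        append(seg, "msg-1\n".getBytes(StandardCharsets.UTF_8));
        long end = append(seg, "msg-2\n".getBytes(StandardCharsets.UTF_8));
        System.out.println("segment size: " + end); // 12 bytes
        Files.delete(seg);
    }
}
```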
65 Brother: why can Kafka use append-only writes?

Brother Ma: it comes down to Kafka's nature. Compare Kafka with Redis: bluntly, Kafka is a Queue and Redis is a HashMap. What's the difference between a Queue and a Map?

A Queue is FIFO and its data is ordered; a HashMap's data is unordered and read and written randomly. Kafka's immutability and ordering are exactly what make append-only writes possible.

In fact, many data systems with these characteristics can use append-only writes to optimize disk performance: typical examples are Redis's AOF file and the WAL (write-ahead log) mechanism of many databases.

So if you clearly understand the characteristics of your own workload, you can optimize for them specifically.
Zero copy
65 Brother: I was asked about this in an interview. Unfortunately my answer was mediocre, alas.
Brother Ma: what is zero copy?
Consider the Kafka scenario: when a Kafka Consumer consumes data stored on the Broker's disk, the data travels from the Broker's disk, across the network, to the Consumer. What system interactions does that involve? The Consumer fetches from the Broker, the Broker reads its log, and sendfile is used. If the traditional IO model were used instead, the pseudo-code logic would be:
readFile(buffer)
send(buffer)
As the figure shows, with the traditional IO flow, first disk IO to read the file and then network IO to send it, the data actually has to be copied four times:
First copy: read the disk file into the operating system's kernel buffer.

Second copy: copy the data from the kernel buffer to the application's buffer.

Third copy: copy the data from the application buffer to the socket send buffer (kernel space).

Fourth copy: copy the data from the socket buffer to the network card, which transmits it.
65 Brother: huh, is the operating system that dumb? Copy after copy after copy.
Brother Ma: it's not that the operating system is dumb. Operating systems are designed so that each application has its own user-space memory, isolated from kernel memory, for the safety of programs and the system; otherwise any application's memory could be read and written all over the place.
However, there is also zero-copy technology (Zero-Copy). Zero copy reduces the number of data copies above as far as possible, cutting the CPU cost of copying and the number of user/kernel context switches, thereby optimizing data-transfer performance.
There are three common zero-copy approaches:

Direct I/O: data is transferred directly between user address space and the I/O device, bypassing the kernel, which only performs necessary auxiliary work such as virtual memory configuration.

Avoid copying between kernel and user space: when the application does not need to access the data, the copy from kernel space to user space can be skipped.

Copy-on-write: data is not copied in advance; only the parts that are actually modified get copied, at modification time.
Kafka uses mmap and sendfile to achieve zero copy, corresponding to Java's MappedByteBuffer and FileChannel.transferTo respectively.
Use Java NIO to achieve zero copy, as follows:
fileChannel.transferTo(position, count, targetChannel)
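A hedged sketch of such a transfer with Java NIO follows. In Kafka the target is the consumer's socket channel; here the target is another file channel so the demo is self-contained, and all names other than the NIO API itself are invented:

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class ZeroCopyDemo {
    // Ship the whole source file into another WritableByteChannel without
    // pulling the bytes through user space. transferTo() may move fewer bytes
    // than requested, so loop until everything has been transferred.
    static long transfer(Path src, Path dst) throws IOException {
        try (FileChannel in = FileChannel.open(src, StandardOpenOption.READ);
             FileChannel out = FileChannel.open(dst,
                     StandardOpenOption.CREATE, StandardOpenOption.WRITE)) {
            long transferred = 0, size = in.size();
            while (transferred < size) {
                transferred += in.transferTo(transferred, size - transferred, out);
            }
            return transferred;
        }
    }

    public static void main(String[] args) throws IOException {
        Path src = Files.createTempFile("log", ".data");
        Path dst = Files.createTempFile("copy", ".data");
        Files.write(src, "hello zero copy".getBytes(StandardCharsets.UTF_8));
        System.out.println(transfer(src, dst)); // 15
        Files.delete(src);
        Files.delete(dst);
    }
}
```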
Under this model, the separate read and write system calls collapse into a single transferTo() call. Specifically, the transferTo() method has the block device read the data into a kernel read buffer through the DMA engine. That buffer is then copied into another kernel buffer staged for the socket. Finally, the socket buffer is copied to the NIC buffer by DMA.
We reduced the number of copies from four to three, and only one of those copies involves the CPU. We also reduced the number of context switches from four to two. This is a big improvement, but it is not yet zero copy. The remaining copy can be eliminated as a further optimization when running Linux kernel 2.4 or later with a network interface card that supports gather operations, as shown below.
As in the previous example, calling transferTo() has the device read data into a kernel read buffer through the DMA engine. With the gather operation, however, there is no copy between the read buffer and the socket buffer. Instead, the NIC is given a descriptor pointing at the read buffer, along with the offset and length, and DMA pulls the data straight from there; the CPU never participates in copying buffers at all.
For more details, see dedicated articles on zero copy and its applications.
PageCache
When messages produced by the producer arrive at the Broker, the Broker writes the data at the right offset with the pwrite() system call (FileChannel.write() in Java NIO); at this point the data is written to the page cache first. When a consumer consumes messages, the Broker uses the sendfile() system call (FileChannel.transferTo() in Java NIO) to zero-copy the data from the page cache into the Broker's socket buffer, and from there it goes over the network.
The synchronization between leader and follower is the same as the process of consumer consuming data above.
Data in the page cache is written back to disk as the kernel's flusher threads are scheduled and as sync()/fsync() are called, so even if the process crashes, the data is not lost. Additionally, if a message the consumer wants is not in the page cache, it is read from disk, and some adjacent blocks are read ahead into the page cache to speed up the next read.
Therefore, if the Kafka producer's production rate is not far from the consumer's consumption rate, the entire produce-consume flow can be completed almost entirely through reads and writes of the Broker's page cache, with very little disk access.
Network model
65 Brother: networking? As a Java programmer, that naturally means Netty.
Brother Ma: yes, Netty is an excellent network framework in the JVM world that provides high-performance network services, and it is the first thing most Java programmers think of when a network framework comes up. Excellent frameworks such as Dubbo and Avro-RPC use Netty as their underlying network communication layer.
Kafka implements its own network model for RPC. The bottom layer is based on Java NIO and uses the same Reactor threading model as Netty.
The Reactor model mainly has three roles:
Reactor: assigns IO events to the corresponding handler processing
Acceptor: handling client connection events
Handler: handling non-blocking tasks
In the traditional blocking IO model, each connection needs its own thread to handle it; under high concurrency, the number of threads created is large and consumes a lot of resources. Moreover, with blocking IO, if the current thread has no data to read, it blocks in the read call, wasting resources.
The Reactor model addresses both problems of the traditional blocking IO model. First, based on the pooling idea, it avoids creating a thread per connection and hands business processing to a thread pool once the connection is established. Second, based on IO multiplexing, multiple connections share one blocking object and the program does not have to wait on every connection; when new data is ready to be processed, the operating system notifies the program, the thread leaves the blocked state, and the business logic runs.
Kafka implements multiplexing and processing thread pool based on Reactor model. The design is as follows:
It contains one Acceptor thread that handles new connections, N Processor threads that each run a selector and read socket requests, and a pool of Handler threads that process requests and send responses, i.e. run the business logic.
By multiplexing the blocking of many IO channels onto the blocking of a single select, the system can handle multiple client requests at the same time with a single thread. Its biggest advantage is low system overhead: there is no need to create new processes or threads, which reduces the system's resource consumption.
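The Acceptor/Processor/Handler split can be illustrated with a single selector thread (a toy sketch, not Kafka's actual SocketServer; here one thread plays all three roles and echoes a single message back to a local client):

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.nio.charset.StandardCharsets;
import java.util.Iterator;

public class MiniReactor {
    // One selector multiplexes accept and read events: the Reactor dispatches,
    // the accept branch is the Acceptor, the read branch is the Handler.
    static String echoOnce(String msg) throws IOException {
        Selector selector = Selector.open();
        ServerSocketChannel server = ServerSocketChannel.open();
        server.bind(new InetSocketAddress("127.0.0.1", 0)); // ephemeral port
        server.configureBlocking(false);
        server.register(selector, SelectionKey.OP_ACCEPT);
        int port = ((InetSocketAddress) server.getLocalAddress()).getPort();

        // Blocking client for the demo: connect and send one message.
        SocketChannel client = SocketChannel.open(new InetSocketAddress("127.0.0.1", port));
        client.write(ByteBuffer.wrap(msg.getBytes(StandardCharsets.UTF_8)));

        boolean echoed = false;
        while (!echoed) {
            selector.select();
            Iterator<SelectionKey> it = selector.selectedKeys().iterator();
            while (it.hasNext()) {
                SelectionKey key = it.next();
                it.remove();
                if (key.isAcceptable()) {          // Acceptor: register new connection
                    SocketChannel ch = server.accept();
                    ch.configureBlocking(false);
                    ch.register(selector, SelectionKey.OP_READ);
                } else if (key.isReadable()) {     // Handler: read request, echo back
                    SocketChannel ch = (SocketChannel) key.channel();
                    ByteBuffer buf = ByteBuffer.allocate(64);
                    ch.read(buf);
                    buf.flip();
                    ch.write(buf);
                    echoed = true;
                }
            }
        }
        ByteBuffer reply = ByteBuffer.allocate(64);
        client.read(reply);
        reply.flip();
        String out = StandardCharsets.UTF_8.decode(reply).toString();
        client.close();
        server.close();
        selector.close();
        return out;
    }

    public static void main(String[] args) throws IOException {
        System.out.println(echoOnce("ping"));
    }
}
```

In Kafka the roles run on separate thread pools; collapsing them into one thread here just keeps the multiplexing mechanics visible.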
Summary: the Kafka Broker's KafkaServer is an excellently designed network architecture. Anyone who wants to understand Java network programming, or needs to use this technique, might as well read the source. Later articles in the "Code Brother" Kafka series will also cover this source code.
Batch and compression
When a Kafka Producer sends messages to the Broker, it does not send them one message at a time. Anyone who has used Kafka should know two important Producer parameters: batch.size and linger.ms. These two parameters govern the Producer's batched sending.
The execution process of Kafka Producer is shown in the following figure:
Messages are sent through the following processors in turn:
Serialize: the key and value are serialized with the serializer that was passed in. Good serialization improves network transfer efficiency.

Partition: decides which partition of the topic the message is written to, using the murmur2 algorithm by default. A custom partitioner can also be passed to the producer to control which partition each message goes to.

Compress: compression is disabled by default in the Kafka producer. Compression not only speeds up transfer from producer to broker but also speeds up replication. It helps raise throughput, lower latency, and improve disk utilization.

Accumulate: as the name implies, a message accumulator. It maintains a Deque (double-ended queue) per partition holding the batch data to be sent, and it sends a batch once enough data has accumulated or a time limit expires. Records accumulate in a buffer for each partition of the topic, grouped according to the producer's batch size property; each partition has its own accumulator/buffer.

Group Send: the batches in the record accumulator are grouped by the broker they are destined for. Records in a batch are sent based on the batch.size and linger.ms properties: a batch goes out when either the configured batch size or the configured delay is reached.
Kafka supports multiple compression algorithms: lz4, snappy, and gzip. Kafka 2.1.0 added official support for Zstandard (zstd), Facebook's open-source compression algorithm designed for a very high compression ratio; see the zstd project for details.
Producer, Broker, and Consumer all use the same compression algorithm. When the producer writes data to the Broker and the consumer reads it back, the Broker does not even need to decompress it; the data is finally decompressed only when the Consumer polls the message. This saves a great deal of network and disk overhead.
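The three producer knobs discussed above can be sketched as plain configuration (the values are illustrative; actually sending messages would additionally require constructing a KafkaProducer from the kafka-clients library, omitted here to stay dependency-free):

```java
import java.util.Properties;

public class ProducerConfigSketch {
    // Batching and compression settings discussed in the text.
    // Values are examples, not tuning advice.
    static Properties batchingConfig() {
        Properties p = new Properties();
        p.put("batch.size", "16384");      // max bytes buffered per partition batch (16 KB default)
        p.put("linger.ms", "5");           // wait up to 5 ms for a batch to fill before sending
        p.put("compression.type", "lz4");  // lz4 | snappy | gzip | zstd (zstd from Kafka 2.1.0)
        return p;
    }

    public static void main(String[] args) {
        // These properties would be passed to `new KafkaProducer<>(props)`.
        System.out.println(batchingConfig());
    }
}
```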
Partition concurrency
A Kafka Topic can be divided into multiple Partitions, and each Partition behaves like a queue, keeping its data ordered. Different Consumers in the same Group consume Partitions concurrently; the partition is in fact the smallest unit for tuning Kafka's parallelism, so it is fair to say that each additional Partition adds consumption concurrency.
Kafka ships an excellent partition-assignment algorithm, StickyAssignor, which keeps assignments as balanced as possible while keeping each reassignment as close as possible to the previous one. This keeps the cluster's partitions balanced overall, so no Broker or Consumer ends up too skewed.
65 Brother: so is it better to have as many partitions as possible?

Brother Ma: of course not.
More partitions mean more open file handles

In Kafka's broker, each partition corresponds to a directory in the file system. Inside the Kafka data log directory, each log segment is allocated two files: an index file and a data file. So as the number of partitions grows, the number of required file handles rises sharply, and you may need to raise the operating system's limit on open file handles.
More partitions mean more client/server memory

The producer client has a parameter batch.size, which defaults to 16 KB. It buffers messages for each partition and, once a buffer fills, packs the messages and sends them as a batch. This looks like a throughput-friendly design, but because the parameter is per partition, the more partitions there are, the more memory this buffering requires.
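The memory effect is simple to quantify (a back-of-the-envelope sketch; the class and method names are invented):

```java
public class PartitionBufferMath {
    // Producer-side batch buffering grows linearly with partition count,
    // because batch.size is a per-partition buffer.
    static long producerBufferBytes(int partitions, int batchSizeBytes) {
        return (long) partitions * batchSizeBytes;
    }

    public static void main(String[] args) {
        // 10,000 partitions at the default 16 KB batch.size is ~156 MB of batch buffers alone.
        long bytes = producerBufferBytes(10_000, 16 * 1024);
        System.out.println(bytes / (1024 * 1024) + " MB");
    }
}
```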
More partitions reduce availability

The more partitions there are, the more partitions are allocated to each Broker, and when a Broker goes down, recovery takes longer.
File structure

Kafka messages are grouped by Topic, and Topics are independent of one another. Each Topic can be divided into one or more partitions, and each partition has a log file that records its message data.
Each Kafka partition log is physically divided into multiple Segments by size.

A segment consists of two files that correspond to each other and appear in pairs: an index file and a data file, with the suffixes ".index" and ".log" respectively.

Segment file naming convention: the first segment of a partition starts from 0, and each subsequent segment file is named after its base offset, i.e. the offset right after the last message of the previous segment file. The value is at most a 64-bit long, rendered as a 20-character string left-padded with zeros.
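Assuming the zero-padded naming convention just described, a segment file name can be derived from its base offset like this (an illustrative sketch, not Kafka's source):

```java
public class SegmentName {
    // A segment log file is named after the base offset it starts at,
    // zero-padded to 20 characters (wide enough for any 64-bit long).
    static String logFileName(long baseOffset) {
        return String.format("%020d", baseOffset) + ".log";
    }

    public static void main(String[] args) {
        System.out.println(logFileName(0));      // 00000000000000000000.log
        System.out.println(logFileName(368769)); // 00000000000000368769.log
    }
}
```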
The index is a sparse index, which keeps each index file small, and Kafka maps index files directly into memory with mmap, so operating on the index requires no disk IO. The Java counterpart of mmap is MappedByteBuffer.
Note: mmap is a way of memory-mapping files. A file (or other object) is mapped into the process's address space, establishing a one-to-one correspondence between positions in the file on disk and virtual addresses in the process's virtual address space. Once the mapping exists, the process can read and write this memory region through ordinary pointers, and the system automatically writes dirty pages back to the corresponding file on disk, so file IO completes without calling system functions such as read or write. Conversely, modifications made to this region in kernel space are directly visible in user space, so the file can even be shared between processes.
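A minimal mmap sketch from Java follows (names invented). The write below is a plain memory store on the mapped buffer; the application issues no write() system call, and the kernel flushes the dirty page back to the file:

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class MmapDemo {
    // Map a file into the process address space; reads and writes on the
    // buffer go through the page cache with no explicit read()/write() calls.
    static byte writeAndReadFirstByte(Path file) throws IOException {
        try (FileChannel ch = FileChannel.open(file,
                StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            MappedByteBuffer map = ch.map(FileChannel.MapMode.READ_WRITE, 0, ch.size());
            map.put(0, (byte) 'K'); // plain memory store into the mapping
            return map.get(0);      // plain memory load back out
        }
    }

    public static void main(String[] args) throws IOException {
        Path f = Files.createTempFile("00000000000000000000", ".index");
        Files.write(f, new byte[]{0});
        System.out.println((char) writeAndReadFirstByte(f)); // K
        Files.delete(f);
    }
}
```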
Brother Ma: Kafka makes full use of binary search to locate the message for a given offset:

Use binary search to find the segment (its .log and .index files) whose base offset is the largest one not exceeding the target offset.

Subtract the offset in the file name from the target offset to get the message's relative offset within this segment.

Use binary search again in the index file to find the closest index entry not exceeding that relative offset.

Go to the log file and scan sequentially from the indexed position until the message with the target offset is found.
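Both binary searches above are floor-style lookups: find the largest indexed offset that does not exceed the target, then scan forward from there. A sketch with invented names, using a plain sorted array to stand in for Kafka's index entries:

```java
public class SparseIndexLookup {
    // A sparse index holds an entry for every N-th message. Binary-search for
    // the largest indexed offset <= target; the log is then scanned forward
    // from the file position recorded at that entry.
    static int floorIndex(long[] indexedOffsets, long target) {
        int lo = 0, hi = indexedOffsets.length - 1, ans = -1;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            if (indexedOffsets[mid] <= target) {
                ans = mid;      // candidate floor; keep looking to the right
                lo = mid + 1;
            } else {
                hi = mid - 1;
            }
        }
        return ans; // -1 if target precedes the first indexed entry
    }

    public static void main(String[] args) {
        long[] offsets = {0, 4, 8, 12, 16}; // one index entry per 4 messages
        System.out.println(floorIndex(offsets, 10)); // 2, i.e. start scanning at offset 8
    }
}
```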
That's all for "how to understand the performance of Kafka". Thank you for reading.