This article explains the techniques behind Kafka's performance. It is concise and easy to follow, and I hope you gain something from it.
1. Sequential disk reads and writes. Kafka appends messages to its log (appendLog) sequentially, which is highly efficient. For comparison, on a RAID-5 array of 7200 rpm disks:
sequential io: ~600 MB/s
random io: ~100 KB/s
When Kafka writes, it relies on the PageCache of the underlying file system. PageCache uses as much free memory as possible as a disk cache. Writes go to PageCache first, and the written pages are marked dirty; reads are served from PageCache first, falling back to disk only on a page miss. When other applications request memory, reclaiming PageCache is very cheap.
Using PageCache also reduces Kafka's dependence on the JVM heap. The JVM's GC causes many problems in Kafka's high-throughput, big-data scenario: if the cache lives on the heap, GC scans the heap frequently and brings a lot of overhead, and an over-sized heap makes full GC very expensive. Moreover, any in-process cache duplicates data the OS already keeps in PageCache, so caching only in PageCache at least doubles the effective cache space. And if Kafka restarts, all in-process caches are lost, while the OS-managed PageCache remains available.
2. sendfile
Traditional Network IO Model
① The operating system copies data from disk into the kernel-space PageCache
② The user program copies the kernel-space PageCache into a user-space buffer
③ The user program copies the user-space buffer into the socket buffer
④ The operating system copies the data from the socket buffer into the NIC buffer and sends it
Problem: this path costs repeated system calls and context switches between user and kernel space, and the same data is copied four times, which is very inefficient. The copying can be done entirely in the kernel by removing steps ② and ③, and that is exactly what sendfile does.
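As a rough illustration of the mechanism (a minimal sketch, not Kafka's actual source; the file path, host, and port are placeholders): in Java NIO, sendfile is exposed as FileChannel.transferTo, which lets the kernel copy PageCache pages straight to the socket without detouring through user space.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.channels.FileChannel;
import java.nio.channels.SocketChannel;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

public class SendfileDemo {
    public static void main(String[] args) throws IOException {
        try (FileChannel file = FileChannel.open(Paths.get("/tmp/segment.log"),
                                                 StandardOpenOption.READ);
             SocketChannel socket = SocketChannel.open(
                     new InetSocketAddress("localhost", 9999))) {
            long position = 0;
            long remaining = file.size();
            // transferTo maps to sendfile(2) on Linux: steps ② and ③ above
            // disappear, and the data never enters a user-space buffer.
            while (remaining > 0) {
                long sent = file.transferTo(position, remaining, socket);
                position += sent;
                remaining -= sent;
            }
        }
    }
}
```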
Kafka was designed from the start to move data through PageCache as much as possible, minimizing direct disk reads and writes. If production and consumption rates match well, data is exchanged entirely in memory, which greatly improves cluster throughput.
When the cluster has only write operations and no read operations, a send rate of about 10 MB/s is generated by replication between Partitions. Comparing the recv and writ rates shows that disk writing uses Asynchronous + Batch, and the underlying OS probably also optimizes the ordering of disk writes. When a Read Request comes in, there are two cases:
① The data is still in PageCache, and the exchange completes entirely in memory.
② The data has already been flushed out of PageCache; in this case you can see that most of the traffic is disk io.
TIPS
① Kafka officially does not recommend using log.flush.interval.messages and log.flush.interval.ms to force flushing to disk; high reliability should be guaranteed by replicas, and forced flushing has a big impact on system performance (our production values are 100000 and 60000).
② Performance can be tuned by adjusting /proc/sys/vm/dirty_background_ratio and /proc/sys/vm/dirty_ratio.
a. When the dirty page ratio exceeds the first threshold, pdflush starts flushing the dirty PageCache.
b. When the dirty page ratio exceeds the second threshold, all writes block until the flush completes.
c. Depending on business requirements, dirty_background_ratio can be lowered and dirty_ratio raised appropriately (our production values are 10 and 20).
3. Partition
Partition is the foundation on which Kafka scales well and provides high concurrency and Replication.
Scalability. First, Kafka allows Partitions to move freely between Brokers within a cluster, balancing out possible data skew. Second, Partition supports custom partitioning algorithms, for example routing all messages with the same Key to the same Partition (a sketch follows below). The Leader can also be migrated to one of the in-sync Replicas. Since all read and write requests for a Partition are handled only by its Leader, Kafka tries to distribute Leaders evenly across the nodes of the cluster, to avoid concentrating network traffic on a few machines.
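As an illustration of such a custom partitioning algorithm, here is a minimal sketch written against the modern Java client's Partitioner interface (this article predates that API; the class name is invented for this example). It routes every message with the same Key to the same Partition by hashing the key bytes:

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import org.apache.kafka.clients.producer.Partitioner;
import org.apache.kafka.common.Cluster;
import org.apache.kafka.common.PartitionInfo;

public class KeyHashPartitioner implements Partitioner {
    @Override
    public int partition(String topic, Object key, byte[] keyBytes,
                         Object value, byte[] valueBytes, Cluster cluster) {
        List<PartitionInfo> partitions = cluster.partitionsForTopic(topic);
        if (keyBytes == null) {
            return 0; // keyless messages: pin to a fixed partition in this sketch
        }
        // Same key -> same hash -> same partition, so per-key ordering is preserved.
        return (Arrays.hashCode(keyBytes) & 0x7fffffff) % partitions.size();
    }

    @Override
    public void configure(Map<String, ?> configs) { }

    @Override
    public void close() { }
}
```

A partitioner like this would be registered through the producer's partitioner.class configuration.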
Concurrency. Any Partition can only be consumed by one Consumer within a Consumer Group at any one time (conversely, one Consumer can consume multiple Partitions simultaneously). Kafka's very simple Offset mechanism minimizes the interaction between Broker and Consumer, so Kafka, unlike many similar message queues, does not lose performance proportionally as the number of downstream Consumers grows. In addition, if multiple Consumers happen to consume data close together in time sequence, a high PageCache hit rate is achieved, so Kafka can support highly concurrent reads very efficiently; in practice it can basically saturate a single NIC.
However, more Partitions is not always better: the more Partitions, the more each Broker holds on average. When a Broker goes down (Network Failure, Full GC), the Controller must re-elect a Leader for every Partition on that Broker. Assuming an election cost of about 10 ms per Partition, a Broker holding 500 Partitions means that reads and writes on those Partitions will hit LeaderNotAvailableException for the roughly 5 s the elections take.
Further, if the downed Broker happens to be the Controller of the whole cluster, the first step is to appoint another Broker as Controller, and the new Controller must fetch the Meta information of every Partition from ZooKeeper. At roughly 3-5 ms per Partition, 10000 Partitions take 30-50 s. And do not forget that this is only the time to restore a Controller; on top of that comes the time for the Leader elections.
In addition, on the Broker side, Buffer mechanisms are used for both Producer and Consumer. The Buffer size is configured uniformly, and the number of Buffers matches the number of Partitions. If there are too many Partitions, the Buffer memory consumed for Producers and Consumers becomes too large.
tips
1. Pre-allocate the number of Partitions as far as possible. Although Partitions can be added dynamically later, this risks breaking the correspondence between Message Key and Partition.
2. The number of Replicas should not be too large. If conditions permit, try to spread the Replicas of a Partition across different Racks.
3. Make every effort to ensure that every Broker shutdown is a Clean Shutdown; otherwise the problem is not only a long service recovery time, but also possible data corruption or other strange issues.
4. Producer
Kafka's development team says that in version 0.8 the entire Producer was rewritten in Java, reportedly with a large performance improvement. I have not tried it myself, so I will not compare numbers here. The extended reading mentioned at the end of the original post includes a comparison I think is good; interested readers can try it.
In fact, the optimizations message systems adopt on the Producer side are fairly uniform: nothing more than consolidating small pieces into a whole (batching) and turning synchronous sends into asynchronous ones.
Kafka supports MessageSet out of the box: multiple Messages are automatically grouped together and sent in one batch, amortizing the RTT of each network round trip across the whole group. While assembling a MessageSet, the data can also be reordered, turning bursty random writes into smoother linear writes.
It is also worth noting that the Producer supports End-to-End compression. Data is compressed locally and transmitted over the network, is generally not decompressed on the Broker (unless Deep-Iteration is specified), and is only decompressed on the client after the message is consumed.
Of course, users can also choose to do compression and decompression at the application layer (after all, Kafka currently supports a limited set of codecs, only GZIP and Snappy), but doing so actually reduces efficiency! Kafka's End-to-End compression works best together with MessageSet, and application-layer compression decouples the two. The reason is simple: a basic rule of compression is "the more repeated the data, the higher the compression ratio". Regardless of the content or number of message bodies, in most cases a larger input achieves a better compression ratio.
However, Kafka's use of MessageSet compromises availability to some degree. Every time send() returns, the Producer considers the message sent, but in most cases it is still sitting in the in-memory MessageSet and has not reached the network. If the Producer crashes at this point, that data is lost.
To solve this problem, Kafka's 0.8 design borrowed the ack mechanism from networking. If you need raw performance and can tolerate a certain amount of Message loss, set request.required.acks=0 to turn acks off and send at full speed. If you need confirmation of delivery, set request.required.acks to 1 or -1. What is the difference between 1 and -1? As mentioned earlier regarding the number of Replicas: 1 means the message only needs to be received and acknowledged by the Leader, while the other Replicas pull it asynchronously without immediate confirmation, giving reliability without much loss of efficiency; -1 means the ack is returned only after the message is committed to all Replicas in the Partition's ISR, which is safer, but the latency of the whole process grows in proportion to the number of Replicas. Tune this according to your requirements; a sketch follows below.
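For illustration, here is a minimal sketch with the modern Java producer, where the 0.8-era request.required.acks corresponds to acks, and batching and End-to-End compression are exposed as batch.size/linger.ms and compression.type (broker address and topic name are placeholders):

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class ProducerSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");   // placeholder
        props.put("key.serializer",
                  "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer",
                  "org.apache.kafka.common.serialization.StringSerializer");
        props.put("acks", "all");              // like request.required.acks=-1: wait for the ISR
        props.put("compression.type", "gzip"); // end-to-end compression, applied per batch
        props.put("batch.size", 16384);        // group messages MessageSet-style ...
        props.put("linger.ms", 5);             // ... waiting up to 5 ms to fill a batch

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            producer.send(new ProducerRecord<>("demo-topic", "key", "value"),
                          (metadata, exception) -> {
                              if (exception != null) {
                                  exception.printStackTrace(); // ack failed or timed out
                              }
                          });
        } // close() flushes messages still buffered in memory
    }
}
```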
tips
1. Do not configure too many Producer threads, especially when used for Mirror or Migration; it aggravates message disorder across the target cluster's Partitions (if your application scenario is sensitive to message order).
2. In version 0.8, request.required.acks defaults to 0 (the same as 0.7).
5. Consumer
The design of the Consumer side is fairly conventional.
· Consumer Groups support both publish-subscribe and queue access modes.
· The Consumer API comes in High-level and Low-level flavors. The former depends heavily on ZooKeeper, so it performs worse and offers little freedom, but it is extremely worry-free. The latter does not depend on ZooKeeper and wins on both freedom and performance, but all exceptional cases (Leader migration, Offset out of range, Broker failure, etc.) and Offset maintenance must be handled by yourself.
Keep an eye on the 0.9 Release: the team has rewritten the Consumer in Java, merging the two APIs into one and removing the dependency on ZooKeeper; performance is said to be greatly improved.
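As a usage sketch of that unified Java Consumer (the API shown is from later releases; broker address, group, and topic are placeholders):

```java
import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class ConsumerSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // placeholder
        props.put("group.id", "demo-group");              // placeholder
        props.put("key.deserializer",
                  "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer",
                  "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            // No ZooKeeper dependency: group membership and offsets are
            // coordinated by the brokers themselves.
            consumer.subscribe(Collections.singletonList("demo-topic"));
            while (true) {
                ConsumerRecords<String, String> records =
                        consumer.poll(Duration.ofMillis(500));
                for (ConsumerRecord<String, String> record : records) {
                    System.out.printf("partition=%d offset=%d value=%s%n",
                            record.partition(), record.offset(), record.value());
                }
            }
        }
    }
}
```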
tips
It is highly recommended to use the Low-level API. Although it is a bit more cumbersome, only this API lets you handle Error data yourself, especially Corrupted Data caused by a Broker exception or an Unclean Shutdown. Otherwise you cannot Skip the "bad message" and can only wait for it to be rotated out on the Broker, during which time the Replica stays unavailable. The sketch below shows the idea.
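The "Skip" mentioned above means manually advancing the fetch offset past a bad record. With manual offset control, the idea looks roughly like this (a sketch using the later Java consumer's seek(); the helper name and offsets are hypothetical):

```java
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;

public class SkipBadRecord {
    /**
     * Hypothetical recovery helper: move the fetch position one past a record
     * that cannot be deserialized or processed, so the next poll() resumes
     * after it instead of failing on the same "bad message" forever.
     */
    static void skipPast(KafkaConsumer<?, ?> consumer,
                         TopicPartition tp, long badOffset) {
        consumer.seek(tp, badOffset + 1);
    }
}
```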
The above is a technical analysis of how Kafka achieves its performance. I hope you have gained some knowledge or skills from it.