Kafka configuration is detailed and complex, and comprehensive performance tuning involves a great deal of information. This article covers only a few common points, drawn from my daily work and experience, for optimizing a Kafka cluster.
1. JVM optimization
As a Java-based system, Kafka naturally calls for JVM tuning, and the first thing that comes to mind is adjusting the heap size.
Edit bin/kafka-server-start.sh (e.g. with vim) and adjust the value of KAFKA_HEAP_OPTS="-Xmx16G -Xms16G".
Recommended configuration: in general, the heap size should not exceed 50% of the host's memory.
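For reference, here is a minimal sketch of the change inside bin/kafka-server-start.sh; the stock script typically defaults to 1 GB, and the 16 GB figure below assumes a host with ample memory:
# excerpt from bin/kafka-server-start.sh (the default is usually -Xmx1G -Xms1G)
if [ "x$KAFKA_HEAP_OPTS" = "x" ]; then
    export KAFKA_HEAP_OPTS="-Xmx16G -Xms16G"
fi
Alternatively, since the script only sets this variable when it is unset, the same value can be passed as an environment variable at startup:
KAFKA_HEAP_OPTS="-Xmx16G -Xms16G" bin/kafka-server-start.sh config/server.properties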
2. Network and I/O thread configuration optimization:
# number of threads the broker uses to handle network requests
num.network.threads=9
# number of threads the broker uses to handle disk I/O
num.io.threads=16
Recommended configuration:
num.network.threads mainly handles network I/O, reading and writing buffer data; there is basically no I/O wait, so set it to the number of CPU cores plus 1.
num.io.threads handles disk I/O operations, and there may be some I/O wait during peak periods, so it needs a larger value: set it to 2 times the number of CPU cores, and no more than 3 times (see the worked example below).
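As a worked example, assuming an 8-core broker host (check with nproc), the two rules above give the values used in the snippet at the top of this section:
# server.properties, sketch for an assumed 8-core host
num.network.threads=9   # CPU cores + 1  (8 + 1)
num.io.threads=16       # 2 x CPU cores  (upper bound: 3 x cores = 24)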
3. Maximum data size accepted by the socket server (to prevent OOM exceptions):
socket.request.max.bytes=2147483600
Recommended configuration:
Adjust this to suit the size of your own business messages. Note that the value is of type int, so it is limited by the range of a Java int and cannot be arbitrarily large:
A Java int occupies 4 bytes and ranges from -2147483648 to 2147483647 (that is, -2^31 to 2^31 - 1); this limit cannot be exceeded. Exceeding it produces an error such as: org.apache.kafka.common.config.ConfigException: Invalid value 8589934592 for configuration socket.request.max.bytes: Not a number of type INT.
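A quick sketch of valid and invalid settings, using the two values mentioned above:
# server.properties
socket.request.max.bytes=2147483600     # valid: stays below the int maximum of 2147483647
# socket.request.max.bytes=8589934592   # invalid: 8 GB exceeds the int range and triggers the ConfigException above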
4. Log data file flush strategy
# flush data to disk after every 10000 messages written by the producer
log.flush.interval.messages=10000
# flush data to disk every 1 second
log.flush.interval.ms=1000
Recommended configuration:
To achieve high producer write throughput, data has to be written to files in batches at regular intervals, so these settings generally do not need to be changed. If a topic carries little data, you can consider lowering log.flush.interval.ms and log.flush.interval.messages to force data to be flushed more often, reducing the inconsistency that cached data not yet written to disk could cause. The recommended configuration is 10000 messages with a 1 s interval.
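If the lower thresholds are only wanted on specific low-volume topics, the topic-level equivalents flush.messages and flush.ms can be set instead of the broker-wide values. A hedged sketch using kafka-configs.sh on a recent Kafka version (the topic name my-topic and the broker address are placeholders; older releases use --zookeeper instead of --bootstrap-server):
bin/kafka-configs.sh --bootstrap-server localhost:9092 \
  --entity-type topics --entity-name my-topic \
  --alter --add-config flush.messages=10000,flush.ms=1000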
5. Log retention policy configuration
# log retention duration (hours)
log.retention.hours=72
# maximum size of a single segment file
log.segment.bytes=1073741824
Recommended configuration:
It is recommended to retain logs for three days or less. Setting segment files to 1 GB lets disk space be reclaimed quickly and makes Kafka reload faster after a restart (at startup, Kafka scans all data files under the log.dir directory with a single thread). If segment files are too small, the number of files becomes correspondingly large.
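To see how much scanning a restart will involve, you can count the segment files under the broker's data directory; the path below is an assumption, so substitute your own log.dir value:
# number of log segment files the broker has to scan at startup
find /data/kafka-logs -name "*.log" | wc -l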
6. Replica replication configuration
num.replica.fetchers=3
replica.fetch.min.bytes=1
replica.fetch.max.bytes=5242880
Recommended configuration:
Each follower pulls messages from the leader to synchronize data. Follower synchronization performance is determined by the following parameters:
Number of pull threads (num.replica.fetchers): configuring more fetchers improves follower concurrency, but the leader then handles more requests per unit of time and its load rises accordingly, so weigh this against the machine's hardware resources. Increasing it moderately is recommended.
Minimum number of bytes (replica.fetch.min.bytes): generally, it does not need to be changed. The default value is fine.
Maximum number of bytes (replica.fetch.max.bytes): the default is 1 MB, which is too small; 5 MB is recommended, and it can be adjusted according to business conditions.
Maximum wait time (replica.fetch.wait.max.ms): if followers pull too frequently, the leader backlogs a large number of useless requests and cannot synchronize data, causing CPU usage to soar. Configure this with care; the default value is recommended and usually needs no change (a quick health check follows this list).
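After tuning these parameters, a quick way to confirm that followers are keeping up is to list under-replicated partitions; localhost:9092 is a placeholder, and older versions take --zookeeper instead:
# partitions whose followers are lagging behind the leader
bin/kafka-topics.sh --bootstrap-server localhost:9092 \
  --describe --under-replicated-partitions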
7. Number of partitions configuration
num.partitions=5
Recommended configuration:
The default number of partitions (num.partitions) is 1, and it is used when the partition count is not specified at topic creation. The partition count directly affects the throughput of the Kafka cluster; if it is too small, consumption performance suffers. It is recommended to raise it to 5.
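num.partitions only supplies the default; when a topic is created explicitly, the partition count can be passed directly. A minimal sketch (the topic name my-topic, replication factor 3, and the broker address are assumptions):
bin/kafka-topics.sh --bootstrap-server localhost:9092 \
  --create --topic my-topic --partitions 5 --replication-factor 3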