

How to set the number of partitions in Kafka


In this article, the editor walks you through how to set the number of partitions in Kafka. The content is detailed and analyzed from a practical, professional perspective; we hope you will gain something from reading it.

More partitions can provide higher throughput

First, we need to understand one fact: in Kafka, a single partition is the smallest unit of parallelism. On the producer and broker side, writes to different partitions can be done fully in parallel, so expensive operations such as data compression can make better use of hardware resources and improve system throughput. On the consumer side, Kafka only allows the data of a single partition to be consumed by one consumer thread, so the degree of consumer parallelism within a consumer group is bounded by the number of partitions being consumed. To sum up, in general, the more partitions in a Kafka cluster, the higher the throughput that can be achieved.

We can roughly estimate the number of partitions a Kafka cluster needs from the target throughput. Suppose that, for a single partition, the achievable producer throughput is p, the achievable consumer throughput is c, and the desired total throughput is t; then the cluster needs at least max(t/p, t/c) partitions. On the producer side, the throughput of a single partition is affected by configuration parameters such as the batch size, the compression codec, the acknowledgement type (sync/async), the replication factor, and so on. In our tests, single-partition producer throughput is usually around 10 MB/s. On the consumer side, the throughput of a single partition depends on how quickly the application logic can process each message, so it has to be measured for the specific consumer.
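
As a quick illustration of this sizing rule, here is a minimal sketch. The throughput figures used (10 MB/s per partition on the producer side, 20 MB/s on the consumer side, 200 MB/s target) are hypothetical assumptions and should be replaced by values measured for your own workload.

```java
// Minimal sketch of the sizing rule: partitions >= max(t/p, t/c).
// All throughput figures below are illustrative assumptions, not measurements.
public class PartitionSizing {
    public static void main(String[] args) {
        double producerMBps = 10.0;  // p: measured throughput of one partition on the producer side
        double consumerMBps = 20.0;  // c: measured throughput of one partition on the consumer side
        double targetMBps   = 200.0; // t: desired total throughput of the topic

        int partitions = (int) Math.ceil(
                Math.max(targetMBps / producerMBps, targetMBps / consumerMBps));
        System.out.println("Suggested minimum number of partitions: " + partitions); // 20
    }
}
```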

Although the number of partitions can be increased over time, we need to be careful with messages that are produced with keys. When a producer writes a keyed message to Kafka, Kafka uses the hash of the key to determine which partition the message is written to, which guarantees that messages with the same key always land in the same partition. This property is critical for some applications, for example consuming all messages for a given key in order. If the number of partitions changes, that ordering guarantee no longer holds. To avoid this, the usual practice is to over-provision partitions to cover future needs: typically, the number of partitions is chosen based on the target throughput for the next one to two years.
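
The sketch below is only an illustration of why adding partitions remaps keys and breaks per-key ordering; Kafka's default partitioner actually hashes the serialized key with murmur2, while this example uses Java's plain hashCode() as a stand-in.

```java
// Illustrative only: Kafka's default partitioner uses murmur2(serializedKey) % numPartitions
// for keyed messages; hashCode() stands in for the hash here.
public class KeyPartitioningDemo {
    static int partitionFor(String key, int numPartitions) {
        return (key.hashCode() & 0x7fffffff) % numPartitions;
    }

    public static void main(String[] args) {
        String[] keys = {"user-1", "user-2", "user-3"};
        for (String key : keys) {
            int before = partitionFor(key, 8);   // with the original partition count
            int after  = partitionFor(key, 12);  // after partitions are added
            System.out.printf("%s: partition %d -> partition %d%n", key, before, after);
        }
        // Any key that maps to a different partition after the change loses its ordering guarantee:
        // its old messages stay in the old partition while new ones go to a different one.
    }
}
```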

At first, we can size the Kafka cluster with a smaller number of brokers based on current throughput; over time, we can add more brokers to the cluster and move an appropriate proportion of the existing partitions to the new brokers online. With this approach, we can keep scaling throughput while still satisfying the various application scenarios (including those that rely on keyed messages).

Besides throughput, there are other factors to consider when choosing the number of partitions. As we will see below, having too many partitions can also have negative effects.


More partitions require more open file handles

In a Kafka broker, each partition is mapped to a directory on the file system. Within that log directory, each log segment consists of two files: an index file and a data file. In the current version of Kafka, every broker keeps both an index file handle and a data file handle open for each log segment. Therefore, the more partitions there are, the higher the file handle limit that must be configured in the underlying operating system. This is mostly a configuration issue; we have seen production Kafka clusters with more than 30,000 open file handles per broker.
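
A rough back-of-the-envelope estimate of the handle count can be made as below. The partition and segment counts are assumptions chosen for illustration; the number of segments per partition depends on segment size and retention settings.

```java
// Rough estimate of open file handles on one broker, following the
// "two handles per log segment" rule above. All numbers are illustrative.
public class FileHandleEstimate {
    public static void main(String[] args) {
        int partitionsOnBroker   = 2000; // partition replicas hosted by this broker (leaders + followers)
        int segmentsPerPartition = 8;    // e.g. 1 GB segments with ~8 GB retained per partition
        int handlesPerSegment    = 2;    // one index file + one data file

        long handles = (long) partitionsOnBroker * segmentsPerPartition * handlesPerSegment;
        System.out.println("Estimated open file handles: " + handles); // 32000
        // The OS file-handle limit (e.g. ulimit -n) must comfortably exceed this estimate.
    }
}
```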


More partitions may increase unavailability

Kafka achieves high availability and durability through replication. Each partition has multiple replicas, each stored on a different broker. One of the replicas is the leader and the others are followers. Within the cluster, Kafka manages all replicas automatically and keeps them in sync. Requests to a partition, whether from producers or consumers, are served by the broker hosting the leader replica. When a broker fails, the partitions whose leader was on that broker become temporarily unavailable, and Kafka automatically elects a new leader from the remaining replicas to continue serving client requests. This process is carried out by the Kafka broker acting as controller, and mainly involves reading and updating some metadata of the affected partitions in ZooKeeper. In the current implementation of Kafka, all of the controller's ZooKeeper operations are performed serially.

Under normal circumstances, when a broker is shut down in a planned way, the controller moves the leaders off that broker one by one before the service stops. Since moving a single leader takes only a few milliseconds, a planned shutdown causes only a very small window of unavailability from the client's point of view. (Note: during a planned shutdown, only one leader is transferred at a time; all other leaders remain available.)

However, when a broker goes down uncleanly (for example, via kill -9), the unavailability window depends on the number of affected partitions. Suppose a 2-broker Kafka cluster has 2,000 partitions, each with 2 replicas. When one of the brokers crashes, the roughly 1,000 partitions whose leader was on that broker all become unavailable at the same time. If electing a new leader for each partition takes 5 ms, electing leaders for 1,000 partitions takes 5 seconds, so in this case users will observe an unavailability window of about 5 seconds.

An even more unfortunate situation arises when the failed broker happens to be the controller. In that case, leader election for the affected partitions cannot start until the controller itself has failed over to another broker. Controller failover happens automatically, but the new controller has to read the metadata of every partition from ZooKeeper during initialization. For example, if the cluster has 10,000 partitions and reading the metadata of each partition from ZooKeeper takes about 2 ms, controller failover adds roughly another 20 seconds to the unavailability window.
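
To make the two failure scenarios above concrete, here is a small sketch that reproduces the arithmetic. The 5 ms per leader election and 2 ms per metadata read are the figures assumed in the text, not fixed properties of Kafka.

```java
// Worked version of the two unclean-shutdown scenarios described above.
public class UnavailabilityWindow {
    public static void main(String[] args) {
        // Scenario 1: an ordinary broker crashes; ~1000 leaders must be re-elected.
        int affectedLeaders = 1000;
        double leaderElectionMs = 5.0; // assumed cost per partition
        System.out.printf("Broker crash: ~%.1f s unavailable%n",
                affectedLeaders * leaderElectionMs / 1000);   // ~5.0 s

        // Scenario 2: the crashed broker was also the controller; the new controller
        // must first reload metadata for every partition in the cluster.
        int clusterPartitions = 10000;
        double metadataReadMs = 2.0;   // assumed cost per partition
        System.out.printf("Controller failover adds: ~%.1f s%n",
                clusterPartitions * metadataReadMs / 1000);   // ~20.0 s
    }
}
```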

Unplanned downtime is generally rare. But if availability matters even in these rare scenarios, it is best to limit the number of partitions per broker to 2,000-4,000 and the total number of partitions in the cluster to about 10,000.


More partitions may increase end-to-end latency

End-to-end latency in Kafka is defined as the time from when a producer publishes a message to when a consumer receives it, i.e. the consumer's receive time minus the producer's publish time. Kafka only exposes a message to consumers after it has been committed, that is, after it has been replicated to all in-sync replicas. The time spent on in-sync replication is therefore the most significant part of Kafka's end-to-end latency. By default, a broker uses only a single thread to replicate data from another broker, and that thread has to replicate all the partitions the two brokers share. Experience shows that replicating 1,000 partitions from one broker to another adds about 20 ms of latency, which means an end-to-end latency of at least 20 ms. That is too long for some real-time applications.

Note that this problem can be mitigated by using a larger Kafka cluster. For example, placing 1,000 partition leaders on a single broker is very different, latency-wise, from spreading them across 10 brokers. In a cluster of 10 brokers, each broker only has to replicate about 100 partitions on average, and the added end-to-end latency drops from tens of milliseconds to a few milliseconds.
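
The following sketch works through that comparison, assuming the roughly 20 ms per 1,000 replicated partitions quoted above.

```java
// Worked version of the example above: replication latency scales with the
// number of partitions a broker must replicate, assumed at ~20 ms per 1000 partitions.
public class ReplicationLatencyEstimate {
    public static void main(String[] args) {
        double msPer1000Partitions = 20.0; // observed cost from the text
        int partitions = 1000;

        double oneBrokerMs  = partitions / 1000.0 * msPer1000Partitions;        // all leaders on one broker
        double tenBrokersMs = (partitions / 10) / 1000.0 * msPer1000Partitions; // spread over 10 brokers

        System.out.printf("1 broker:   ~%.0f ms added latency%n", oneBrokerMs);  // ~20 ms
        System.out.printf("10 brokers: ~%.0f ms added latency%n", tenBrokersMs); // ~2 ms
    }
}
```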

As a rule of thumb, if you are concerned about message latency, it is a good idea to limit the number of partitions per broker: for a Kafka cluster with b brokers and a replication factor of r, the total number of partitions in the cluster should not exceed about 100*b*r, that is, roughly no more than 100 partition leaders per broker.


More partitions means more memory is needed on the client side

In the latest release of Kafka, version 0.8.2, a more efficient Java producer is provided. A nice feature of the new producer is that it lets users set an upper bound on the amount of memory used to buffer outgoing messages. Internally, the producer buffers messages per partition; once enough data has accumulated or enough time has passed, the accumulated messages are removed from the buffer and sent to the broker.

If the number of partitions increases, messages accumulate in more per-partition buffers on the producer side, and the total memory consumed may exceed the configured limit. When that happens, the producer has to either block or drop new messages, neither of which is ideal. To avoid this, we have to configure the producer with a larger memory size.

As a rule of thumb, to achieve good throughput we should allocate at least a few tens of KB of memory per partition being produced to, and adjust the total amount of memory when the number of partitions increases significantly.
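
As a concrete, hypothetical example of applying this rule with the modern Java producer, the sketch below sizes buffer.memory from an assumed 64 KB per partition across 1,000 partitions; the broker address and all figures are illustrative, not recommendations.

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.common.serialization.StringSerializer;

// Sketch: scale the producer's total buffer with the number of partitions written to.
public class ProducerBufferSizing {
    public static void main(String[] args) {
        int partitions = 1000;                               // partitions this producer writes to (assumed)
        long perPartitionBytes = 64 * 1024;                  // a few tens of KB per partition
        long bufferMemory = partitions * perPartitionBytes;  // ~64 MB total buffer

        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder address
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.BUFFER_MEMORY_CONFIG, bufferMemory); // total memory for buffering
        props.put(ProducerConfig.BATCH_SIZE_CONFIG, 16 * 1024);       // per-partition batch size
        props.put(ProducerConfig.LINGER_MS_CONFIG, 5);                // wait up to 5 ms to fill a batch
        props.put(ProducerConfig.MAX_BLOCK_MS_CONFIG, 60_000);        // block (rather than fail) when full

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // ... send records as usual; the buffer now scales with the partition count.
        }
    }
}
```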

A similar issue exists on the consumer side. The consumer fetches batches of messages from Kafka on a per-partition basis, so the more partitions it consumes, the more memory it needs. However, this is usually only a concern for consumers that are not real-time.
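
On the consumer side, the per-partition fetch size bounds how much data can be buffered at once. The following sketch, again with illustrative numbers and the modern Java consumer, shows how worst-case consumer memory grows with the number of partitions consumed.

```java
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

// Sketch: worst-case buffered fetch data grows with the number of partitions consumed.
public class ConsumerMemoryBound {
    public static void main(String[] args) {
        int partitionsConsumed = 500;            // assumed partitions assigned to this consumer
        int perPartitionFetchBytes = 64 * 1024;  // max bytes fetched per partition per request

        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder address
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "demo-group");              // placeholder group
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.MAX_PARTITION_FETCH_BYTES_CONFIG, perPartitionFetchBytes);

        long worstCaseBytes = (long) partitionsConsumed * perPartitionFetchBytes;
        System.out.println("Worst-case buffered fetch data: ~" + worstCaseBytes / (1024 * 1024) + " MB");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            // ... subscribe and poll as usual.
        }
    }
}
```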


Summary

In general, the more partitions in a Kafka cluster, the higher the throughput that can be achieved. However, we must be aware of the potential impact on system availability and message latency of having too many partitions in the cluster, or too many partitions on a single broker. In the future, we plan to improve some of these limitations to make Kafka more scalable in terms of the number of partitions.

The above is how to set the number of partitions in Kafka. If you have similar questions, you can refer to the analysis above. If you would like to learn more, please follow the industry information channel.
