The testing process for the latency comparison between Apache Pulsar and Kafka

2025-07-04 Update. From: SLTechnology News&Howtos

Shulou(Shulou.com)06/01 Report--

This article describes the testing process used to compare the latency of Apache Pulsar and Kafka: how the benchmark was set up and which workloads were run.

Test details:

> Setting up the benchmark

To set up the benchmark tests, we followed the steps documented on the OpenMessaging site. After applying the Terraform configuration, you get the following set of EC2 instances:

The i3.4xlarge instances used for the Pulsar/BookKeeper and Kafka brokers each include two NVMe SSDs to improve performance. These powerful virtual machines have 16 vCPUs, 122 GiB of memory, and high-performance disks.

Two SSDs are ideal for Pulsar because it writes two data streams, and those streams can be written to the disks in parallel. Kafka can also use both SSDs by assigning partitions to the two drives.
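As a rough picture of how partitions can end up spread over two drives, the sketch below places each new partition in the directory that currently holds the fewest partitions. This is a simplified model of that placement idea, not Kafka's actual code, and the directory paths and the partition count are hypothetical.

```python
def assign_partitions(partition_count, log_dirs):
    """Place each partition in the directory currently holding the fewest partitions."""
    placement = {directory: [] for directory in log_dirs}
    for partition in range(partition_count):
        # On a tie, min() keeps the first directory in the list.
        target = min(log_dirs, key=lambda directory: len(placement[directory]))
        placement[target].append(partition)
    return placement


if __name__ == "__main__":
    # Hypothetical mount points for the two NVMe SSDs.
    drives = ["/mnt/data-1", "/mnt/data-2"]
    for directory, partitions in assign_partitions(6, drives).items():
        print(directory, partitions)
```

With an even number of partitions, each drive receives the same share, so writes are balanced across both SSDs as long as the partitions carry similar traffic.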

The Ansible playbooks for Pulsar and Kafka use the `tuned-adm` command (with the `latency-performance` profile) to tune the systems for low latency.

For more information on tuned-adm commands, please see https://linux.die.net/man/1/tuned-adm.

> Workload

Although the benchmark ships with several workloads that can be run as-is, we modified them to more closely match the Kafka tests described in the LinkedIn Engineering blog. Defining a new workload is not difficult: just create a YAML file with the updated test parameters.
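To give a feel for what such a workload file might contain, the sketch below builds a definition in Python and renders it as YAML text. The field names are modeled on the sample workload files in the OpenMessaging benchmark repository and should be checked against the version you use; the workload name and every value are assumptions for illustration, not the configuration used in this test.

```python
# Hypothetical workload definition: field names modeled on the benchmark's
# sample workload files, values chosen only for illustration.
workload = {
    "name": "1-topic-6-partitions-100b",
    "topics": 1,
    "partitionsPerTopic": 6,
    "messageSize": 100,
    "subscriptionsPerTopic": 1,
    "consumerPerSubscription": 1,
    "producersPerTopic": 1,
    "producerRate": 50000,
    "consumerBacklogSizeGB": 0,
    "testDurationMinutes": 15,
}


def to_yaml(mapping):
    """Render a flat mapping of scalars as simple `key: value` YAML lines."""
    lines = []
    for key, value in mapping.items():
        # Quote strings so a name made of digits and dashes stays a string.
        rendered = f'"{value}"' if isinstance(value, str) else str(value)
        lines.append(f"{key}: {rendered}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(to_yaml(workload), end="")
```

Saving the printed text to a `.yaml` file gives a starting point that can then be adjusted parameter by parameter.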

If you read the LinkedIn blog, you will see that they ran their tests with a message size of 100 bytes. Generally speaking, if messages are too small (much less than 100 bytes), the differences between systems are hard to see in the results, and message queues in general are not good at handling "big messages" (much larger than 100 bytes). A size of 100 bytes is therefore a compromise, and it is the single-message size used in all of the messaging system tests here.

This size is well suited to testing the performance of the messaging system itself. Regardless of the size of each message, the total number of messages used for testing is fixed, so the more efficiently the messaging system processes messages, the better it performs. At the same time, small messages make it less likely that network or disk throughput limits will affect the test results. How messaging systems perform with "big messages" is also a topic worth discussing, but here we test only "small messages".
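To see why small messages keep a test away from network and disk limits, consider the raw payload bandwidth at a given message rate. The sketch below is plain arithmetic; the rate of 50,000 messages per second and the larger message sizes are hypothetical values chosen for illustration, not figures from this test.

```python
def payload_throughput_mb_per_s(messages_per_second, message_size_bytes):
    """Payload bytes moved per second, in MB/s (1 MB = 1,000,000 bytes)."""
    return messages_per_second * message_size_bytes / 1_000_000


if __name__ == "__main__":
    rate = 50_000  # hypothetical producer rate, messages per second
    for size in (100, 1_000, 10_000):
        mb_per_s = payload_throughput_mb_per_s(rate, size)
        print(f"{size:>6} B messages at {rate} msg/s -> {mb_per_s} MB/s")
```

At the same message rate, 100-byte messages move only a small fraction of the bytes that larger messages do, so per-message processing cost in the messaging system dominates rather than raw bandwidth.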

We also added a benchmark that uses 6 partitions. Because the LinkedIn tests made heavy use of 6 partitions, we included it as well.

The LinkedIn blog contains both producer-only and consumer-only workloads, while the workloads we used in our tests include both producers and consumers. There are two reasons.

First, as things stand, the benchmark does not support producer-only or consumer-only workloads. Second, in practice, a messaging system serves producers and consumers at the same time. We therefore decided to test with the realistic scenario in which messages are produced and consumed together.

To sum up, the set of workloads we used for testing is as follows:

Kafka consumer groups and Pulsar subscriptions are similar in that both allow one or more consumers to receive all messages on a topic. When a topic has multiple consumer groups/subscriptions, the messaging system delivers a copy of each message to each of them; in other words, it "fans out" the messages.

Every message published to a topic is sent to all of its consumer groups/subscriptions. If all messages are sent to the same topic and there is only one consumer group/subscription on that topic, the producer rate is equal to the consumer rate.

If there are two consumer groups/subscriptions on a single topic, the consumer rate is twice the producer rate. To keep the test simple, we adopted the former case: a single consumer group/subscription, whose one or more consumers receive all messages on the topic.
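The relationship between producer rate and consumer rate under fan-out can be written as a one-line calculation. The sketch below is illustrative only; the function name and the rate of 50,000 messages per second are hypothetical and are not taken from the benchmark.

```python
def consumer_rate(producer_rate, subscriptions):
    """Aggregate delivery rate for one topic.

    Each consumer group/subscription receives every published message,
    so the total delivered per second is the producer rate times the
    number of consumer groups/subscriptions.
    """
    if subscriptions < 0:
        raise ValueError("subscriptions must be non-negative")
    return producer_rate * subscriptions


if __name__ == "__main__":
    rate = 50_000  # hypothetical producer rate, messages per second
    for subscriptions in (1, 2):
        print(subscriptions, "subscription(s):", consumer_rate(rate, subscriptions), "msg/s")
```

With one consumer group/subscription the two rates are equal, which keeps the producer and consumer sides of the test directly comparable.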
