
How to ensure that Kafka does not lose messages


This article covers how to ensure that Kafka does not lose messages, a problem many people run into in real projects. Read on to learn how to handle each of the scenarios below.

How does Kafka guarantee that messages are not lost?

P.S.: this article is written in plain language. I hope you get something out of it.

If you are not yet familiar with Kafka, take a look at the articles below first. The first one is required reading; the others can be read as needed.

Introduction: Kafka explained in plain language
Experience Kafka in 5 minutes
Kafka series, part 3: learn in 10 minutes how to use Kafka as a message queue in a Spring Boot program

Producer losing messages

After the producer calls the send method, the message may still fail to be delivered because of network problems.

Therefore, we cannot assume a message was delivered just because send was called. To know whether a send succeeded, we have to check its result. Note that sending with the Kafka producer is actually an asynchronous operation; we can obtain the result by calling get() on the returned future, but that turns the send into a synchronous, blocking call. The sample code is as follows:

For the complete code, see the third article in the Kafka series: learn in 10 minutes how to use Kafka as a message queue in a Spring Boot program.

SendResult<String, Object> sendResult = kafkaTemplate.send(topic, o).get();
if (sendResult.getRecordMetadata() != null) {
    logger.info("producer successfully sent message to " + sendResult.getProducerRecord().topic()
            + " -> " + sendResult.getProducerRecord().value().toString());
}

However, blocking on get() like this is generally not recommended. A better option is to register a callback. The sample code is as follows:

ListenableFuture<SendResult<String, Object>> future = kafkaTemplate.send(topic, o);
future.addCallback(
        result -> logger.info("producer successfully sent message to topic: {} partition: {}",
                result.getRecordMetadata().topic(), result.getRecordMetadata().partition()),
        ex -> logger.error("producer failed to send message, reason: {}", ex.getMessage()));

If the send fails, we can inspect the cause of the failure and resend the message.

In addition, set a reasonable value for the producer's retries parameter (number of retries). It is often 3, but a higher value is generally recommended to make message loss less likely. Once configured, the producer automatically retries sends that fail because of transient network problems. It is also recommended to set a retry interval: if the interval is too small, retrying has little effect, because a brief network blip can exhaust all three retries almost at once.
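As a rough sketch (not taken from the original article), here is how those retry settings might look when building a plain Kafka producer; the broker address and serializers are placeholder assumptions:

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.common.serialization.StringSerializer;

Properties props = new Properties();
props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder broker address
props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
props.put(ProducerConfig.RETRIES_CONFIG, 3);             // retry failed sends up to 3 times
props.put(ProducerConfig.RETRY_BACKOFF_MS_CONFIG, 1000); // wait about 1s between retries

KafkaProducer<String, String> producer = new KafkaProducer<>(props);

In a Spring Boot application, the same values can typically be supplied in application.properties via spring.kafka.producer.retries and spring.kafka.producer.properties.retry.backoff.ms.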

Consumer losing messages

We know that when messages are appended to a partition, each one is assigned a specific offset. The offset indicates the position within the partition up to which the consumer has currently consumed, and Kafka uses offsets to guarantee the ordering of messages within a partition.

[Figure: Kafka offset]

By default, when the consumer pulls messages from a partition, it automatically commits the offset. Auto-commit has a problem: imagine the consumer has just fetched a message and is about to process it when it suddenly crashes. The message was never actually processed, yet its offset has already been committed automatically, so the message is effectively lost.

The fix is equally blunt: turn off auto-commit and commit the offset manually only after the message has really been processed. Careful readers will notice that this introduces the opposite problem, duplicate consumption: if you finish processing a message but crash before committing its offset, that message will, in theory, be consumed twice.
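Below is a minimal sketch of this pattern using the plain Kafka consumer API (the article itself uses Spring Kafka); the broker address, group id, topic name, and the process() helper are all made up for illustration:

import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

Properties props = new Properties();
props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder broker address
props.put(ConsumerConfig.GROUP_ID_CONFIG, "demo-group");              // placeholder group id
props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
props.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, "false"); // turn off auto-commit

try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
    consumer.subscribe(Collections.singletonList("demo-topic")); // placeholder topic name
    while (true) {
        ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
        for (ConsumerRecord<String, String> record : records) {
            process(record); // hypothetical business-logic helper
        }
        consumer.commitSync(); // commit only after the records were actually processed
    }
}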

Kafka (the broker) losing messages

We know that Kafka uses a multi-replica (Replica) mechanism for partitions. Among the replicas of a partition, one is the leader and the rest are followers. Messages are sent to the leader replica, and the follower replicas then pull from the leader to stay in sync. Producers and consumers interact only with the leader; the other replicas are simply copies of it, kept to make the stored messages safer.

Now imagine this situation: the broker hosting the leader replica suddenly goes down, and a new leader is elected from the followers. Any data on the old leader that the followers had not yet synchronized is lost.

Set acks = all

Part of the solution is to set acks = all. acks is an important Kafka producer parameter.

The default value of acks is 1, meaning a send is considered successful as soon as the leader replica has received the message. With acks = all, the send is considered successful only after all in-sync replicas have received the message.
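For reference, this is a one-line change on the producer configuration sketch above (in Spring Boot it can usually be set with spring.kafka.producer.acks=all):

props.put(ProducerConfig.ACKS_CONFIG, "all"); // the send succeeds only once every in-sync replica has the record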

Set replication.factor >= 3

To ensure that the leader always has follower replicas able to synchronize messages, we usually set replication.factor >= 3 on the topic, so every partition has at least 3 replicas. This adds data redundancy, but it buys data safety.

Set min.insync.replicas > 1

In general, we also set min.insync.replicas > 1, which means a message must be written to at least 2 replicas before the send counts as successful. The default value of min.insync.replicas is 1, which should be avoided in production.

However, to keep the whole Kafka service highly available, make sure that replication.factor > min.insync.replicas. Why? If the two are equal, losing a single replica makes the entire partition unwritable, which clearly violates high availability. A common recommendation is replication.factor = min.insync.replicas + 1.
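As an illustrative sketch (not from the original article), the two topic-level settings could be applied when creating the topic with Kafka's AdminClient; the topic name and broker address are placeholders:

import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.NewTopic;

Properties adminProps = new Properties();
adminProps.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder broker address

try (AdminClient admin = AdminClient.create(adminProps)) {
    // 3 partitions, replication.factor = 3, min.insync.replicas = 2
    NewTopic topic = new NewTopic("demo-topic", 3, (short) 3)
            .configs(Collections.singletonMap("min.insync.replicas", "2"));
    admin.createTopics(Collections.singleton(topic)).all().get(); // exception handling omitted for brevity
}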

Set unclean.leader.election.enable = false

Since Kafka 0.11.0.0, the default value of the unclean.leader.election.enable parameter has been changed from true to false.

As mentioned at the beginning, messages are sent to the leader replica, and the followers then pull from the leader to synchronize, so different followers may lag behind the leader by different amounts. With unclean.leader.election.enable = false, when the leader fails, a new leader is elected only from replicas that are sufficiently in sync, never from lagging replicas, which reduces the chance of losing messages.
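Putting the broker-side defaults together, a sketch of the corresponding entries in server.properties (all three are standard broker configuration keys) might look like this:

# server.properties (broker side), a sketch of the settings discussed above
# replicas per partition for auto-created topics
default.replication.factor=3
# a write must reach at least 2 in-sync replicas
min.insync.replicas=2
# never elect an out-of-sync replica as the new leader
unclean.leader.election.enable=false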

That concludes "how to ensure that Kafka does not lose messages". Thank you for reading!
