
How do RabbitMQ and Kafka ensure reliable delivery of messages?

2025-01-18 Update From: SLTechnology News&Howtos


This article focuses on how RabbitMQ and Kafka ensure reliable delivery of messages. Interested friends may wish to take a look; the methods introduced here are simple, fast, and practical. Now let the editor take you through how RabbitMQ and Kafka keep messages from being lost.

Interview questions

How do you ensure reliable delivery of messages? In other words, how do you handle the problem of message loss?

Psychological analysis of interviewer

This question is a sure thing. There is a basic principle in using MQ: data can be neither duplicated nor lost. "Not duplicated" is the repeated-consumption and idempotency problem discussed earlier; "not lost" means no data may disappear in transit. That is what you have to think through here.

If you use MQ to deliver truly critical messages, such as billing and deduction messages, you must make sure those messages are never lost anywhere in the MQ delivery process.

Analysis of interview questions

Data loss can occur in the producer, in the MQ itself, or in the consumer. Let's analyze RabbitMQ and Kafka separately.

RabbitMQ

The producer loses data

When the producer sends data to RabbitMQ, it may be lost in transit because of network problems and the like.

Here you can use the transaction feature provided by RabbitMQ: before sending, the producer opens a transaction with channel.txSelect(), then sends the message. If RabbitMQ fails to receive the message, the producer gets an exception, can roll back the transaction with channel.txRollback(), and then try sending again. If the message is received, the transaction is committed with channel.txCommit().

```java
// start transaction
channel.txSelect();
try {
    // send message here
} catch (Exception e) {
    channel.txRollback();
    // resend the message here
}
// commit transaction
channel.txCommit();
```

The problem is that the RabbitMQ transaction mechanism is synchronous, so it costs too much performance and throughput drops significantly.

So generally, if you want to make sure RabbitMQ messages are not lost, you turn on confirm mode instead. With confirm mode enabled on the producer, every message you publish is assigned a unique id. If the message is successfully written into RabbitMQ, RabbitMQ sends you an ack telling you the message is ok. If RabbitMQ fails to process the message, it calls back your nack interface to tell you the message failed, and you can retry. You can also use this mechanism to keep the status of each message id in memory, and resend any message for which you have not received a callback within a certain time.

The biggest difference between the transaction mechanism and the confirm mechanism is that the transaction mechanism is synchronous: after you commit a transaction, it blocks there until the broker responds. The confirm mechanism is asynchronous: after you send a message you can immediately send the next one, and once RabbitMQ has received a message it calls back one of your interfaces asynchronously to inform you.

Therefore, producers generally use the confirm mechanism to avoid data loss.
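The resend-tracking side of confirm mode can be sketched in plain Java. Note this is a hypothetical bookkeeping helper, not part of the RabbitMQ client itself: it is the logic you would wire into the client's confirm callbacks, where publishes record a delivery tag and acks (possibly with multiple=true, confirming everything up to a tag) clear it, leaving the unconfirmed remainder as resend candidates.

```java
import java.util.concurrent.ConcurrentSkipListMap;

// Sketch of producer-side confirm bookkeeping (hypothetical helper class).
// Keys are broker delivery tags, values are the message bodies to resend.
public class OutstandingConfirms {
    private final ConcurrentSkipListMap<Long, String> pending = new ConcurrentSkipListMap<>();

    // Called right after publishing: remember the message until it is acked.
    public void recordPublish(long deliveryTag, String body) {
        pending.put(deliveryTag, body);
    }

    // Called from the broker's ack callback. With multiple = true the broker
    // confirms every message up to and including deliveryTag at once.
    public void handleAck(long deliveryTag, boolean multiple) {
        if (multiple) {
            pending.headMap(deliveryTag, true).clear();
        } else {
            pending.remove(deliveryTag);
        }
    }

    // Messages still unconfirmed; candidates for resending after a timeout.
    public int unconfirmedCount() {
        return pending.size();
    }

    public static void main(String[] args) {
        OutstandingConfirms tracker = new OutstandingConfirms();
        tracker.recordPublish(1, "msg-1");
        tracker.recordPublish(2, "msg-2");
        tracker.recordPublish(3, "msg-3");
        tracker.handleAck(2, true);   // broker acks tags 1 and 2 together
        System.out.println(tracker.unconfirmedCount()); // 1 (only tag 3 left)
    }
}
```

A background task would periodically scan `pending` and republish anything older than the timeout.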

RabbitMQ loses data

To guard against RabbitMQ itself losing data, you must enable RabbitMQ persistence: once a message is written, it is persisted to disk, so even if RabbitMQ itself dies, the stored data is automatically recovered on restart and generally nothing is lost. The rare exception is RabbitMQ dying before a message has been persisted, which can lose a small amount of data, but that probability is small.

There are two steps to setting up persistence:

First, mark the queue as durable when you create it. This makes RabbitMQ persist the queue's metadata, but not the messages in it.

Second, set the message's deliveryMode to 2 when sending it. This marks the message itself as persistent, and RabbitMQ writes it to disk.

You must set both. Only then will RabbitMQ, even after crashing and restarting, recover the queue from disk together with the messages in it.
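The two steps can be sketched with the RabbitMQ Java client. This is a minimal sketch, assuming a broker running on localhost; the queue name "task_queue" and the message body are illustrative, and running it requires the amqp-client library and a live broker.

```java
import com.rabbitmq.client.Channel;
import com.rabbitmq.client.Connection;
import com.rabbitmq.client.ConnectionFactory;
import com.rabbitmq.client.MessageProperties;
import java.nio.charset.StandardCharsets;

public class PersistentPublish {
    public static void main(String[] args) throws Exception {
        ConnectionFactory factory = new ConnectionFactory();
        factory.setHost("localhost"); // assumed broker address
        try (Connection conn = factory.newConnection();
             Channel channel = conn.createChannel()) {
            // Step 1: durable = true persists the queue's metadata.
            boolean durable = true;
            channel.queueDeclare("task_queue", durable, false, false, null);

            // Step 2: PERSISTENT_TEXT_PLAIN carries deliveryMode = 2,
            // so the message body itself is written to disk.
            channel.basicPublish("", "task_queue",
                    MessageProperties.PERSISTENT_TEXT_PLAIN,
                    "billing event".getBytes(StandardCharsets.UTF_8));
        }
    }
}
```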

Note that even with the persistence mechanism enabled, there is still a window where a message has been written to RabbitMQ but RabbitMQ dies before the message reaches disk, losing that little bit of in-memory data.

Therefore, persistence can be combined with the producer's confirm mechanism: only after a message has been persisted to disk is the producer sent the ack. Then, even if RabbitMQ dies before persisting the message and the data is lost, the producer never receives the ack and can simply resend it.

The consumer loses data

The main way to lose data on the RabbitMQ consumer side is this: you have just consumed a message and have not finished processing it when your process dies, for example during a restart. RabbitMQ thinks you have already consumed it, and the data is lost.

The fix is the ack mechanism provided by RabbitMQ. Simply put, turn off RabbitMQ's automatic ack, and in your own code call the ack API only once you are sure processing is done. Then, if you die before finishing, there is no ack; RabbitMQ concludes you have not fully processed the message and assigns it to another consumer, so the message is not lost.
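A manual-ack consumer can be sketched with the RabbitMQ Java client like this. Again a sketch under assumptions: a local broker and a queue named "task_queue"; running it requires the amqp-client library.

```java
import com.rabbitmq.client.Channel;
import com.rabbitmq.client.Connection;
import com.rabbitmq.client.ConnectionFactory;
import java.nio.charset.StandardCharsets;

public class ManualAckConsumer {
    public static void main(String[] args) throws Exception {
        ConnectionFactory factory = new ConnectionFactory();
        factory.setHost("localhost"); // assumed broker address
        Connection conn = factory.newConnection();
        Channel channel = conn.createChannel();
        channel.queueDeclare("task_queue", true, false, false, null);

        boolean autoAck = false; // turn OFF automatic ack
        channel.basicConsume("task_queue", autoAck,
                (consumerTag, delivery) -> {
                    String body = new String(delivery.getBody(), StandardCharsets.UTF_8);
                    // ... process the message here ...
                    // Ack only after processing really finished; if we crash
                    // before this line, RabbitMQ redelivers the message.
                    channel.basicAck(delivery.getEnvelope().getDeliveryTag(), false);
                },
                consumerTag -> { /* consumer cancelled */ });
    }
}
```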

The Kafka consumer loses data

The only situation in which the consumer loses data is this: you consume a message, the consumer automatically commits the offset so Kafka believes you have consumed it, but in fact you have only just started processing it, and you die before finishing. That message is then lost.

This is similar to RabbitMQ. As everyone knows, Kafka commits offsets automatically, so you just turn off auto-commit and commit the offset manually after processing, which ensures the data is not lost. This can indeed lead to repeated consumption: if you die after finishing processing but before committing the offset, the message will certainly be consumed again, so you must ensure idempotency.

One problem we hit in production: our Kafka consumers wrote consumed data into an in-memory queue to buffer it. Sometimes a message had only just been written to the in-memory queue when the consumer auto-committed its offset; if we then restarted the system, the data still sitting in the memory queue was lost before it could be processed.
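The manual-commit pattern can be sketched with the Kafka Java client. A sketch under assumptions: the broker address, group id, and topic name below are illustrative, and running it requires the kafka-clients library and a live cluster.

```java
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import java.time.Duration;
import java.util.Collections;
import java.util.Properties;

public class ManualCommitConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // assumed address
        props.put("group.id", "billing-group");           // hypothetical group
        props.put("enable.auto.commit", "false");         // turn off auto-commit
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("billing-events"));
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                for (ConsumerRecord<String, String> record : records) {
                    // ... process record.value() here, idempotently ...
                }
                // Commit only after everything polled has been processed.
                // A crash before this line means reprocessing, never loss.
                consumer.commitSync();
            }
        }
    }
}
```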

Kafka loses data

A fairly common scenario is this: a broker in Kafka goes down, and the leader of a partition is re-elected. Think about it: if some followers are not yet fully in sync when the leader dies, and one of those followers is elected the new leader, the messages it never replicated are gone. That is how data gets lost.

We have run into this in production too: a Kafka leader machine went down, and after a follower was switched to leader, we found that some data had been lost.

Therefore, it is generally required to set at least the following four parameters:

Set the replication.factor parameter on the topic: this value must be greater than 1, so that each partition has at least 2 replicas.

Set the min.insync.replicas parameter on the Kafka broker: this value must be greater than 1, which requires the leader to perceive at least one follower that is still in contact and not lagging behind, so there is a candidate to take over when the leader dies.

Set acks=all on the producer side: each write is considered successful only after it has been written to all replicas.

Set retries=MAX on the producer side (a very large value, meaning essentially unlimited retries): if a write fails, the producer retries indefinitely, blocking there until it succeeds.

Our production environment is configured this way. With these settings, at least on the Kafka broker side, data will not be lost when the broker hosting the leader fails and leadership switches over.
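The producer-side half of the checklist can be sketched as plain configuration. The broker-side min.insync.replicas and the topic's replication.factor are set on the cluster, not here; the broker address below is an assumption, and building an actual KafkaProducer from these properties would additionally require the kafka-clients library.

```java
import java.util.Properties;

public class ReliableProducerConfig {
    // Assembles producer settings matching the checklist above. This only
    // builds configuration; it does not contact any cluster.
    public static Properties build() {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // assumed address
        // acks=all: a write succeeds only once all in-sync replicas have it.
        props.put("acks", "all");
        // retries=MAX: keep retrying failed writes instead of dropping them.
        props.put("retries", Integer.toString(Integer.MAX_VALUE));
        return props;
    }

    public static void main(String[] args) {
        Properties p = build();
        System.out.println(p.getProperty("acks"));    // all
        System.out.println(p.getProperty("retries")); // 2147483647
    }
}
```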

Will the producer lose the data?

With acks=all set as above, the producer will not lose data: a write is considered successful only when the leader has received the message and all in-sync followers have replicated it. If that condition is not met, the producer automatically retries, an unlimited number of times.

At this point, I believe you have a deeper understanding of how RabbitMQ and Kafka ensure reliable delivery of messages. You might as well try it out in practice. For more related content, follow us and keep learning!
