2025-01-18 Update From: SLTechnology News&Howtos
Shulou (Shulou.com) 06/01 Report
This article explains the transactional processing strategies of RabbitMQ, RocketMQ, and Kafka. The explanation is kept simple and clear and should be easy to follow.
Contents: distributed transactions and common solutions; the local message table based on MQ (final consistency); MQ transactions (final consistency) and how RocketMQ, Kafka, and RabbitMQ handle them; message loss prevention in the production, storage, and consumption phases; duplicate messages; message ordering.
Message queue FAQs: handling distributed transactions
What is a distributed transaction?
Our servers have evolved from single machines into distributed systems of many machines. Each system must communicate over the network, so the relatively reliable method calls and inter-process communication of a single machine are no longer available. At the same time, the network itself is unstable, which creates the problem of keeping data synchronized across machines. This is the classic distributed transaction problem.
In a distributed transaction, the transaction participants, the servers supporting the transaction, the resource servers, and the transaction manager are located on different nodes of a distributed system. A distributed transaction ensures data consistency across those nodes.
Common distributed transaction solutions
1. 2PC (two-phase commit) solution-strong consistency
2. 3PC (three-phase commit) solution
3. TCC (Try-Confirm-Cancel) transaction-final consistency
4. Saga transaction-final consistency
5. Local message table-final consistency
6. MQ transaction-final consistency
Here we focus on using message queues to achieve distributed consistency. For details of the other distributed designs above, see the reference links at the end of the article.
Distributed transactions based on MQ: the local message table (final consistency)
Besides maintaining its own business logic, the producer of the message also maintains a message table. This table records the information that needs to be synchronized to other services, and each message in it carries a status value identifying whether it has been processed successfully.
The business write and the insert into the message table are completed in a single transaction, which avoids both "business succeeded but the transactional message failed to send" and "business failed but the transactional message was sent".
For example:
Assume there are two services, an order service and a shopping cart service: a user merges several items in the shopping cart into one order, after which the items just ordered need to be removed from the cart.
1. The producer of the message, the order service, completes its own logic (placing the order) and then sends the message through MQ to the services that need to synchronize, in our example the shopping cart service.
2. The other service (the shopping cart service) listens on this queue:
1. If it receives the message and the data synchronization succeeds (this, too, is a local transaction), it replies through MQ that the message has been processed, and the producer (order service) can then mark the transaction as finished. If the business logic fails, it replies to the producer that the data needs to be rolled back.
2. If the message never arrives, that is not a problem: the sender runs a scheduled task that periodically resends the messages in the message table that have not yet been processed.
3. If the producer of the message (the order service) receives a receipt:
1. If it indicates success, the message has been processed and the distributed synchronization is complete.
2. If it indicates that execution failed and the transaction was rolled back locally, the message is likewise marked as processed.
3. If the receipt itself is lost, that is also covered: the sender (the order service) has a scheduled task that periodically resends unprocessed messages from the message table. Downstream services therefore need to be idempotent, since they may receive the same message several times; each time they receive a resent message they reply with the receipt again, so the sender is eventually guaranteed to get it. The producer of the message must likewise be idempotent when handling receipts.
Two operations here are very important:
1. Message handling must be idempotent on both the producer side and the consumer side.
2. The sender needs a timer that traverses and re-pushes unprocessed messages, to avoid message loss and broken transaction execution.
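The pattern above can be sketched with SQLite. This is a minimal illustration, not a real framework: the table names, the `clear_cart:` payload, and the stubbed MQ send are all assumptions for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, item TEXT)")
conn.execute("""CREATE TABLE outbox (
    id INTEGER PRIMARY KEY, payload TEXT,
    status TEXT DEFAULT 'PENDING')""")

def place_order(item):
    # The business write and the message-table insert share one local
    # transaction, so either both succeed or both roll back.
    with conn:
        conn.execute("INSERT INTO orders (item) VALUES (?)", (item,))
        conn.execute("INSERT INTO outbox (payload) VALUES (?)",
                     ("clear_cart:" + item,))

def resend_pending(send):
    # Scheduled task: push every still-unprocessed message to MQ again.
    rows = conn.execute(
        "SELECT id, payload FROM outbox WHERE status = 'PENDING'").fetchall()
    for mid, payload in rows:
        if send(payload):  # mark DONE only after the receipt comes back
            conn.execute("UPDATE outbox SET status = 'DONE' WHERE id = ?",
                         (mid,))
    conn.commit()

place_order("book")
resend_pending(lambda payload: True)  # stub MQ send that always succeeds
```

A real implementation would replace the stub with an actual MQ publish and flip the status only when the downstream receipt arrives.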
Advantages and disadvantages of this scheme
Advantages:
1. At the design level it achieves reliable message delivery without depending on message-middleware features, weakening the dependence on MQ.
2. Simple and easy to implement.
Disadvantages:
It must be bound to the business data, so the coupling is high; it uses the same database and takes up some of the business database's resources.
MQ transactions-final consistency
The following analyzes the transaction support of several message queues.
How to deal with transactions in RocketMQ
Transactions in RocketMQ solve the problem of ensuring that the local transaction and the message-sending operation either both succeed or both fail. In addition, RocketMQ adds a transaction check-back mechanism to maximize the success rate of transaction execution and the consistency of the data.
There are two flows: the normal transaction commit and the transaction message compensation.
Normal transaction commit
1. Send a half message. It differs from a normal message in that it is invisible to consumers until the transaction is committed.
2. MQ SERVER persists the message and returns a response.
3. Based on the MQ SERVER response, decide whether to execute the local transaction: execute it if the write succeeded, otherwise do not.
4. Based on the local transaction's result, send Commit or Rollback. On Commit, MQ SERVER delivers the message to the downstream subscribers, which can then synchronize their data. On Rollback, the message is discarded.
If MQ SERVER receives neither Commit nor Rollback, the compensation process takes over.
Compensation process
1. If MQ SERVER receives no Commit or Rollback from the sender, it initiates a check-back to the sender, that is, our server, to query the current status of the message.
2. The sender receives the query, checks the status of the transaction, and pushes the status back to MQ SERVER, which then carries out the subsequent flow.
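The two flows can be sketched with an in-memory model. This simulates the protocol only; it is not the RocketMQ client API, and all class and variable names are invented for the illustration.

```python
class MQServer:
    def __init__(self):
        self.half = {}       # half messages, invisible to consumers
        self.visible = []    # messages delivered to downstream subscribers

    def send_half(self, msg_id, body):
        self.half[msg_id] = body
        return True          # the write succeeded

    def end_transaction(self, msg_id, commit):
        body = self.half.pop(msg_id)
        if commit:
            self.visible.append(body)   # deliver downstream
        # on Rollback the half message is simply discarded

    def check_back(self, msg_id, query_local_state):
        # Compensation: ask the producer for the transaction's real state.
        self.end_transaction(msg_id, commit=query_local_state(msg_id))

server = MQServer()
committed = {}

def local_txn(msg_id):
    committed[msg_id] = True  # pretend the local DB transaction succeeded
    return True

# Normal flow: half message -> local transaction -> Commit.
if server.send_half("t1", "order-created"):
    server.end_transaction("t1", commit=local_txn("t1"))

# Compensation flow: the producer never replied, so the broker checks back.
server.send_half("t2", "order-cancelled")
committed["t2"] = False      # the local transaction actually rolled back
server.check_back("t2", lambda mid: committed[mid])
```

Only the committed transaction's message becomes visible; the rolled-back one is discarded during the check-back.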
Compared with handling distributed transactions via a local message table, the MQ transaction moves the logic that the local message table would handle into MQ itself.
How to deal with transactions in Kafka
Transactions in Kafka solve a different problem: ensuring that multiple messages sent within one transaction either all succeed or all fail, that is, guaranteeing the atomicity of writes across multiple partitions.
Kafka's Exactly Once is realized by combining transactions with Kafka's idempotence mechanism, which serves applications in the read-process-write pattern; transactions in Kafka mainly target this pattern.
What is the read-process-write mode?
For example, in stream computing, Kafka serves as the data source and the results are saved back to Kafka: data is consumed from one topic, computed in the cluster, and written to other topics. In this process each message must be processed exactly once to guarantee a correct final result. Kafka's transactional atomicity ensures that the read and the write either succeed together or roll back together.
Here's an analysis of how Kafka transactions are implemented.
Its implementation is similar in principle to RocketMQ transactions, based on two-phase commit, although the implementation is somewhat more involved.
First, the transaction coordinator. To solve the distributed transaction problem, Kafka introduces a transaction coordinator role responsible for coordinating the whole transaction on the server side. The coordinator is not a separate process but part of the Broker process, and like partition leaders, its availability is ensured through election.
The Kafka cluster also has a special topic that records the transaction log. There can be multiple coordinators, each managing several partitions of the transaction log, which allows transactions to execute in parallel and improves performance.
Let's take a look at the specific process.
1. When opening a transaction, the producer sends a request to the coordinator, which records the transaction ID in the transaction log.
2. The producer then starts sending transactional messages, but it must first tell the coordinator which topics and partitions the messages will go to, and then send them normally. Unlike RocketMQ, Kafka does not save these messages in a special queue: uncommitted transactional messages are stored like ordinary messages, and it is the client that filters them out during consumption.
3. When the messages have been sent, the producer commits or rolls back the transaction with the coordinator according to the outcome of its own execution.
Commit of transaction
1. The coordinator sets the status of the transaction to PrepareCommit and writes it to the transaction log.
2. The coordinator writes an end-of-transaction marker into each partition, after which the client releases the previously filtered uncommitted transactional messages to the consumer for consumption.
Rollback of transaction
1. The coordinator sets the status of the transaction to PrepareAbort and writes it to the transaction log.
2. The coordinator writes a rollback marker into each partition, after which the previously uncommitted transactional messages can be discarded.
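The client-side filtering described above can be modeled with a toy partition log. This is a simulation of the idea (data records plus commit/abort markers), not the Kafka client API.

```python
class Partition:
    def __init__(self):
        self.log = []   # ("data", txn_id, value) records and control markers

    def append(self, txn_id, value):
        self.log.append(("data", txn_id, value))

    def write_marker(self, txn_id, committed):
        # The coordinator writes a commit/abort marker into each partition.
        self.log.append(("marker", txn_id, committed))

    def read_committed(self):
        # A read_committed consumer only releases data whose transaction
        # has a commit marker; everything else stays filtered out.
        committed = {t for kind, t, ok in self.log
                     if kind == "marker" and ok}
        return [v for kind, t, v in self.log
                if kind == "data" and t in committed]

p = Partition()
p.append("txn-1", "a")
p.append("txn-2", "b")
p.write_marker("txn-1", committed=True)    # txn-1 commits
p.write_marker("txn-2", committed=False)   # txn-2 aborts
```

Reading in committed mode yields only `"a"`: the aborted transaction's record sits in the log but is never handed to the consumer.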
Transactions in RabbitMQ
Transactions in RabbitMQ solve a different problem: ensuring that the producer's messages reach MQ SERVER. This differs somewhat from the other MQs' transactions and is not discussed further here.
Message loss prevention
Let's first analyze the stages a message flows through in MQ.
Production phase: the producer generates messages and sends them to the Broker side through the network.
Storage phase: after the Broker receives the message, it must flush it to disk. A clustered MQ must also synchronize the data to other nodes.
Consumption stage: the consumer pulls the data at the Broker end and transmits it to the consumer side through the network.
Prevent message loss during production
Network packet loss and network failure will lead to the loss of messages.
Anti-loss measures in RabbitMQ
1. For perceptible errors, catch them and resubmit.
2. Use RabbitMQ transactions; a RabbitMQ transaction solves message loss in the production phase.
Before sending a message, the producer starts a transaction with channel.txSelect and then sends the message. If delivery to the server fails, it rolls back with channel.txRollback and resends; if the server receives the message, it commits with channel.txCommit.
However, transaction performance is poor: it is a synchronous operation, so after sending a message the sender blocks waiting for the RabbitMQ Server's response before it can send the next one, which greatly reduces producer throughput.
3. Use the sender-confirmation mechanism.
With the confirmation mechanism, the producer sets the channel to confirm mode. Once a channel enters confirm mode, every message published on it is assigned a unique ID (starting from 1). Once a message has been delivered to all matching queues, RabbitMQ sends an acknowledgment (Basic.Ack) back to the producer, containing the message's unique deliveryTag and the multiple parameter, so the producer knows the message has correctly reached its destination.
When multiple is true, it is a batch acknowledgment: all messages with an ID less than or equal to the returned deliveryTag are confirmed. When it is false, only the message whose ID equals the returned deliveryTag is confirmed.
There are three types of confirmation mechanisms.
1. Synchronous confirmation
2. Batch confirmation
3. Async confirmation
Synchronous mode is inefficient because each message must be confirmed before the next can be processed.
Batch confirmation is more efficient than synchronous mode, but it has a fatal flaw: once a confirmation fails, all messages in the current batch are resent, causing duplicate messages.
Asynchronous mode is a good choice: it has no blocking problem and is also very efficient.
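The bookkeeping behind the deliveryTag and the multiple flag can be sketched with a toy tracker. This is a plain-Python simulation of the semantics, not pika's or any RabbitMQ client's API.

```python
class ConfirmTracker:
    def __init__(self):
        self.next_tag = 1
        self.pending = {}   # deliveryTag -> message, awaiting Basic.Ack

    def publish(self, msg):
        # Each published message gets the next increasing deliveryTag.
        tag = self.next_tag
        self.next_tag += 1
        self.pending[tag] = msg
        return tag

    def handle_ack(self, delivery_tag, multiple):
        if multiple:
            # Batch ack: every tag <= delivery_tag is confirmed.
            for tag in [t for t in self.pending if t <= delivery_tag]:
                del self.pending[tag]
        else:
            # Single ack: only this exact tag is confirmed.
            self.pending.pop(delivery_tag, None)

ch = ConfirmTracker()
for m in ["m1", "m2", "m3", "m4"]:
    ch.publish(m)
ch.handle_ack(3, multiple=True)    # confirms tags 1, 2 and 3 at once
ch.handle_ack(4, multiple=False)   # confirms only tag 4
```

Anything still left in `pending` after a timeout would be a candidate for resending.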
Anti-loss measures in Kafka
Kafka uses the Broker's acknowledgments: the Broker confirms messages to the producer, and if the producer does not receive the Broker's confirmation, it can choose to send the message again.
As long as the Producer receives the Broker's confirmation response, the message is guaranteed not to be lost in the production phase. Some message queues automatically retry when no confirmation arrives in time, and if the retries also fail, they notify the user via a return value or an exception.
Message loss can be avoided as long as the acknowledgment response of the Broker is handled correctly.
Anti-loss measures in RocketMQ
1. Send messages in SYNC mode and wait for the broker's processing result.
RocketMQ provides three ways to send messages, which are:
Synchronous send: the Producer sends a message to the broker and blocks the current thread waiting for the broker's send result.
Asynchronous send: the Producer builds a send task, submits it to a thread pool, and when it completes, a user-defined callback function processes the result.
Oneway send: Oneway mode only sends the request without waiting for a reply; the Producer sends the request and does not process the response at all.
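The three send modes can be illustrated with a simulated broker and a thread pool. This sketches the calling patterns only; it is not the RocketMQ API, and `broker_store` is an invented stand-in for the network round trip.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def broker_store(msg):
    time.sleep(0.01)          # pretend network + storage latency
    return "SEND_OK:" + msg

pool = ThreadPoolExecutor(max_workers=2)
results = []

# Synchronous send: block the calling thread until the broker responds.
sync_result = broker_store("sync-msg")

# Asynchronous send: submit the task, handle the result in a callback.
future = pool.submit(broker_store, "async-msg")
future.add_done_callback(lambda f: results.append(f.result()))

# Oneway send: fire the request and ignore the response entirely.
pool.submit(broker_store, "oneway-msg")

pool.shutdown(wait=True)      # wait so the demo finishes deterministically
```

Only the first two modes give the producer any evidence the message arrived, which is why oneway send offers no loss protection.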
2. Use transactions: RocketMQ transactions solve the problem of ensuring that the local transaction and the message send either both succeed or both fail.
Storage phase
Normally in the storage phase, as long as the Broker is running, messages are not lost; but if the Broker fails, for example the process dies or the server goes down, messages may still be lost.
Anti-loss measures in RabbitMQ
To prevent message loss in the storage phase, enable persistence to guard against abnormal situations (restart, shutdown, downtime).
There are three parts in RabbitMQ persistence:
Persistence of exchanges
Exchange persistence is achieved by setting the durable parameter to true when declaring the exchange. Without it, the exchange's information will be lost.
Queue persistence
Queue persistence is achieved by setting the durable parameter to true when declaring the queue. It ensures that the queue's own metadata is not lost in abnormal situations, but it does not guarantee that the messages stored inside are not lost.
Persistence of messages
For message persistence, specify delivery_mode=2 when publishing (1 means non-persistent). Message persistence must be paired with queue persistence: if only messages are persistent, the queue disappears after a restart and the messages are lost with it, so persisting messages without persisting the queue makes little sense.
As for what to persist: making every message persistent hurts write performance, so you can choose to persist only the messages that require high reliability.
However, message persistence cannot avoid message loss 100%.
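As a toy illustration of why delivery_mode=2 matters, the following simulates a broker "restart": only messages written to disk survive. This is purely illustrative of the semantics, not RabbitMQ's implementation.

```python
import json
import os
import tempfile

class Broker:
    def __init__(self, disk_path):
        self.disk_path = disk_path
        self.memory = []
        if os.path.exists(disk_path):
            with open(disk_path) as f:
                self.memory = json.load(f)   # recover persisted messages

    def publish(self, msg, delivery_mode=1):
        self.memory.append(msg)
        if delivery_mode == 2:               # persistent: also write to disk
            disk = []
            if os.path.exists(self.disk_path):
                with open(self.disk_path) as f:
                    disk = json.load(f)
            disk.append(msg)
            with open(self.disk_path, "w") as f:
                json.dump(disk, f)

path = os.path.join(tempfile.mkdtemp(), "queue.json")
b = Broker(path)
b.publish("transient", delivery_mode=1)   # memory only
b.publish("durable", delivery_mode=2)     # memory + disk
b = Broker(path)                          # simulate a broker restart
```

After the "restart" only the persistent message remains, which is exactly the loss window the non-persistent mode leaves open.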
For example, if the machine goes down while data is being flushed and a message has not yet been written from memory to disk, that data is still lost. This problem can be addressed by introducing mirrored queues.
The role of a mirrored queue: the queue is mirrored to other Broker nodes in the cluster, so if one node fails, the queue automatically switches to a mirror on another node, keeping the service available. (We won't discuss this in more detail here.)
Anti-loss measures in Kafka
The operating system has a cache called the Page Cache; when writing to a disk file, the system first writes the data into this cache.
After Kafka receives a message, it first stores it in the Page Cache; the operating system then flushes it to disk according to its own policy, or a flush can be forced with the fsync command. If the machine goes down, the data still in the Page Cache is lost, and with it the corresponding data on that Broker.
Ways to deal with this:
1. Control leader election. If a Broker lags too far behind the original Leader, then once it becomes the new Leader, messages will inevitably be lost.
2. Require that a message be written to multiple replicas before it is considered committed, which avoids problem 1 above.
Anti-loss measures in RocketMQ
1. Change the flush mode to synchronous flush.
2. For a multi-node Broker cluster, configure it so that a message must be sent to at least 2 nodes before a confirmation response is returned to the client. Then if one Broker goes down, another Broker can take its place without message loss.
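The "ack only after enough replicas" rule can be sketched as follows. This is a simulation of the idea with lists standing in for nodes; the function and class names are invented for the example.

```python
def replicated_write(msg, nodes, min_acks=2):
    # Confirm to the client only if the message reached >= min_acks nodes.
    acks = 0
    for node in nodes:
        try:
            node.append(msg)     # each "node" is just a list here
            acks += 1
        except ConnectionError:
            pass                 # a down node simply fails to ack
    return acks >= min_acks

class DownNode:
    def append(self, msg):
        raise ConnectionError("node is down")

n1, n2 = [], []
ok = replicated_write("m1", [n1, n2, DownNode()])      # 2 of 3 acks -> True
bad = replicated_write("m2", [[], DownNode(), DownNode()])  # 1 of 3 -> False
```

With this rule, losing any single node never loses an acknowledged message, because at least one other node holds a copy.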
Consumption stage
The consumption phase is simple: if a message is lost in network transmission, it will be pushed to the consumer again. In the consumption phase we only need to acknowledge consumption after the business logic has finished processing.
Summary: against message loss we can also apply the local-message-table idea, persisting messages when they are produced and periodically re-pushing messages that have not been processed for a long time back to the queue.
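Acknowledging only after processing succeeds can be shown with a simulated queue: an unacked message stays in the queue and is delivered again. This is an illustration of the principle, not any broker's client API.

```python
from collections import deque

queue = deque(["order-1"])
processed = []
attempts = {"n": 0}

def handle(msg):
    attempts["n"] += 1
    if attempts["n"] == 1:
        raise RuntimeError("crash before ack")   # first attempt fails
    processed.append(msg)                        # business logic done

def consume():
    msg = queue[0]            # delivered, but not yet removed (no ack)
    try:
        handle(msg)
        queue.popleft()       # ack: remove only after success
    except RuntimeError:
        pass                  # no ack sent; the broker will redeliver

consume()   # fails, so the message stays queued
consume()   # succeeds and acknowledges
```

Had we acked before calling `handle`, the first crash would have silently dropped the message.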
Duplicate messages
Message delivery in MQ falls roughly into three categories:
1. At most once: a message is delivered at most once. This is not safe; data may be lost.
2. At least once: a message is delivered at least once. Messages are never dropped, but a small number of duplicates are allowed.
3. Exactly once: a message is delivered exactly once; neither loss nor duplication is allowed. This is the highest level.
Most message queues provide At least once, that is, duplicate messages are allowed.
So our consumers need to be idempotent; the usual solutions are:
1. Make use of the uniqueness of the database
Based on the business, choose a value that is unique in the business as the database's unique key, create a journal table, and put the business operation and the journal-table insert in the same transaction. If the journal record already exists, execution fails, which guarantees idempotency. You can also first query the journal table and execute the business plus the journal insert only when no record exists; but then be careful about read-write latency in the database.
2. Add preconditions when updating the database.
3. Attach a unique ID to each message
Add a unique ID to every message and, as in method 1, use the database's uniqueness with a journal table to handle the consumption of duplicate messages.
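Method 3 can be sketched with SQLite: a dedup table with the message ID as primary key shares a transaction with the business update, so a redelivered message fails the insert and changes nothing. The schema and names are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dedup (msg_id TEXT PRIMARY KEY)")
balance = {"value": 0}

def consume(msg_id, amount):
    try:
        with conn:
            # The dedup insert and the business update share one transaction:
            # a duplicate msg_id violates the primary key before the business
            # update runs, so the duplicate has no effect at all.
            conn.execute("INSERT INTO dedup (msg_id) VALUES (?)", (msg_id,))
            balance["value"] += amount
        return True
    except sqlite3.IntegrityError:
        return False          # duplicate delivery: safely ignored

first = consume("m-1", 100)   # processed
second = consume("m-1", 100)  # redelivered message, rejected by the key
```

The same pattern works with any database that enforces unique constraints inside a transaction.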
Message ordering
To guarantee ordering in a message queue, the idea is: the producer guarantees the order in which messages are enqueued, and the consumer guarantees the order in which dequeued messages are consumed.
Producer
Because messages within one queue are ordered, the producer can use an algorithm to put data that must be consumed in order and that shares a data characteristic (for example, messages of the same order) into the same queue, so consumers can fetch it sequentially.
As for which algorithm, hashing is an excellent choice. For order business, hash the order number so that the same order always lands in the same queue; this guarantees that the messages in a queue are ordered on the production side.
Of course, "queue" above is a general term:
In RabbitMQ, data with the same characteristic goes to the same queue, and data within one queue is ordered.
In Kafka, data with the same characteristic goes to the same partition, and messages within one partition are ordered.
In RocketMQ, data with the same characteristic goes to the same MessageQueue, and messages within one MessageQueue are ordered.
The differences among the three message models can be found in the comparative analysis of RabbitMQ,RocketMQ,Kafka message models.
Consumer
If there is a single queue and the consumer processes messages with a single thread, ordering is guaranteed, because the messages in the queue are already in order.
But what if throughput is high and a single consumer cannot keep up?
1. Increase the number of queues, with one consumer per queue; more queues means more consumers.
2. Increase the number of threads in a single consumer. Since the threads consume at different speeds, here too an algorithm must route data with the same characteristic to a fixed thread.
In practice, how consumers handle this depends on the actual scenario: if there is no distinguishing data characteristic, only a single consumer with a single thread can be used, and producers can likewise only push messages into a single queue.
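The hash-routing idea above can be sketched in a few lines: a stable hash of the order ID picks the queue, so all events of one order land in the same queue and keep their order. This is a simulation with plain lists, not any broker's partitioner.

```python
from hashlib import md5

NUM_QUEUES = 4
queues = [[] for _ in range(NUM_QUEUES)]

def pick_queue(order_id):
    # A stable hash, so the same order always maps to the same queue.
    return int(md5(order_id.encode()).hexdigest(), 16) % NUM_QUEUES

events = [("order-7", "created"), ("order-9", "created"),
          ("order-7", "paid"), ("order-7", "shipped")]
for order_id, event in events:
    queues[pick_queue(order_id)].append((order_id, event))

# Within its queue, order-7's events appear in production order.
q7 = queues[pick_queue("order-7")]
order7_events = [e for oid, e in q7 if oid == "order-7"]
```

Python's built-in `hash()` is deliberately avoided here because it is randomized per process; a content-based hash keeps the mapping stable across restarts.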