2025-04-03 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/03 Report--
This article explains how to understand distributed transactions. It covers the basic properties of transactions, the XA specification, 2PC, 3PC, TCC, Saga, message-queue-based solutions, and the Seata and GTS frameworks.
Transactions
Any discussion of distributed transactions should start with the basic properties of a transaction, known as ACID.
A (atomicity): the operations in a transaction either all succeed or all fail.
C (consistency): a transaction must not break the integrity of the data; the data must satisfy its predefined rules and constraints both before and after execution. Consistency is better seen as the goal that A, I, and D work together to achieve, and it is ultimately guaranteed at the application level. Some even say that C was only added to make the acronym ACID work.
I (isolation): concurrent transactions are isolated from each other and must not interfere with one another; how strictly depends on the isolation level.
D (durability): once a transaction commits, its changes to the data in the database are permanent.
XA
XA (eXtended Architecture) is the specification for distributed transaction processing proposed by the X/Open organization. It defines how three roles interact: the transaction manager (TM), the resource manager (RM), and the application.
The TM is the coordinator of the transaction, and an RM can be thought of as a database.
2PC
XA defines the specification; 2PC and 3PC are concrete protocols for carrying it out.
2PC stands for two-phase commit. It is divided into two phases: the voting phase and the execution phase.
Voting phase
The TM sends a prepare request to all participants, asking whether the transaction can be executed, and waits for a response from each participant.
This phase can be thought of as executing the transaction's SQL statements without committing them yet.
Each participant returns YES if its execution succeeded, otherwise NO.
Execution phase
The execution phase is where the transaction is actually committed, although failures must also be handled.
If all participants return YES, the TM sends a commit command, and each participant commits its transaction on receiving it.
Conversely, if any participant returns NO, the TM sends a rollback command and every participant rolls back.
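The two phases can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the Participant class, its can_prepare flag, and the two_phase_commit function are hypothetical names, everything runs in one process, and no network failures or timeouts are modeled.

```python
from dataclasses import dataclass


@dataclass
class Participant:
    """Hypothetical resource manager (RM) that votes, then commits or rolls back."""
    name: str
    can_prepare: bool = True   # assumption: whether the local SQL would succeed
    state: str = "init"

    def prepare(self) -> bool:
        # Voting phase: do the work without committing, then vote YES or NO.
        self.state = "prepared" if self.can_prepare else "failed"
        return self.can_prepare

    def commit(self) -> None:
        self.state = "committed"

    def rollback(self) -> None:
        self.state = "rolled_back"


def two_phase_commit(participants: list[Participant]) -> bool:
    """Hypothetical TM: returns True if the global transaction committed."""
    # Phase 1 (voting): ask every participant to prepare and collect the votes.
    votes = [p.prepare() for p in participants]
    # Phase 2 (execution): commit only if every vote was YES.
    if all(votes):
        for p in participants:
            p.commit()
        return True
    for p in participants:
        p.rollback()
    return False
```

Calling two_phase_commit with one participant whose can_prepare is False leaves every participant in the rolled_back state, which is the all-or-nothing behavior described above.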
Defects of 2PC
Synchronous blocking: while the transaction is executing, the database resources involved stay locked. Anyone else who accesses those resources during that time is blocked, which is a serious performance problem.
TM single point of failure: there is only one TM, so once it goes down the whole process cannot complete.
Data inconsistency: if, during the execution phase, some participants do not receive the commit request because of a network partition (split-brain) or another failure, some branches are committed and others are not, and the data becomes inconsistent.
3PC
Because 2PC has these problems, 3PC (three-phase commit) was proposed. It divides the whole process into three phases: CanCommit, PreCommit, and DoCommit. Compared to 2PC, the added phase is CanCommit.
CanCommit
In this phase the TM first asks each database whether it is able to execute the transaction by sending a canCommit request. The participant returns YES if it can, and NO if it cannot.
PreCommit
This phase is equivalent to the voting phase of 2PC. The TM sends the preCommit command and the participant executes the transaction's SQL, returning YES on success and NO otherwise.
The difference is that the participant has a timeout mechanism: if it does not receive the doCommit command before the timeout expires, it commits the transaction by default.
DoCommit
The DoCommit phase corresponds to the execution phase of 2PC. If every participant returned YES in the previous phase, the TM sends the doCommit command to commit the transaction; otherwise it sends the abort command to abort the transaction.
Improvements over 2PC
Synchronous blocking: because 3PC adds a timeout mechanism on the participant side, the time for which resources stay blocked because of a failure is shorter than in 2PC. This is an optimization, but the blocking is not completely avoided.
The single point of failure is also mitigated to some extent by the timeout mechanism.
However, the problem of data inconsistency is still not solved.
For example:
In the PreCommit phase, one participant is cut off by a network partition and cannot receive requests from the TM. The other participants receive abort and roll back, while the partitioned participant commits after its timeout, which may leave the data inconsistent.
So why add the CanCommit phase? It exists to make the timeout mechanism reasonable: we first confirm that all databases are able to execute the transaction, and only then proceed to the following steps. Since every participant has already said it can execute, it is considered acceptable to commit automatically when a timeout indicates that something has gone wrong.
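The inconsistency example can be reproduced with a small simulation. This is a simplified sketch, not a full 3PC implementation: it assumes every participant has already acknowledged preCommit, models a timeout as a message that never arrives (None), and uses hypothetical names (Participant3PC, run_do_commit_phase).

```python
from typing import Optional


class Participant3PC:
    """Hypothetical 3PC participant that has already acknowledged preCommit."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.state = "pre_committed"

    def finish(self, message: Optional[str]) -> str:
        # message is what arrived from the TM before the timeout expired;
        # None models a timeout, for example because of a network partition.
        if message == "abort":
            self.state = "rolled_back"
        else:
            # Either doCommit arrived or the wait timed out: commit by default.
            self.state = "committed"
        return self.state


def run_do_commit_phase(decision: str, partitioned: set[str],
                        names: list[str]) -> dict[str, str]:
    """Deliver the TM's decision to every participant that is still reachable."""
    result = {}
    for name in names:
        participant = Participant3PC(name)
        delivered = None if name in partitioned else decision
        result[name] = participant.finish(delivered)
    return result
```

With decision "abort" and participant "b" partitioned, "a" ends up rolled back while "b" commits on timeout, which is exactly the inconsistent outcome described above.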
TCC
TCC stands for Try, Confirm, Cancel. It is really just a variant of 2PC.
To implement this pattern, each transactional interface has to be split into three: Try to reserve resources, Confirm to commit, and Cancel to roll back.
I have never seen TCC used in actual production. I think there are two reasons: first, programmer skill levels vary, and when several teams work together it is hard to make everyone follow your rules; second, it is simply too complicated.
If there is a simple use of the idea, inventory handling probably counts as one.
Many inventory implementations reserve stock when an order is placed, actually deduct it after the order succeeds, and release the reservation if an exception occurs.
Freezing (reserving) the stock is the prepare phase of 2PC, and deducting the stock once the order succeeds is the commit phase. Releasing it is the rollback performed when an exception occurs. The only difference is that the 2PC mechanism is implemented at the application level.
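The inventory case might look roughly like this when split into Try, Confirm, and Cancel. It is a minimal in-memory sketch: the InventoryTcc class and its method names are made up for illustration, and a real service would persist the frozen quantities and guard against concurrent access.

```python
class InventoryTcc:
    """Hypothetical TCC-style inventory service; all names are illustrative."""

    def __init__(self, available: int) -> None:
        self.available = available          # stock that can still be reserved
        self.frozen: dict[str, int] = {}    # order_id -> reserved quantity

    def try_reserve(self, order_id: str, qty: int) -> bool:
        # Try: freeze stock for the order without deducting it for good.
        if qty > self.available:
            return False
        self.available -= qty
        self.frozen[order_id] = qty
        return True

    def confirm(self, order_id: str) -> None:
        # Confirm: the order succeeded, so the frozen stock is really consumed.
        self.frozen.pop(order_id, None)

    def cancel(self, order_id: str) -> None:
        # Cancel: the order failed, so return the frozen stock.
        self.available += self.frozen.pop(order_id, 0)
```

Using pop with a default makes confirm and cancel safe to call more than once for the same order, which matters because a coordinator may retry them.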
SAGA
Saga comes from a 1987 paper by Hector Garcia-Molina and Kenneth Salem of Princeton University on how to handle long-lived transactions.
The main idea is to split a long transaction into multiple short local transactions.
If all of them succeed, the saga completes normally; otherwise, the compensating operations are called in reverse order.
The SAGA model has two recovery strategies:
Forward recovery: for scenarios that must succeed; a failed step is retried.
Backward recovery: when a sub-transaction fails, the sub-transactions that have already completed are compensated one by one in reverse order.
Since almost no one in China uses this pattern, I will not go into more detail.
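Backward recovery can be sketched as a loop that remembers a compensation for every completed step. This is an illustrative sketch only: run_saga and the Step alias are hypothetical names, and the steps are plain in-process callables rather than remote services.

```python
from typing import Callable

# Each step is a pair: (local transaction, its compensating operation).
Step = tuple[Callable[[], None], Callable[[], None]]


def run_saga(steps: list[Step]) -> bool:
    """Run each local transaction in order; on failure, compensate in reverse."""
    done: list[Callable[[], None]] = []
    for action, compensate in steps:
        try:
            action()
        except Exception:
            # Backward recovery: undo the completed steps, newest first.
            for undo in reversed(done):
                undo()
            return False
        done.append(compensate)
    return True
```

The failing step itself is not compensated, because it never completed; only the steps before it are undone. Forward recovery would instead retry the failing action until it succeeds.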
Message queue
Solutions that achieve eventual consistency through a message queue are somewhat more practical than the previous ones. Those are mostly theory, and you rarely see them applied in real production systems.
Message-queue-based solutions are probably used a little more often.
There are generally two approaches: a local message table, or the transactional messages provided by the MQ itself.
The local message table scheme is actually rather complicated, and I have not seen anyone really use it. Here I take RocketMQ's transactional messages as an example. Compared with a local message table, this approach relies on features of the MQ itself, so it is more thoroughly decoupled and spares business developers the complex work.
The service initiator sends a half message to the MQ. After receiving it, the MQ returns an ACK to the producer.
After receiving the ACK, the producer executes its local transaction; at this point the message has not yet been committed, so consumers cannot see it.
Based on the result of the local transaction, the producer sends either commit or rollback to the MQ.
If an exception occurs at this point, such as the producer going down, and the MQ does not receive commit or rollback for a long time, the MQ initiates a status check back to the producer.
If the MQ receives commit, it delivers the message and consumers can consume it normally. If it receives rollback, the message is deleted after the configured retention period.
This scheme relies on the MQ to guarantee eventual consistency for transactional messages, and it is a reasonable solution. As long as the MQ itself is reliable, it works well in practice, and business consumers can retry consumption of their messages until eventual consistency is reached.
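The half-message flow can be mimicked with an in-memory broker. Note that this is not the RocketMQ client API: Broker, send_in_transaction, and the "commit" and "rollback" strings are hypothetical stand-ins chosen to mirror the steps above, and the broker here drops rolled-back messages immediately instead of after a retention period.

```python
from typing import Callable


class Broker:
    """Hypothetical in-memory broker that mimics half-message handling."""

    def __init__(self) -> None:
        self.half: dict[str, str] = {}     # msg_id -> body, hidden from consumers
        self.delivered: list[str] = []     # bodies visible to consumers

    def send_half(self, msg_id: str, body: str) -> bool:
        self.half[msg_id] = body
        return True                        # the ACK returned to the producer

    def end_transaction(self, msg_id: str, status: str) -> None:
        body = self.half.pop(msg_id, None)
        if body is not None and status == "commit":
            self.delivered.append(body)    # a rollback simply drops the message

    def check_pending(self, checker: Callable[[str], str]) -> None:
        # Status check: ask the producer about messages that never received
        # a commit or rollback.
        for msg_id in list(self.half):
            status = checker(msg_id)
            if status in ("commit", "rollback"):
                self.end_transaction(msg_id, status)


def send_in_transaction(broker: Broker, msg_id: str, body: str,
                        local_tx: Callable[[], bool]) -> None:
    """Producer side: half message first, then the local transaction."""
    if broker.send_half(msg_id, body):
        status = "commit" if local_tx() else "rollback"
        broker.end_transaction(msg_id, status)
```

If the producer crashes between send_half and end_transaction, the message stays in the half store until check_pending resolves it, which corresponds to the status check in the fourth step.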
Frameworks
Everything above is theory or something you implement yourself. Is there no framework for distributed transactions that solves these problems for us?
There are quite a few, in fact, but none has become the dominant choice. The most notable are Alibaba's open-source framework Seata and Alibaba Cloud's GTS.
GTS (Global Transaction Service) is a middleware product of Alibaba Cloud. If you use Alibaba Cloud, you can pay to use GTS.
Seata (Simple Extensible Autonomous Transaction Architecture) is an open source distributed transaction framework that provides support for TCC, XA, Saga, and AT patterns.
So, what does GTS have to do with Seata?
In fact, both originate from TXC (Taobao Transaction Constructor), Alibaba's internal distributed transaction middleware. TXC was later adapted for Alibaba Cloud, where it is called GTS.
After that, Alibaba's middleware team built the open-source Seata on the basis of TXC and GTS. Its AT (Automatic Transaction) mode is GTS's original solution.
As for current versions, the two can be roughly considered the same, and GTS was planned to be fully compatible with the GA version of Seata in 2020.
The entire GTS or Seata consists of the following core components:
Transaction Coordinator (TC): maintains the running state of global transactions and is responsible for coordinating and driving their commit or rollback.
Transaction Manager (TM): controls the boundaries of a global transaction; it opens the global transaction and ultimately initiates the global commit or global rollback decision.
Resource Manager (RM): controls branch transactions; it handles branch registration and status reporting, and receives instructions from the transaction coordinator to drive the commit or rollback of branch (local) transactions.
The overall principle is fairly easy to understand, whether for TCC mode or the original AT mode.
When the transaction starts, TM registers the global transaction with TC and obtains the global transaction XID.
When the interfaces of several microservices are then called, the XID is propagated to each microservice, and each microservice registers a branch transaction with TC when it executes its part.
TM can then initiate the global commit or rollback for that XID, and each RM completes its branch commit or rollback.
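The interaction between the three roles can be sketched as follows. This is not Seata's actual API: the class and method names (TransactionCoordinator, ResourceManager, run_global_transaction, and so on) are hypothetical, and the XID is passed as a plain function argument rather than propagated through RPC context.

```python
import uuid


class TransactionCoordinator:
    """Hypothetical TC: tracks global transactions and their branches."""

    def __init__(self) -> None:
        self.branches: dict[str, list["ResourceManager"]] = {}

    def begin(self) -> str:
        xid = uuid.uuid4().hex
        self.branches[xid] = []
        return xid

    def register_branch(self, xid: str, rm: "ResourceManager") -> None:
        self.branches[xid].append(rm)

    def commit(self, xid: str) -> None:
        for rm in self.branches.pop(xid):
            rm.branch_commit(xid)

    def rollback(self, xid: str) -> None:
        for rm in self.branches.pop(xid):
            rm.branch_rollback(xid)


class ResourceManager:
    """Hypothetical RM living inside one microservice."""

    def __init__(self, name: str, tc: TransactionCoordinator) -> None:
        self.name, self.tc = name, tc
        self.status: dict[str, str] = {}

    def execute(self, xid: str) -> None:
        # The XID arrives with the call; register a branch, then do local work.
        self.tc.register_branch(xid, self)
        self.status[xid] = "executed"

    def branch_commit(self, xid: str) -> None:
        self.status[xid] = "committed"

    def branch_rollback(self, xid: str) -> None:
        self.status[xid] = "rolled_back"


def run_global_transaction(tc: TransactionCoordinator,
                           services: list[ResourceManager],
                           should_fail: bool = False) -> str:
    """Hypothetical TM: opens the global transaction and decides its outcome."""
    xid = tc.begin()
    for rm in services:
        rm.execute(xid)
    if should_fail:
        tc.rollback(xid)
    else:
        tc.commit(xid)
    return xid
```

The TM only decides the outcome; it is the TC that fans the decision out to the registered branches, and each RM applies it locally.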
AT mode
Compared with the TCC scheme, the original AT mode does not require implementing multiple interfaces. It proxies the data source, generates UNDO_LOG records from the data before and after each update, and relies on UNDO_LOG to perform rollback.
The process of execution is as follows:
TM registers global transactions with TC to get XID
RM proxies the JDBC data source, captures images of the data before and after the SQL runs to form the UNDO_LOG, registers the branch transaction with TC, and then commits the data update and the UNDO_LOG together in one local transaction.
On a global commit, the UNDO_LOG of each branch is deleted asynchronously. On a global rollback, the UNDO_LOG of each branch is queried and used to perform the rollback.
TCC mode
Compared with AT mode, which proxies the JDBC data source and generates UNDO_LOG in order to build the reverse SQL for rollback, the TCC mode flow is a little simpler.
TM registers global transactions with TC to get XID
RM registers the branch transaction with TC, executes the Try method, and reports the result of the Try method.
RM then executes the Confirm method if it receives a commit request from TC, or the Cancel method if it receives a rollback request.
XA mode
TM registers global transactions with TC to get XID
RM registers the branch transaction with TC, runs XA START, executes the SQL, runs XA END and XA PREPARE, and then reports the branch execution result.
RM then runs XA COMMIT if it receives a commit request from TC, or XA ROLLBACK if it receives a rollback request.
SAGA mode
TM registers global transactions with TC to get XID
RM registers the branch transaction with TC, executes the business method, and reports the branch execution result.
If RM receives a branch rollback, it executes the corresponding business compensation method.