How to Analyze Distributed Transaction Solutions


This article analyzes the common solutions for distributed transactions. The content is concise and easy to follow, and I hope you will gain something from the detailed introduction below.

What is a transaction? A transaction consists of a set of operations that we want to execute correctly as a whole; if any step in this set fails, all previously completed operations must be rolled back. In other words, all operations in the same transaction either all take effect or none of them do.

The four ACID properties of transactions. When talking about transactions, we have to mention their four famous properties.

Atomicity requires that a transaction be an indivisible unit of execution: either all operations in the transaction are performed, or none are.

Consistency requires that the database's integrity constraints are not violated before or after the transaction executes.

Isolation requires that transactions execute independently of one another without interference; one transaction must not see the intermediate data of another transaction that is still running.

Durability requires that once a transaction completes, its results are persisted. Even if the database crashes, the results committed by the transaction can be recovered after the database is restored.

Note: transactions can only guarantee the high reliability of the database, that is, data committed by transactions can still be recovered after a failure of the database itself. If the failure is not in the database itself, for example the hard disk is damaged, the data committed by transactions may still be lost; handling that belongs to the category of "high availability". Therefore, transactions alone only guarantee the "high reliability" of the database, while "high availability" requires the cooperation of the whole system.
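As a minimal illustration of these properties in a single database, the following JDBC sketch (the table and column names are hypothetical) commits a pair of updates atomically and rolls back on any failure:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.SQLException;

public class LocalTransactionDemo {

    public static void transfer(String jdbcUrl, long fromId, long toId, long amount) throws SQLException {
        try (Connection conn = DriverManager.getConnection(jdbcUrl)) {
            conn.setAutoCommit(false);              // start a local transaction
            try (PreparedStatement debit = conn.prepareStatement(
                     "UPDATE account SET balance = balance - ? WHERE id = ?");
                 PreparedStatement credit = conn.prepareStatement(
                     "UPDATE account SET balance = balance + ? WHERE id = ?")) {
                debit.setLong(1, amount);  debit.setLong(2, fromId);  debit.executeUpdate();
                credit.setLong(1, amount); credit.setLong(2, toId);   credit.executeUpdate();
                conn.commit();                      // both updates take effect together (atomicity, durability)
            } catch (SQLException e) {
                conn.rollback();                    // on any error, neither update takes effect
                throw e;
            }
        }
    }
}
```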

Transaction isolation levels. Here we expand on the isolation property. Among the ACID properties, the isolation required is isolation in the strict sense: multiple transactions execute serially and never interfere with one another. This fully guarantees data safety, but in real business systems the performance of this approach is poor. Databases therefore define four isolation levels, and the isolation level is inversely related to database performance: the lower the isolation level, the higher the performance; the higher the isolation level, the worse the performance.

Problems with concurrent transaction execution. Let's look at the problems that can occur in the database under different isolation levels:

Lost update: when two concurrently executing transactions update the same row, one transaction may overwrite the other's update. This occurs when no locking is applied in the database.

Dirty read: one transaction reads data that another transaction has not yet committed. That data may later be rolled back and become invalid, and an error occurs if the first transaction proceeds with the invalid data.

Non-repeatable read: a transaction reads the same row twice but gets different results. It is further divided into the following two cases:

Non-repeatable read in the narrow sense: while transaction 1 reads the same record twice, transaction 2 modifies that record, so transaction 1's second read returns different data. Phantom read: between transaction 1's two queries, transaction 2 inserts rows into or deletes rows from the table, so the result set of transaction 1's second query changes.

What is the difference between a non-repeatable read and a dirty read? A dirty read reads data that has not yet been committed, whereas a non-repeatable read reads committed data that was modified by another transaction between the two reads.

The four isolation levels. The database defines the following four isolation levels:

Read Uncommitted: at this level, while one transaction is modifying a row, other transactions may not modify that row but may read it. Therefore, lost updates do not occur at this level, but dirty reads and non-repeatable reads do.

Read Committed: at this level, an uncommitted write transaction prevents other transactions from accessing the row, so dirty reads do not occur; but a transaction that is reading data still allows other transactions to access the row, so non-repeatable reads can occur.

Repeatable Read: at this level, a read transaction blocks write transactions but allows other read transactions, so the same transaction never reads different data on two reads (no non-repeatable reads), while a write transaction blocks all other transactions.

Serializable: this level requires all transactions to execute serially, so all concurrency problems are avoided, but it is inefficient.

The higher the isolation level, the better the integrity and consistency of the data, but the greater the impact on concurrent performance. For most applications, setting the isolation level to Read Committed is a good first choice: it avoids dirty reads and offers good concurrency. Although it can still lead to problems such as non-repeatable reads, phantom reads, and second-class lost updates, in the individual places where such problems may occur the application can use pessimistic or optimistic locking.
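As an illustration, assuming a Spring-managed service (the table name and query are hypothetical), the isolation level can be chosen per transaction like this:

```java
import org.springframework.jdbc.core.JdbcTemplate;
import org.springframework.stereotype.Service;
import org.springframework.transaction.annotation.Isolation;
import org.springframework.transaction.annotation.Transactional;

@Service
public class OrderQueryService {

    private final JdbcTemplate jdbc;

    public OrderQueryService(JdbcTemplate jdbc) {
        this.jdbc = jdbc;
    }

    // Read Committed avoids dirty reads while keeping concurrency high;
    // raise the level to REPEATABLE_READ or SERIALIZABLE only where the
    // business genuinely needs it.
    @Transactional(isolation = Isolation.READ_COMMITTED, readOnly = true)
    public int countPaidOrders(long userId) {
        // this query sees only data committed by other transactions
        Integer n = jdbc.queryForObject(
                "SELECT COUNT(*) FROM orders WHERE user_id = ? AND status = 'PAID'",
                Integer.class, userId);
        return n == null ? 0 : n;
    }
}
```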

What is a distributed transaction? So far we have covered local transactions based on a single database; a database only supports transactions within that single database, not transactions across databases. With the popularity of microservice architecture, a large business system is often composed of several subsystems, each with its own independent database. A single business process often needs to be completed by several of these subsystems, and the operations may need to run inside one transaction. Such scenarios are common in microservice systems, and at that point we need some means, above the databases, to support transactions that span them; this is commonly called a "distributed transaction". Here is a typical example: the process of a user placing an order. When the system adopts a microservice architecture, an e-commerce system is often split into subsystems such as a product system, an order system, a payment system, and a points system. The whole process of placing an order is as follows:

1. The user browses products through the product system, picks one, and clicks to place an order.
2. The order system generates the order, and the order is created successfully.
3. After the order is created, the payment system provides the payment function and the payment is completed.
4. After payment completes, the points system adds points for the user.

Steps 2, 3, and 4 above need to be completed in one transaction. For a traditional monolithic application this is very simple: put the three steps in a single method A and mark the method with Spring's @Transactional annotation; through the database's transaction support, Spring ensures that either all of these steps complete or none of them do (a sketch of this monolithic case follows below). But under a microservice architecture, these three steps involve three systems and three databases, so we must support distributed transactions across the databases and application systems through some additional techniques.
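A sketch of the monolithic case just described, assuming hypothetical service collaborators that all write to the same database:

```java
import org.springframework.stereotype.Service;
import org.springframework.transaction.annotation.Transactional;

// Hypothetical collaborators; in the monolithic case they share one database.
interface OrderService   { long createOrder(long userId, long productId); }
interface PaymentService { void pay(long orderId); }
interface PointsService  { void addPoints(long userId, long orderId); }

@Service
public class PlaceOrderService {

    private final OrderService orderService;
    private final PaymentService paymentService;
    private final PointsService pointsService;

    public PlaceOrderService(OrderService orderService,
                             PaymentService paymentService,
                             PointsService pointsService) {
        this.orderService = orderService;
        this.paymentService = paymentService;
        this.pointsService = pointsService;
    }

    // Steps 2-4 run inside one local transaction: if any step throws,
    // Spring rolls back everything done so far.
    @Transactional
    public void placeOrder(long userId, long productId) {
        long orderId = orderService.createOrder(userId, productId);  // step 2
        paymentService.pay(orderId);                                  // step 3
        pointsService.addPoints(userId, orderId);                     // step 4
    }
}
```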

CAP theory. CAP theory states that a distributed system can satisfy at most two of the three properties C, A, and P. Their meanings are:

C (Consistency): multiple copies of the same data are identical in real time.
A (Availability): the system returns a definite result within a bounded time; only then is the system considered available.
P (Partition tolerance): the same service is distributed across multiple nodes, so that when one node goes down, the other nodes can still provide the same service.

CAP theory tells us that in a distributed system we can choose at most two of the three properties C, A, and P. So which two are the right choice? For a business system, availability and partition tolerance are the two that must be satisfied, and they complement each other. There are two main reasons business systems use distributed systems:

To improve overall performance: when business volume soars and a single server can no longer meet our business needs, we use a distributed system in which multiple nodes provide the same function, improving the overall performance of the system. This is the first reason for using distributed systems.

To ensure availability: if a single node, or multiple nodes in the same network environment, carries the whole service, there is a risk that a data-center power outage or a regional natural disaster paralyzes the entire business system. To prevent this, a distributed system spreads its subsystems across different regions and different data centers, thereby ensuring the high availability of the system.

This shows that partition tolerance is the foundation of a distributed system; if partition tolerance cannot be satisfied, using a distributed system is meaningless. Availability is also particularly important for business systems: with today's emphasis on user experience, frequent "system errors" and long response times greatly reduce users' goodwill toward the system, and in the fiercely competitive Internet industry, where competitors in the same field abound, intermittent unavailability immediately drives users to those competitors. Therefore, we can only gain availability and partition tolerance at the expense of consistency. This leads to the BASE theory introduced below.

BASE theory. CAP theory tells us a painful but unavoidable fact: we can only choose two of C, A, and P. For business systems we usually choose to sacrifice consistency in exchange for availability and partition tolerance. It should be pointed out, however, that "sacrificing consistency" does not mean giving up data consistency entirely; it means trading strong consistency for weak consistency. Next, let's introduce the BASE theory.

BA (Basically Available)

Even under force majeure, the whole system can still guarantee "availability", that is, it can still return a definite result within a bounded time. The difference between "basic availability" and "high availability" is:

"certain time" can be appropriately extended when a promotion is held, the response time can be appropriately extended to some users to return a degraded page to some users to directly return a degraded page, thus relieving the pressure on the server. Note, however, that returning to the downgrade page still returns a clear result.

S (Soft State): different copies of the same data do not need to be consistent in real time. E (Eventual Consistency): different copies of the same data must eventually become consistent; they need not be consistent in real time, but consistency must be guaranteed after some period of time.

Balancing ACID and BASE. ACID guarantees strong transactional consistency, that is, data is consistent in real time. This is fine for local transactions, but in distributed transactions strong consistency severely hurts the performance of the distributed system, so distributed systems can follow the BASE theory instead. Different business scenarios, however, have different consistency requirements: a trading scenario requires strong consistency and should follow ACID, whereas a scenario such as sending an SMS verification code after a successful registration does not need real-time consistency and can follow BASE. A balance between ACID and BASE must therefore be found for each concrete business scenario.

Distributed transaction protocols. Several protocols for implementing distributed transactions are introduced below.

Two-phase commit protocol (2PC). One of the difficulties of distributed systems is how to guarantee the consistency of transactional operations across multiple nodes. The two-phase commit algorithm is based on the following assumptions:

In the distributed system, one node acts as the coordinator and the other nodes act as the cohorts (participants), and the nodes can communicate with one another over the network. All nodes use write-ahead logging; once written, logs are kept on reliable storage, so that node damage does not cause log data to be lost. No node is permanently damaged, and every node can recover after a failure.

The first phase (voting phase):

1. The coordinator asks all participants whether they can perform the commit operation (a vote) and starts waiting for each participant's response.
2. Each participant executes all transaction operations up to the point of the query and writes Undo and Redo information to its log. (Note: if this succeeds, the participant has actually performed the transaction's work.)
3. Each participant responds to the coordinator's query: if its transaction operations actually succeeded, it returns an "agree" message; if they actually failed, it returns an "abort" message.

The second phase (commit phase). If the coordinator receives an "agree" message from every participant:

1. The coordinator sends a "commit" request to all participants.
2. Each participant formally completes its operations and releases the resources held for the whole transaction.
3. Each participant sends a "done" message to the coordinator.
4. After receiving a "done" message from all participants, the coordinator completes the transaction.

If any participant returned "abort" in the first phase, or the coordinator did not receive responses from all participants before the voting timeout:

1. The coordinator sends a rollback request to all participants.
2. Each participant performs a rollback using the previously written Undo information and releases the resources held for the whole transaction.
3. Each participant sends a "rollback complete" message to the coordinator.
4. After receiving a "rollback complete" message from all participants, the coordinator cancels the transaction.

Regardless of the outcome, the second phase ends the current transaction. Two-phase commit does seem to provide atomic operations, but unfortunately it has several disadvantages:

1. During execution, all participating nodes block for the transaction. While a participant holds shared resources, other third parties that need those resources are blocked.
2. Participant failure: the coordinator needs an additional timeout mechanism for each participant, after which the whole transaction fails. (There is little fault tolerance.)
3. Coordinator failure: the participants block indefinitely, and an additional backup machine is required for fault tolerance. (This can later rely on the Paxos protocol to implement HA.)
4. A problem the second phase cannot solve: the coordinator goes down after sending the commit message, and the only participant that received it also goes down. Even if a new coordinator is produced through an election protocol, the state of the transaction is uncertain; nobody knows whether it has been committed.
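Before moving on to 3PC, the happy-path control flow of the two phases can be sketched as a toy in-memory coordinator (the interfaces and names are hypothetical; a real implementation also needs write-ahead logging, timeouts, and recovery):

```java
import java.util.List;

// Deliberately simplified 2PC coordinator: no log, no timeouts, no recovery --
// just the control flow of the voting phase and the commit/rollback phase.
public class TwoPhaseCommitCoordinator {

    // Hypothetical participant interface (e.g. a database-like resource).
    public interface Participant {
        boolean prepare(String txId);   // phase 1: do the work, write undo/redo, vote
        void commit(String txId);       // phase 2a: make the work permanent
        void rollback(String txId);     // phase 2b: undo the work
    }

    public boolean execute(String txId, List<Participant> participants) {
        // Phase 1: collect votes
        boolean allAgreed = true;
        for (Participant p : participants) {
            if (!p.prepare(txId)) { allAgreed = false; break; }
        }
        // Phase 2: commit only if every participant voted "agree", otherwise roll back
        if (allAgreed) {
            for (Participant p : participants) p.commit(txId);
        } else {
            for (Participant p : participants) p.rollback(txId);
        }
        return allAgreed;
    }
}
```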

To address the last problem above, Dale Skeen and Michael Stonebraker proposed the three-phase commit protocol (3PC) in "A Formal Model of Crash Recovery in a Distributed System".

Three-phase commit protocol (3PC). 3PC differs from two-phase commit in two ways:

1. A timeout mechanism is introduced, on both the coordinator and the participants.
2. A preparation phase is inserted between the first and second phases, ensuring that the states of the participating nodes are consistent before the final commit phase.

In other words, besides introducing timeouts, 3PC splits the preparation phase of 2PC in two, so that three-phase commit has three phases: CanCommit, PreCommit, and DoCommit.

CanCommit phase. The CanCommit phase of 3PC is very similar to the preparation phase of 2PC: the coordinator sends a commit request to the participants, and each participant returns a Yes response if it can commit, otherwise No.

1. Transaction inquiry: the coordinator sends a CanCommit request to the participants, asking whether the transaction commit operation can be performed, and then waits for their responses.
2. Response feedback: after receiving the CanCommit request, a participant that believes it can execute the transaction smoothly returns a Yes response and enters the prepared state; otherwise it returns No.

PreCommit phase. The coordinator decides whether the transaction's PreCommit operation can proceed based on the participants' reactions. There are two possibilities. If the coordinator receives a Yes response from all participants, the transaction is pre-executed:

1. The coordinator sends a PreCommit request to the participants and enters the Prepared phase.

2. Transaction pre-commit: after receiving the PreCommit request, each participant performs the transaction operations and records undo and redo information in its transaction log.

3. Response feedback: if a participant executes the transaction operations successfully, it returns an ACK response and starts waiting for the final instruction.

If any participant sends a No response to the coordinator, or the coordinator receives no response from some participant before the timeout, the transaction is interrupted:

1. Send interrupt request: the coordinator sends an abort request to all participants.

2. Interrupt transaction: after receiving the abort request from the coordinator (or after timing out without receiving any request from the coordinator), the participant interrupts the transaction.

DoCommit phase. This phase performs the real transaction commit and can be divided into the following two cases. 3.1 Execute commit:

1. Send commit request: when the coordinator receives the ACK responses sent by the participants, it moves from the pre-committed state to the committed state and sends a DoCommit request to all participants.
2. Transaction commit: after receiving the DoCommit request, each participant performs the formal transaction commit and releases all transaction resources when the commit completes.
3. Response feedback: after committing the transaction, each participant sends an ACK response to the coordinator.
4. Complete transaction: after receiving the ACK responses from all participants, the coordinator completes the transaction.

3.2 Interrupt transaction: if the coordinator does not receive ACK responses from the participants (either a participant sent a response other than ACK, or the response timed out), the transaction is interrupted:

1. Send interrupt request: the coordinator sends an abort request to all participants.

2. Transaction rollback: after receiving the abort request, each participant uses the undo information recorded in phase 2 to roll back the transaction, and releases all transaction resources when the rollback completes.

3. Feedback result: after completing the rollback, the participant sends an ACK message to the coordinator.

4. Interrupt transaction: after receiving the ACK messages from the participants, the coordinator interrupts the transaction.
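Continuing the toy sketch from the 2PC section, a participant-side sketch of 3PC (again hypothetical, ignoring logging and recovery) illustrates how the extra phase and the timeouts change behaviour:

```java
// Toy 3PC participant: the key difference from 2PC is that a participant which
// has pre-committed and then times out waiting for DoCommit goes ahead and
// commits instead of blocking forever.
public class ThreePhaseCommitParticipant {

    private enum State { INIT, READY, PRE_COMMITTED, COMMITTED, ABORTED }

    private volatile State state = State.INIT;

    public boolean onCanCommit() {
        state = State.READY;          // phase 1: only answer the question, do no work yet
        return true;                  // "Yes, I would be able to commit"
    }

    public void onPreCommit() {
        // phase 2: execute the transaction operations and write undo/redo logs (omitted)
        state = State.PRE_COMMITTED;
    }

    public void onDoCommitOrTimeout(boolean doCommitReceived) {
        // phase 3: commit on DoCommit; also commit on timeout if already pre-committed
        if (doCommitReceived || state == State.PRE_COMMITTED) {
            state = State.COMMITTED;
        } else {
            state = State.ABORTED;    // timed out before PreCommit: abort
        }
    }

    public String currentState() {
        return state.name();
    }
}
```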

Solutions for distributed transactions. There are several solutions for distributed transactions:

1. Global transactions (DTP model)
2. Distributed transactions based on a reliable message service
3. Best effort notification
4. TCC

Scheme 1: global transactions (DTP model). Global transactions are implemented based on the DTP model. DTP, the X/Open Distributed Transaction Processing Reference Model, is a distributed transaction model proposed by the X/Open organization. It specifies that three roles are required to implement distributed transactions:

AP (Application): the business system we develop. During development, we use the transaction interfaces provided by the transaction manager to implement distributed transactions.

TM (Transaction Manager):

The transaction manager completes the implementation of the distributed transaction. It provides distributed-transaction operation interfaces, called the TX interfaces, for our business system to call. The transaction manager also manages all resource managers, scheduling them together through the XA interfaces they provide in order to implement the distributed transaction. DTP is only a specification for implementing distributed transactions and does not define concretely how to do so; the TM may use 2PC, 3PC, Paxos, or other protocols.

RM (Resource Manager):

Any object that provides data services can act as a resource manager, for example a database, message middleware, or a cache. In most scenarios the database is the resource manager in a distributed transaction. A resource manager provides the transaction capability of a single database; through the XA interface it exposes its commit and rollback capabilities to the transaction manager, helping the transaction manager implement distributed transaction management. XA is the interface defined by the DTP model through which the resource manager (the database) provides commit, rollback, and related capabilities to the transaction manager. DTP is only a specification; the concrete implementation of the RM is done by the database vendors.
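A sketch of the AP role in the DTP/XA model, assuming a Java EE / JTA container where the TM is exposed at the standard JNDI name java:comp/UserTransaction and two XA-capable data sources (their JNDI names and the SQL are hypothetical) are registered:

```java
import javax.naming.InitialContext;
import javax.sql.DataSource;
import javax.transaction.UserTransaction;
import java.sql.Connection;
import java.sql.PreparedStatement;

public class XaTransferDemo {

    public void markPaidAndAddPoints(long orderId, long userId) throws Exception {
        InitialContext ctx = new InitialContext();
        UserTransaction tx = (UserTransaction) ctx.lookup("java:comp/UserTransaction"); // TX interface of the TM
        DataSource orderDs  = (DataSource) ctx.lookup("jdbc/orderXADataSource");        // RM 1
        DataSource pointsDs = (DataSource) ctx.lookup("jdbc/pointsXADataSource");       // RM 2

        tx.begin();                                   // TM starts a global transaction
        try (Connection orderConn = orderDs.getConnection();
             Connection pointsConn = pointsDs.getConnection()) {
            try (PreparedStatement ps = orderConn.prepareStatement(
                     "UPDATE orders SET status = 'PAID' WHERE id = ?")) {
                ps.setLong(1, orderId);
                ps.executeUpdate();
            }
            try (PreparedStatement ps = pointsConn.prepareStatement(
                     "UPDATE points SET balance = balance + 10 WHERE user_id = ?")) {
                ps.setLong(1, userId);
                ps.executeUpdate();
            }
            tx.commit();                              // TM drives 2PC over both RMs via XA
        } catch (Exception e) {
            tx.rollback();                            // both RMs roll back together
            throw e;
        }
    }
}
```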

Is there distributed transaction middleware based on the DTP model?

What are the advantages and disadvantages of the DTP model?

Scheme 2: distributed transactions based on a reliable message service. This kind of distributed transaction is implemented with message middleware. Suppose there are two systems, A and B, which handle task A and task B respectively. A business process in system A needs to process task A and task B in the same transaction. Below we introduce how this kind of distributed transaction is implemented on top of message middleware.

1. Before system A processes task A, it first sends a message to the message middleware, which persists the message after receiving it but does not yet deliver it. At this point downstream system B does not know the message exists.
2. After persisting the message successfully, the middleware returns an acknowledgement to system A; system A can then start processing task A.
3. When task A completes, system A sends a Commit request to the middleware. Once this request is sent, the transaction is finished from system A's point of view and it can move on to other work. However, the Commit message may be lost in transit, in which case the middleware never delivers the message to system B and the systems become inconsistent; this is handled by the middleware's transaction-check mechanism, described below.
4. After receiving the Commit instruction, the middleware delivers the message to system B, triggering the execution of task B. When task B completes, system B returns an acknowledgement to the middleware, telling it the message has been consumed successfully; at this point the distributed transaction is complete.

From the above process we can draw the following conclusions:

Message middleware plays the role of the distributed transaction coordinator. There is a time lag between system A completing task A and system B completing task B; during that window the whole system is in an inconsistent state. This temporary inconsistency is acceptable, because after a short period the system becomes consistent again, which satisfies the BASE theory.

In the above process, if task A fails, a rollback flow is needed:

If system A fails to process task A, it sends a Rollback request to the message middleware. As with sending a Commit request, system A can consider the rollback complete once the request is sent and go on with other work. After receiving the Rollback request, the message middleware simply discards the message and does not deliver it to system B, so task B in system B is never triggered.

At this point the system is in a consistent state, because neither task A nor task B was executed.

The Commit and Rollback flows described above are the ideal case; in a real system both the Commit and the Rollback instruction may be lost in transit. When this happens, how does the message middleware ensure data consistency? The answer is the timeout query mechanism.

Besides implementing its normal business logic, system A must also provide a transaction-check interface for the message middleware to call. When the middleware receives a transactional message, it starts a timer; if it receives no Commit or Rollback instruction from system A before the timeout, it actively calls the check interface provided by system A to ask about the current state of the transaction. The interface returns one of three results:

1. Committed: the middleware delivers the message to system B.
2. Rolled back: the middleware discards the message directly.
3. In progress: the middleware continues to wait.
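RocketMQ's transactional messages implement exactly this pattern (half message, then Commit/Rollback, plus a check callback). A minimal producer-side sketch, assuming RocketMQ 4.x and hypothetical topic, group, and business names:

```java
import org.apache.rocketmq.client.producer.LocalTransactionState;
import org.apache.rocketmq.client.producer.TransactionListener;
import org.apache.rocketmq.client.producer.TransactionMQProducer;
import org.apache.rocketmq.common.message.Message;
import org.apache.rocketmq.common.message.MessageExt;

public class TaskAProducer {

    public static void main(String[] args) throws Exception {
        TransactionMQProducer producer = new TransactionMQProducer("system-a-tx-group");
        producer.setNamesrvAddr("localhost:9876");
        producer.setTransactionListener(new TransactionListener() {
            @Override
            public LocalTransactionState executeLocalTransaction(Message msg, Object arg) {
                try {
                    // process task A here (hypothetical business call)
                    return LocalTransactionState.COMMIT_MESSAGE;   // Commit: deliver to system B
                } catch (Exception e) {
                    return LocalTransactionState.ROLLBACK_MESSAGE; // Rollback: discard the message
                }
            }
            @Override
            public LocalTransactionState checkLocalTransaction(MessageExt msg) {
                // timeout query: the broker calls back here when Commit/Rollback was lost;
                // look up the real outcome of task A (hypothetical) and answer accordingly
                return LocalTransactionState.UNKNOW;               // still in progress: keep waiting
            }
        });
        producer.start();
        Message msg = new Message("TASK_B_TOPIC", "trigger task B".getBytes());
        producer.sendMessageInTransaction(msg, null);              // sends the half message first
    }
}
```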

The middleware's timeout query mechanism prevents the inconsistency caused by losing a Commit/Rollback instruction from the upstream system in transit, and it also shortens the time the upstream system is blocked: once the upstream system has issued the Commit/Rollback instruction it can handle other tasks without waiting for an acknowledgement, and a lost instruction is compensated for by the timeout query. This greatly reduces the upstream system's blocking time and improves its concurrency.

Next, the reliability of the message delivery process. Once the upstream system has finished its task and submitted the Commit instruction to the message middleware, it can handle other work and may assume the transaction is complete; from then on, the message middleware guarantees that the message is successfully consumed by the downstream system. How is this achieved? It is guaranteed by the middleware's delivery process: after delivering the message to the downstream system, the middleware blocks and waits; the downstream system processes the task and returns an acknowledgement when it is done, and only then does the middleware consider the transaction finished. If the message is lost during delivery, or the acknowledgement is lost on the way back, the middleware re-delivers the message after an acknowledgement timeout, until the downstream consumer returns a success response. Of course, message middleware generally allows configuring the retry count and interval, for example retrying every five minutes after the first failed delivery, for a total of three retries; if delivery still fails after three retries, the message requires human intervention.

Some readers may ask: why not roll back the message when delivery fails, instead of retrying delivery again and again?

This comes down to the implementation cost of the whole distributed transaction design. Once system A has sent the Commit instruction to the message middleware, it goes off to do other things. If a failed delivery had to trigger a rollback, system A would have to provide a rollback interface in advance, which adds development cost and makes the business system more complex. The design goal of a business system is to keep complexity as low as possible while meeting performance requirements, thereby reducing operation and maintenance costs.

You may have noticed that upstream system A submits Commit/Rollback messages to the message middleware asynchronously: after submitting the message it goes on with other work, leaving commit and rollback entirely to the middleware and fully trusting it to complete the commit or rollback correctly. The middleware's delivery to the downstream system, however, is synchronous: after delivering the message it blocks and waits, and only stops waiting after the downstream system has processed the task successfully and returned an acknowledgement. Why the inconsistency in design?

First, asynchronous communication between the upstream system and the message middleware improves concurrency. The business system faces users directly, and user experience matters, so the asynchronous mode greatly reduces user waiting time; compared with synchronous communication, there is no long blocking wait, so the concurrency of the system increases substantially. The price is that the Commit/Rollback instruction may get lost, which the middleware's timeout query mechanism compensates for.

Why, then, use synchronous communication between the middleware and the downstream system? Asynchrony improves performance but increases complexity, whereas synchrony reduces concurrency but costs less to build. When the concurrency requirement is not very high, or server resources are plentiful, synchrony is the simpler choice. Message middleware is third-party infrastructure independent of the business systems: it is not directly coupled to any of them, has no direct contact with users, is usually deployed on its own server cluster, and scales well, so its performance is less of a concern; if processing speed falls short we can add machines. Even some delay in the middleware is acceptable, because the BASE theory introduced earlier tells us we are pursuing eventual consistency rather than real-time consistency, so the temporary inconsistency caused by that delay is acceptable.

Scheme 3: best effort notification (periodic reconciliation). Best effort notification, also known as periodic reconciliation, is essentially already contained in scheme 2 and is introduced separately here mainly for the completeness of the knowledge system. This scheme also requires message middleware, and the process is as follows:

1. After completing its task, the upstream system synchronously sends a message to the message middleware, ensuring the message is persisted successfully, and can then go on with other work.
2. After receiving the message, the middleware synchronously delivers it to the corresponding downstream system, triggering the downstream task.
3. When the downstream system finishes processing, it returns an acknowledgement to the middleware, which can then delete the message; the transaction is now complete.

The above is the idealized process, but in real scenarios the following failures often occur:

1. The message middleware fails to deliver the message to the downstream system.
2. The upstream system fails to send the message to the message middleware.

For the first case, the message middleware has a retry mechanism: we can configure the retry count and interval. Delivery failures caused by network instability can often be overcome after a few retries. If delivery still fails after the retry limit, the middleware stops delivering the message and records it in a failed-message table. The middleware must provide a query interface for failed messages, and downstream systems periodically query and consume them; this is the "periodic reconciliation". If redelivery plus periodic reconciliation still cannot solve the problem, it usually means the downstream system has a serious error and human intervention is required.

For the second case, a message retransmission mechanism must be built into the upstream system: maintain a local message table in the upstream system and, within one local transaction, complete both the business task and the insertion of the message into the local message table. If inserting the message fails, the transaction rolls back and the business result is undone; if every step succeeds, the local transaction completes. A dedicated message sender then keeps sending the messages in the local message table, retrying on failure. Of course, the sender also needs a retry limit; generally, reaching the retry limit and still failing means the message middleware itself has a serious problem that only human intervention can resolve.

For message middleware that does not support transactional messages, this approach can be used to implement distributed transactions. It achieves the effect through retries plus periodic reconciliation, but compared with scheme 2 it takes longer to reach consistency, and the upstream system has to implement the message-retransmission mechanism itself to ensure the message is published to the middleware, which increases development cost and makes the business system less pure; the extra logic also consumes the business system's hardware resources and affects performance. Therefore, whenever possible, choose message middleware that supports transactional messages, such as RocketMQ.
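A minimal sketch of the local message table idea (table, column, and class names are hypothetical; Spring's @Transactional and @Scheduled, with scheduling enabled, are used for brevity):

```java
import org.springframework.jdbc.core.JdbcTemplate;
import org.springframework.scheduling.annotation.Scheduled;
import org.springframework.stereotype.Service;
import org.springframework.transaction.annotation.Transactional;

@Service
public class TaskAWithLocalMessageTable {

    private final JdbcTemplate jdbc;
    private final MessageClient messageClient;   // hypothetical client for the message middleware

    public TaskAWithLocalMessageTable(JdbcTemplate jdbc, MessageClient messageClient) {
        this.jdbc = jdbc;
        this.messageClient = messageClient;
    }

    // The business work and the "message to be sent" are committed in ONE local
    // transaction, so either both are recorded or neither is.
    @Transactional
    public void handleTaskA(long bizId) {
        jdbc.update("UPDATE task_a SET status = 'DONE' WHERE id = ?", bizId);
        jdbc.update("INSERT INTO local_message (biz_id, status) VALUES (?, 'PENDING')", bizId);
    }

    // A dedicated sender keeps retrying PENDING messages until the middleware accepts them.
    @Scheduled(fixedDelay = 5000)
    public void resendPendingMessages() {
        jdbc.queryForList("SELECT biz_id FROM local_message WHERE status = 'PENDING'", Long.class)
            .forEach(bizId -> {
                if (messageClient.send(bizId)) {   // hypothetical send; true on success
                    jdbc.update("UPDATE local_message SET status = 'SENT' WHERE biz_id = ?", bizId);
                }
            });
    }

    public interface MessageClient { boolean send(long bizId); }
}
```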

Scheme 4: TCC (two-phase, compensating). TCC stands for Try, Confirm, Cancel and belongs to the compensating style of distributed transaction. As the name implies, TCC implements a distributed transaction in three steps:

Try: attempt the business to be executed.

This step does not execute the business itself; it only performs consistency checks on all the business involved and reserves all the resources needed for execution.

Confirm: execute the business.

This step actually executes the business. Since the consistency checks were completed in the Try phase, it executes directly without further checks, and during execution it uses the business resources reserved in the Try phase.

Cancel: cancel the business execution.

If the business fails, it enters the Cancel phase, which releases all occupied business resources and rolls back the operations performed in the Confirm phase.

The following uses a money transfer as an example to explain how TCC implements a distributed transaction.

Suppose user A uses his account balance to send user B a 100-yuan red packet, and the balance system and the red packet system are two independent systems.

Try

Create a transfer record and set its status to "in transaction". Deduct 100 yuan from user A's account (reserving the business resource). If the Try step succeeds, enter the Confirm phase; if any exception occurs during Try, enter the Cancel phase.

Confirm

Add 100 yuan to user B's red packet account and set the transfer record's status to "transaction completed". If any exception occurs during Confirm, enter the Cancel phase; if Confirm executes successfully, the transaction ends.

Cancel

Add the 100 yuan back to user A's account and set the transfer record's status to "transaction failed".
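A minimal sketch of what the red packet TCC service might look like (the interface and method names are hypothetical; real TCC frameworks such as Seata define their own annotations, and idempotency and anti-suspension handling is omitted):

```java
// Hypothetical TCC interface for the transfer example; a real framework would
// add annotations, idempotency checks, and protection against "empty rollback"
// and "suspension".
public interface RedPacketTransferTccAction {

    // Try: check and reserve resources -- create the transfer record in state
    // "in transaction" and deduct 100 yuan from user A's balance.
    boolean tryTransfer(String txId, long fromUserId, long toUserId, long amountFen);

    // Confirm: use the reserved resources -- add 100 yuan to user B's red packet
    // account and mark the transfer record "transaction completed".
    boolean confirm(String txId);

    // Cancel: release the reservation -- add the 100 yuan back to user A's
    // account and mark the transfer record "transaction failed".
    boolean cancel(String txId);
}
```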

In the traditional transaction mechanism, business logic execution and transaction handling are done by different components at different stages: the business-logic part accesses resources to store data and is the responsibility of the business system, while the transaction-handling part manages transactions by coordinating resource managers and is the responsibility of the transaction manager. The two interact little, so a traditional transaction manager's logic only needs to care about the transaction completion phase (commit/rollback), not the business execution phase.

A TCC global transaction must be built on RM local transactions. A TCC service is composed of its Try/Confirm/Cancel operations, and when these operations execute they access the resource manager (RM) to read and write data. Those accesses must take place inside RM local transactions so that the changed data is either committed or rolled back as a whole. This is not hard to understand; consider the following scenario:

Assume a TCC service B whose operations are not based on RM local transactions (for example, with an RDBMS this can be simulated by setting auto-commit to true). If the [B:Try] operation fails partway through and the TCC transaction framework subsequently decides to roll back the global transaction, then [B:Cancel] must determine which of [B:Try]'s operations were written to the DB and which were not. Suppose the [B:Try] business performs five write operations; [B:Cancel] must check one by one whether each of the five took effect and reverse the ones that did. Unfortunately, because the [B:Cancel] business also has n (0)
