What are the classic solutions for MySQL and Golang distributed transactions


2025-07-01 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)06/03 Report--

This article introduces the classic solutions for distributed transactions with MySQL and Golang. Many people run into these problems in real-world cases, so let's walk through how to deal with them.

1. Basic theory

Before explaining the specific scheme, let's take a look at the basic theoretical knowledge involved in distributed transactions.

Take a money transfer as an example. If A transfers 100 yuan to B, A's balance must decrease by 100 yuan and B's balance must increase by 100 yuan. The transfer as a whole must ensure that A-100 and B+100 either both succeed or both fail. Let's see how this problem is solved in various scenarios.

1.1 transaction

The ability to execute multiple statements as a single unit is called a database transaction. A database transaction ensures that all operations within its scope either all succeed or all fail.

Transactions have four properties: atomicity, consistency, isolation, and durability. These four properties are commonly referred to as ACID.

Atomicity: all operations in a transaction either complete or do not complete; the transaction never stops partway through. If an error occurs during execution, the transaction is rolled back to the state before it began, as if it had never been executed.

Consistency: the integrity of the database is not violated before the transaction starts or after it ends. Integrity constraints, including foreign key constraints and application-defined constraints, are not broken.

Isolation: the ability of a database to let multiple concurrent transactions read, write, and modify its data at the same time. Isolation prevents the data inconsistencies that interleaved execution of concurrent transactions would otherwise cause.

Durability: once the transaction completes, its changes to the data are permanent and are not lost even if the system fails.

If our business system is not complex and the data changes for the transfer can be completed within a single database or a single service, we can rely on a database transaction to ensure the transfer completes correctly.
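To make the all-or-nothing behaviour concrete, here is a minimal sketch in Go. It does not talk to a real database: the `Ledger` type and its `Transfer` method are hypothetical names invented for this illustration, and a copied map stands in for the begin/commit/rollback of a real transaction.

```go
package main

import (
	"errors"
	"fmt"
)

// Ledger is a toy in-memory stand-in for a table of account balances.
type Ledger struct {
	balances map[string]int
}

// Transfer moves amount from one account to another atomically.
// It works on a private copy ("BEGIN") and publishes the copy only if
// every step succeeded ("COMMIT"); any early return discards it ("ROLLBACK").
func (l *Ledger) Transfer(from, to string, amount int) error {
	tx := make(map[string]int, len(l.balances))
	for k, v := range l.balances {
		tx[k] = v
	}

	// Step 1: deduct from the sender.
	if tx[from] < amount {
		return errors.New("insufficient balance")
	}
	tx[from] -= amount

	// Step 2: credit the receiver. A failure here must also undo step 1.
	if _, ok := tx[to]; !ok {
		return fmt.Errorf("unknown account %q", to)
	}
	tx[to] += amount

	l.balances = tx // commit: both changes become visible together
	return nil
}

func main() {
	l := &Ledger{balances: map[string]int{"A": 100, "B": 0}}
	fmt.Println(l.Transfer("A", "B", 100), l.balances) // both changes applied
	fmt.Println(l.Transfer("B", "C", 50), l.balances)  // fails at step 2, nothing changes
}
```

With a real database the same shape would use a transaction handle and an explicit commit or rollback; the point of the sketch is only that the deduction is never visible without the matching credit.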

1.2 distributed transactions

Inter-bank transfer is a typical distributed transaction scenario. If A transfers money to B at a different bank, the data of two banks is involved; the ACID properties of the transfer cannot be guaranteed by the local transaction of a single database and can only be achieved through a distributed transaction.

A distributed transaction is one in which the transaction initiator, the resources and resource managers, and the transaction coordinator are located on different nodes of a distributed system. In the transfer above, the A-100 operation and the B+100 operation are not on the same node. In essence, distributed transactions exist to ensure that data operations execute correctly in distributed scenarios.

In a distributed environment, to meet the needs of availability, performance, and degraded service, distributed transactions lower their consistency and isolation requirements. They follow the BASE theory (BASE covers a lot of ground; interested readers can look it up):

Basically Available

Soft state

Eventual consistency

Similarly, distributed transactions partially follow the ACID specification:

Atomicity: strictly followed

Consistency: consistency after the transaction completes is strictly followed; consistency during the transaction may be relaxed appropriately

Isolation: parallel transactions must not affect each other; the visibility of a transaction's intermediate results may be relaxed where it is safe to do so

Durability: strictly followed

2. The solution of distributed transaction

Because distributed transaction schemes cannot provide full ACID guarantees, no single scheme solves every business problem. In practice, the most suitable distributed transaction scheme is chosen according to the characteristics of the business.

2.1 two-phase commit / XA

XA is a distributed transaction specification proposed by X/Open. The XA specification mainly defines the interface between the (global) transaction manager (TM) and the (local) resource manager (RM). Local databases such as MySQL play the RM role in XA.

XA is divided into two phases:

The first phase (prepare): the RMs of all participants prepare to execute the transaction and lock the required resources. When a participant is ready, it reports to the TM that it is ready.

The second phase (commit/rollback): when the transaction manager (TM) confirms that all participants (RMs) are ready, it sends a commit command to all participants.

At present, most mainstream databases support XA transactions, including MySQL, Oracle, SQL Server, and PostgreSQL.

An XA transaction consists of one or more resource managers (RM), a transaction manager (TM), and an application program (AP).

The three roles RM, TM, and AP are the classic role division, and they carry through to the later transaction modes such as SAGA and TCC.

Taking the above transfer as an example, the sequence diagram of a successfully completed XA transaction is as follows:

If any of the participants' prepare fails, TM notifies all participants who have completed the prepare to roll back.
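The two phases and the rollback rule can be sketched as a small in-memory simulation in Go. The `Participant` interface, the `memRM` type, and `RunTwoPhase` are hypothetical names made up for this example. Against a real MySQL RM, the three calls would correspond to the `XA PREPARE`, `XA COMMIT`, and `XA ROLLBACK` statements.

```go
package main

import (
	"errors"
	"fmt"
)

// Participant is a hypothetical RM interface used only for this sketch.
type Participant interface {
	Prepare() error
	Commit()
	Rollback()
}

// memRM is an in-memory RM that records the last state it reached.
type memRM struct {
	name        string
	failPrepare bool
	state       string
}

func (r *memRM) Prepare() error {
	if r.failPrepare {
		r.state = "prepare-failed"
		return errors.New(r.name + ": prepare failed")
	}
	r.state = "prepared" // resources stay locked from here until phase two
	return nil
}

func (r *memRM) Commit()   { r.state = "committed" }
func (r *memRM) Rollback() { r.state = "rolled-back" }

// RunTwoPhase plays the TM role: it commits only if every RM prepared,
// otherwise it rolls back the RMs that had already prepared.
func RunTwoPhase(rms []Participant) error {
	var prepared []Participant
	for _, rm := range rms {
		if err := rm.Prepare(); err != nil {
			for _, p := range prepared {
				p.Rollback()
			}
			return err
		}
		prepared = append(prepared, rm)
	}
	for _, rm := range rms {
		rm.Commit()
	}
	return nil
}

func main() {
	a, b := &memRM{name: "A"}, &memRM{name: "B"}
	fmt.Println(RunTwoPhase([]Participant{a, b}), a.state, b.state)

	c, d := &memRM{name: "C"}, &memRM{name: "D", failPrepare: true}
	fmt.Println(RunTwoPhase([]Participant{c, d}), c.state, d.state)
}
```

The sketch leaves out what makes real XA hard, such as a TM crash between the two phases; it only shows the decision rule.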

The characteristics of XA transactions are:

Simple and easy to understand, easy to develop

The resources are locked for a long time, and the concurrency is low.

Readers who want to study XA further can refer to DTM, which is usable from Go as well as PHP, Python, Java, C#, Node, and other languages.

2.2 SAGA

Saga is a scheme described in the database paper SAGAS. Its core idea is to split a long transaction into several short local transactions, coordinated by a Saga transaction coordinator. If every step finishes normally, the transaction completes normally. If a step fails, the compensation operations are invoked in reverse order.

Taking the above transfer as an example, the sequence diagram of a successfully completed SAGA transaction is as follows:

Once a Saga reaches the Cancel (compensation) stage, Cancel is not allowed to fail in business logic. If it fails because of a network problem or another temporary fault, the TM retries until Cancel returns success.
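The coordination rule described above (run forward steps in order; on failure, compensate in reverse order and retry each compensation until it succeeds) can be sketched in Go as follows. `Step` and `RunSaga` are hypothetical names for this illustration, not DTM's API, and as a simplifying assumption the sketch compensates only the steps that completed.

```go
package main

import (
	"errors"
	"fmt"
)

var (
	errTemporary = errors.New("temporary network error")
	errBusiness  = errors.New("account is abnormal")
)

// Step pairs a forward action with the compensation that undoes it.
type Step struct {
	Name       string
	Action     func() error
	Compensate func() error
}

// RunSaga executes the steps in order. When a step fails, the steps that
// already completed are compensated in reverse order.
func RunSaga(steps []Step) error {
	for i, s := range steps {
		err := s.Action()
		if err == nil {
			continue
		}
		for j := i - 1; j >= 0; j-- {
			for steps[j].Compensate() != nil {
				// Retry until success; a real coordinator would wait between attempts.
			}
		}
		return fmt.Errorf("saga failed at %s: %w", s.Name, err)
	}
	return nil
}

func main() {
	balance := map[string]int{"A": 100, "B": 0}
	flaky := true // make the first compensation attempt fail once
	steps := []Step{
		{
			Name:   "TransOut",
			Action: func() error { balance["A"] -= 30; return nil },
			Compensate: func() error {
				if flaky {
					flaky = false
					return errTemporary
				}
				balance["A"] += 30
				return nil
			},
		},
		{
			Name:       "TransIn",
			Action:     func() error { return errBusiness },
			Compensate: func() error { return nil },
		},
	}
	fmt.Println(RunSaga(steps), balance)
}
```

Between the deduction and its compensation, A's balance is visibly reduced, which is the weak-consistency window a Saga accepts.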

Characteristics of Saga transactions:

High concurrency; no need to lock resources for a long time as in XA transactions

Both the normal operation and the compensation operation must be defined, so the development effort is larger than with XA

Consistency is weak; in the transfer example, it can happen that user A's money has been deducted and the transfer ultimately fails

The paper covers much more about SAGA, including two recovery strategies and concurrent execution of branch transactions. Our discussion here covers only the simplest SAGA.

SAGA fits a wide range of scenarios, including long transactions and business scenarios that are not sensitive to intermediate results.

Readers who want to study SAGA further can refer to DTM, which includes examples of SAGA success and failure rollback, as well as the handling of various network exceptions.

2.3 TCC

The concept of TCC (Try-Confirm-Cancel) was first proposed by Pat Helland in the 2007 paper "Life beyond Distributed Transactions: an Apostate's Opinion".

TCC is divided into three phases:

Try phase: attempt execution; complete all business checks (consistency) and reserve the necessary business resources (quasi-isolation)

Confirm phase: actually execute the business. No business checks are performed, and only the business resources reserved in the Try phase are used. The Confirm operation must be idempotent. If Confirm fails, it must be retried.

Cancel phase: cancel execution and release the business resources reserved in the Try phase. Exception handling in the Cancel phase is basically the same as in the Confirm phase, and Cancel must also be idempotent.

Taking the transfer above as an example, the amount is usually frozen but not deducted in Try, deducted in Confirm, and unfrozen in Cancel.

The sequence diagram of a successfully completed TCC transaction is as follows:

The Confirm/Cancel phase of TCC is not allowed to return failure in business logic. If success cannot be returned because of a network problem or another temporary fault, the TM retries until Confirm/Cancel returns success.
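The freeze, deduct, and unfreeze steps of the transfer example can be sketched on a single account in Go. The `Account` type and its method names are assumptions made for this illustration. Idempotency, empty rollback, and suspension handling are deliberately left out here, since section 3 covers them.

```go
package main

import (
	"errors"
	"fmt"
)

var ErrInsufficient = errors.New("insufficient available balance")

// Account keeps the real balance and the part of it frozen by pending Try calls.
type Account struct {
	Balance int
	Frozen  int
}

// Try checks the business rule and reserves the amount without deducting it.
func (a *Account) Try(amount int) error {
	if a.Balance-a.Frozen < amount {
		return ErrInsufficient
	}
	a.Frozen += amount
	return nil
}

// Confirm performs the real deduction using only what Try reserved.
func (a *Account) Confirm(amount int) {
	a.Frozen -= amount
	a.Balance -= amount
}

// Cancel releases the reservation; the balance itself was never touched.
func (a *Account) Cancel(amount int) {
	a.Frozen -= amount
}

func main() {
	acct := &Account{Balance: 100}

	fmt.Println(acct.Try(30), *acct) // 30 frozen, balance unchanged
	fmt.Println(acct.Try(80), *acct) // rejected: only 70 is still available
	acct.Confirm(30)
	fmt.Println(*acct) // deduction becomes real

	fmt.Println(acct.Try(20), *acct)
	acct.Cancel(20)
	fmt.Println(*acct) // reservation released
}
```

Because other transfers see the frozen amount as unavailable, the intermediate state is constrained, unlike in the Saga case where the deduction itself is the intermediate state.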

TCC has the following characteristics:

Concurrency is high, and resources are not locked for a long time.

The development effort is large, because Try/Confirm/Cancel interfaces must all be provided.

Consistency is good: the SAGA situation, where money has been deducted and the transfer ultimately fails, does not occur.

TCC is suitable for order-type businesses and businesses with constraints on intermediate states.

Readers who want to study TCC further can refer to DTM.

2.4 Local message table

The local message table scheme was originally published in ACM by eBay architect Dan Pritchett in 2008. The core of the design is to use messages to asynchronously guarantee the execution of tasks that require distributed processing.

The general process is as follows:

Writing the local message and performing the business operation are placed in one transaction, which guarantees that the business operation and the sending of the message are atomic: either both succeed or both fail.

Fault tolerance mechanism:

When the balance-deduction transaction fails, the transaction rolls back directly and no further steps occur.

Failures in producing the message to the queue and failures in the balance-increase transaction are both retried.

Characteristics of the local message table:

Long transactions only need to be split into multiple tasks, so it is easy to use

Producers need to create additional message tables

Each local message table needs to be polled

If the consumer's logic does not succeed through the retry, then more mechanisms are needed to roll back the operation.

Suitable for services that can be executed asynchronously and subsequent operations do not need to be rolled back
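The producer side of the scheme (write the business change and the message together, then poll and retry delivery) can be sketched in Go with in-memory data. `ProducerDB`, `DeductWithMessage`, and `PollOnce` are hypothetical names for this example. In a real system both writes would sit inside one database transaction and the poller would run on a timer.

```go
package main

import (
	"errors"
	"fmt"
)

var ErrBrokerDown = errors.New("message queue unavailable")

// Message is one row of the local message table.
type Message struct {
	ID      int
	Payload string
	Sent    bool
}

// ProducerDB stands in for the producer's local database: the business
// data (Balance) and the local message table (Messages) live together.
type ProducerDB struct {
	Balance  int
	Messages []Message
}

// DeductWithMessage performs the deduction and records the message as one
// unit: on failure neither is written.
func (db *ProducerDB) DeductWithMessage(amount int, payload string) error {
	if db.Balance < amount {
		return errors.New("insufficient balance") // rolled back: no deduction, no message
	}
	db.Balance -= amount
	db.Messages = append(db.Messages, Message{ID: len(db.Messages) + 1, Payload: payload})
	return nil
}

// PollOnce tries to publish every unsent message and returns how many were sent.
func (db *ProducerDB) PollOnce(publish func(Message) error) int {
	sent := 0
	for i := range db.Messages {
		m := &db.Messages[i]
		if m.Sent {
			continue
		}
		if err := publish(*m); err != nil {
			continue // stays pending and is retried on the next poll
		}
		m.Sent = true
		sent++
	}
	return sent
}

func main() {
	db := &ProducerDB{Balance: 100}
	fmt.Println(db.DeductWithMessage(30, "credit B 30"), db.Balance)

	down := true
	publish := func(m Message) error {
		if down {
			return ErrBrokerDown
		}
		fmt.Println("published:", m.Payload)
		return nil
	}
	fmt.Println("sent:", db.PollOnce(publish)) // queue is down, message stays pending
	down = false
	fmt.Println("sent:", db.PollOnce(publish)) // delivered on retry
}
```

Since a message may be published more than once if the "sent" mark is lost, the consumer's balance-increase logic still needs to be idempotent.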

2.5 transaction messages

In the local message table scheme above, the producer needs to create an additional message table and poll it, which is a heavy burden on the business. Alibaba's open-source RocketMQ officially supports transaction messages from version 4.3 onward. In essence, this moves the local message table into RocketMQ to solve the atomicity of message sending and local transaction execution on the producer side.

Transaction message sending and submission:

Send the message (half message)

The server stores the message and responds with the write result

Execute the local transaction according to the send result (if the write failed, the half message is not visible to the business and the local logic is not executed)

Execute Commit or Rollback according to the local transaction status (Commit publishes the message, making it visible to consumers)

The flow chart of normal transmission is as follows:

Compensation process:

For transaction messages that have received neither Commit nor Rollback (messages in pending status), the server initiates a "back-check"

The producer receives the back-check message and returns the status of the local transaction corresponding to the message, which is Commit or Rollback.

The transaction message scheme is very similar to the local message table mechanism. The main difference is that the operations on the local table are replaced by a back-check interface.
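The half-message and back-check flow can be sketched with a toy in-memory broker in Go. This is not RocketMQ's API: `Broker`, `SendHalf`, `End`, `Visible`, and `BackCheck` are hypothetical names chosen for the illustration.

```go
package main

import "fmt"

// Status is the state of a transaction message on the server.
type Status int

const (
	Pending Status = iota
	Committed
	RolledBack
)

// Broker is a hypothetical in-memory message server; it is not RocketMQ's API.
type Broker struct {
	msgs map[string]Status
}

func NewBroker() *Broker { return &Broker{msgs: map[string]Status{}} }

// SendHalf stores the message in pending state, invisible to consumers.
func (b *Broker) SendHalf(id string) { b.msgs[id] = Pending }

// End records the producer's Commit or Rollback decision.
func (b *Broker) End(id string, s Status) { b.msgs[id] = s }

// Visible reports whether consumers can see the message.
func (b *Broker) Visible(id string) bool { return b.msgs[id] == Committed }

// BackCheck asks the producer for the local transaction status of every
// message that is still pending and applies the answer.
func (b *Broker) BackCheck(check func(id string) Status) {
	for id, s := range b.msgs {
		if s == Pending {
			b.msgs[id] = check(id)
		}
	}
}

func main() {
	b := NewBroker()
	localTx := map[string]bool{} // message id -> local transaction committed

	// Normal path: half message, local transaction, Commit.
	b.SendHalf("m1")
	localTx["m1"] = true
	b.End("m1", Committed)

	// Producer crashes after the local transaction but before Commit.
	b.SendHalf("m2")
	localTx["m2"] = true

	// Local transaction never committed.
	b.SendHalf("m3")

	b.BackCheck(func(id string) Status {
		if localTx[id] {
			return Committed
		}
		return RolledBack
	})
	fmt.Println(b.Visible("m1"), b.Visible("m2"), b.Visible("m3"))
}
```

The back-check callback is where the producer consults its own data to answer for a message whose outcome the server never learned.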

Transaction messages have the following characteristics:

Long transactions only need to be split into multiple tasks, plus a back-check interface, so it is easy to use.

If the consumer's logic does not succeed through the retry, then more mechanisms are needed to roll back the operation.

Suitable for services that can be executed asynchronously and subsequent operations do not need to be rolled back

2.6 Best effort Notification

The initiator makes its best effort to notify the receiver of the business processing result through some mechanism. Specifically:

A repeated-notification mechanism. Because the receiver may not receive a notification, there must be a mechanism to send it again.

A message reconciliation mechanism. If best-effort notification still fails to reach the receiver, or the receiver wants to consume a message again after having consumed it, the receiver can actively query the notifier for the message to meet its needs.

The local message table and transaction messages described earlier are both reliable-message schemes. How do they differ from best-effort notification?

Reliable message consistency: the initiating notifier must ensure that the message is sent out and delivered to the receiver. The reliability of the message is guaranteed by the initiating notifier.

Best-effort notification: the initiating notifier does its best to notify the receiver of the business processing result, but the message may not be received. In that case, the receiver must actively call the initiator's interface to query the business processing result. The key to notification reliability lies with the receiver.

In terms of the solution, best-effort notification requires:

Providing an interface through which the receiver can query the business processing result

A message queue ACK mechanism: the message queue gradually increases the notification interval, following 1 min, 5 min, 10 min, 30 min, 1 h, 2 h, 5 h, 10 h, until the upper limit of the notification time window is reached. After that, no more notifications are sent.
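The growing notification interval can be written down as a small lookup in Go. The interval values are the ones listed above; the function names `NextDelay` and `TotalWindow` are made up for this sketch, and treating the sum of the intervals as the notification window is an assumption.

```go
package main

import (
	"fmt"
	"time"
)

// intervals is the notification schedule quoted in the text.
var intervals = []time.Duration{
	1 * time.Minute,
	5 * time.Minute,
	10 * time.Minute,
	30 * time.Minute,
	1 * time.Hour,
	2 * time.Hour,
	5 * time.Hour,
	10 * time.Hour,
}

// NextDelay returns how long to wait before retry number attempt (0-based).
// The second result is false once the schedule is exhausted and the
// notifier should stop.
func NextDelay(attempt int) (time.Duration, bool) {
	if attempt < 0 || attempt >= len(intervals) {
		return 0, false
	}
	return intervals[attempt], true
}

// TotalWindow is the overall time span covered by the schedule.
func TotalWindow() time.Duration {
	var total time.Duration
	for _, d := range intervals {
		total += d
	}
	return total
}

func main() {
	for i := 0; ; i++ {
		d, ok := NextDelay(i)
		if !ok {
			fmt.Println("stop notifying; window was", TotalWindow())
			break
		}
		fmt.Printf("retry %d after %v\n", i+1, d)
	}
}
```

After the schedule runs out, the receiver's query interface is the only remaining way to learn the result.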

Best-effort notification is suited to business-notification scenarios. For example, WeChat transaction results are delivered to merchants through best-effort notification: there is both a callback notification and a transaction query API.

2.7 AT transaction mode

This is a transaction mode in Alibaba's open-source project Seata, also known as FMT at Ant Financial. Its advantage is that it is used much like XA mode: the business does not need to write compensation operations, and rollback is completed automatically by the framework. Its disadvantage is also similar to XA: locks are held for a long time, so it is not suited to high-concurrency scenarios. In terms of performance, AT mode is better than XA, but it also introduces new problems such as dirty rollbacks. Interested readers can refer to seata-AT.

3. Exception handling

Problems such as network failures and business failures may occur at every step of a distributed transaction. They require the business side of the distributed transaction to implement three properties: empty-rollback handling, idempotency, and suspension prevention.

3.1 abnormal situation

These exceptions are illustrated with a TCC transaction:

Empty rollback:

The second-phase Cancel method is called without the TCC resource's Try method having been called. The Cancel method needs to recognize that this is an empty rollback and return success directly.

The cause is that when a branch transaction's service is down or the network is abnormal, the call to the branch transaction is recorded as failed, although the Try phase was never actually executed. After the fault is recovered, rolling back the distributed transaction calls the second-phase Cancel method, which results in an empty rollback.

Idempotency:

Because any request may hit a network exception and be sent again, all distributed transaction branches need to be idempotent.

Suspension:

Suspension means that, for a distributed transaction, the second-phase Cancel interface executes before the Try interface.

The cause is that when the branch transaction's Try is called over RPC, the branch transaction is registered first and then the RPC call is made. If the network is congested and the RPC times out, the TM notifies the RM to roll back the distributed transaction. The Try RPC request may reach the participant and actually execute only after the rollback has completed.

When the business processes request 4, Cancel executes before Try, and the empty rollback must be handled

When the business processes request 6, Cancel executes repeatedly, and idempotency is required

When the business processes request 8, Try executes after Cancel, and the suspension must be handled

For the complex network exceptions above, the solutions commonly proposed have the business side use a unique key to query whether the associated operation has already completed, and return success directly if it has. The judgment logic involved is complex and error-prone, and places a heavy burden on the business.

3.2 Sub-transaction barrier

The project https://github.com/yedf/dtm provides a sub-transaction barrier technique that achieves this effect, as shown in the diagram:

All requests pass through the sub-transaction barrier: abnormal requests are filtered out, and normal requests pass through. Once developers use the sub-transaction barrier, all the exceptions described above are handled properly, and business developers only need to focus on the actual business logic, which greatly reduces the burden.

The sub-transaction barrier provides the method ThroughBarrierCall, whose prototype is:

    func ThroughBarrierCall(db *sql.DB, transInfo *TransInfo, busiCall BusiFunc)

Business developers write their own logic in busiCall and call this function. ThroughBarrierCall guarantees that busiCall is not called in scenarios such as empty rollback and suspension, and that when the business is called repeatedly, idempotency control ensures it is committed only once.

The sub-transaction barrier handles TCC, SAGA, transaction messages, and so on, and can also be extended to other areas.

3.3 principle of sub-transaction barrier

The principle of the sub-transaction barrier is to create a branch transaction status table, sub_trans_barrier, in the local database, with a unique key of global transaction id - sub-transaction id - sub-transaction branch name (try | confirm | cancel).

Open a transaction

If it is a Try branch, insert ignore the row gid-branchid-try; if the insert succeeds, call the in-barrier logic

If it is a Confirm branch, insert ignore the row gid-branchid-confirm; if the insert succeeds, call the in-barrier logic

If it is a Cancel branch, insert ignore the row gid-branchid-try and then the row gid-branchid-cancel; if try was not inserted and cancel was inserted successfully, call the in-barrier logic

If the in-barrier logic returns success, commit the transaction and return success

If the in-barrier logic returns an error, roll back the transaction and return the error

Under this mechanism, the problems caused by network exceptions are solved:

Empty compensation control: if Try has not executed and Cancel is executed directly, Cancel's insert of gid-branchid-try succeeds, so the in-barrier logic is skipped, which guarantees empty compensation control

Idempotency control: no branch can insert its unique key twice, which guarantees that it is not executed repeatedly

Suspension prevention: if Try executes after Cancel, its insert of gid-branchid-try does not succeed, so it is not executed, which guarantees suspension prevention

A similar mechanism applies to SAGA, transaction messages, and so on.
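The insert-ignore rules above can be checked with a small in-memory model in Go. This is a sketch of the principle, not DTM's implementation: the `Barrier` type, its `Call` method, and the map standing in for the sub_trans_barrier table are all assumptions made for the example, and restoring a snapshot stands in for rolling back the database transaction.

```go
package main

import (
	"errors"
	"fmt"
)

var ErrBusiness = errors.New("business logic failed")

// Barrier is an in-memory stand-in for the sub_trans_barrier table.
type Barrier struct {
	rows map[string]bool // unique key: gid-branchid-op
}

func NewBarrier() *Barrier { return &Barrier{rows: map[string]bool{}} }

// insertIgnore mimics "insert ignore": it reports whether a new row was written.
func (b *Barrier) insertIgnore(gid, branch, op string) bool {
	key := gid + "-" + branch + "-" + op
	if b.rows[key] {
		return false
	}
	b.rows[key] = true
	return true
}

// Call runs busi only if the barrier lets the request through.
// It returns whether busi was invoked and the error busi returned.
func (b *Barrier) Call(gid, branch, op string, busi func() error) (bool, error) {
	// Snapshot the rows so a business error can undo the inserts,
	// as rolling back the surrounding transaction would.
	snapshot := make(map[string]bool, len(b.rows))
	for k := range b.rows {
		snapshot[k] = true
	}

	var pass bool
	if op == "cancel" {
		tryInserted := b.insertIgnore(gid, branch, "try")
		cancelInserted := b.insertIgnore(gid, branch, "cancel")
		pass = !tryInserted && cancelInserted
	} else { // "try" or "confirm"
		pass = b.insertIgnore(gid, branch, op)
	}
	if !pass {
		return false, nil // filtered: keep the rows and report success
	}
	if err := busi(); err != nil {
		b.rows = snapshot
		return true, err
	}
	return true, nil
}

func main() {
	b := NewBarrier()
	noop := func() error { return nil }

	ran, _ := b.Call("g1", "b1", "cancel", noop)
	fmt.Println("cancel before try ran:", ran) // empty rollback case
	ran, _ = b.Call("g1", "b1", "try", noop)
	fmt.Println("late try ran:", ran) // suspension case

	ran, _ = b.Call("g2", "b1", "try", noop)
	fmt.Println("first try ran:", ran)
	ran, _ = b.Call("g2", "b1", "try", noop)
	fmt.Println("repeated try ran:", ran) // idempotency case
	ran, _ = b.Call("g2", "b1", "cancel", noop)
	fmt.Println("cancel after try ran:", ran)
}
```

In the real technique the barrier rows and the business update share one local database transaction, which is what makes the "insert succeeded, so run the logic" decision safe under concurrency; the in-memory model does not capture that part.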

3.4 Summary of sub-transaction barriers

The sub-transaction barrier technique was pioneered by https://github.com/yedf/dtm. Its significance lies in designing a simple, easy-to-implement algorithm and providing an easy-to-use interface. With these two things, developers are completely freed from handling network exceptions.

At present, this technique requires the yedf/dtm transaction manager, and SDKs are available for Go and Python developers. SDKs for other languages are being planned. For other distributed transaction frameworks, as long as appropriate distributed transaction information is provided, the technique can be implemented quickly according to the principle above.

4. Distributed transaction practice

We take the SAGA transaction introduced earlier as an example and use DTM as the transaction framework to complete a concrete distributed transaction. The example is written in Go; if you are not interested in it, you can skip to the summary at the end of the article.

4.1 one SAGA transaction

Let's first write the core business code to adjust the user's account balance.

    func qsAdjustBalance(uid int, amount int) (interface{}, error) {
        _, err := dtmcli.SdbExec(sdbGet(), "update dtm_busi.user_account set balance = balance + ? where user_id = ?", amount, uid)
        return dtmcli.ResultSuccess, err
    }

Let's write specific handlers for forward / compensation operations.

    app.POST(qsBusiAPI+"/TransIn", common.WrapHandler(func(c *gin.Context) (interface{}, error) {
        return qsAdjustBalance(2, 30)
    }))
    app.POST(qsBusiAPI+"/TransInCompensate", common.WrapHandler(func(c *gin.Context) (interface{}, error) {
        return qsAdjustBalance(2, -30)
    }))
    app.POST(qsBusiAPI+"/TransOut", common.WrapHandler(func(c *gin.Context) (interface{}, error) {
        return qsAdjustBalance(1, -30)
    }))
    app.POST(qsBusiAPI+"/TransOutCompensate", common.WrapHandler(func(c *gin.Context) (interface{}, error) {
        return qsAdjustBalance(1, 30)
    }))

At this point, the handler functions for each sub-transaction are ready. Next, we open the SAGA transaction and make the branch calls.

    req := &gin.H{"amount": 30} // the payload of the microservice
    // DtmServer is the address of the DTM service
    saga := dtmcli.NewSaga(DtmServer, dtmcli.MustGenGid(DtmServer)).
        // add a sub-transaction of TransOut: forward operation is url: qsBusi+"/TransOut", reverse operation is url: qsBusi+"/TransOutCompensate"
        Add(qsBusi+"/TransOut", qsBusi+"/TransOutCompensate", req).
        // add a sub-transaction of TransIn: forward operation is url: qsBusi+"/TransIn", reverse operation is url: qsBusi+"/TransInCompensate"
        Add(qsBusi+"/TransIn", qsBusi+"/TransInCompensate", req)
    // submit the saga transaction; dtm will complete all sub-transactions or roll back all sub-transactions
    err := saga.Submit()

At this point, a complete SAGA distributed transaction is written.

If you want to run a successful example in its entirety, after setting up the environment according to the instructions of the yedf/dtm project, run the saga example with the following command:

    go run app/main.go quick_start

4.2 handling network exceptions

Suppose a transaction has been submitted to dtm and a transient failure occurs while the transfer-in operation is being called. What then? According to the SAGA protocol, dtm retries the operation that has not completed. How should we handle this? The failure may be a network failure after the transfer-in operation completed, or a machine crash while the operation was in progress. How do we make sure the account balance adjustment is still correct?

We use the sub-transaction barrier to guarantee that, however many times the operation is retried, it is committed successfully exactly once.

We adjust the handler to:

    func sagaBarrierAdjustBalance(sdb *sql.Tx, uid int, amount int) (interface{}, error) {
        _, err := dtmcli.StxExec(sdb, "update dtm_busi.user_account set balance = balance + ? where user_id = ?", amount, uid)
        return dtmcli.ResultSuccess, err
    }

    func sagaBarrierTransIn(c *gin.Context) (interface{}, error) {
        return dtmcli.ThroughBarrierCall(sdbGet(), MustGetTrans(c), func(sdb *sql.Tx) (interface{}, error) {
            return sagaBarrierAdjustBalance(sdb, 1, reqFrom(c).Amount)
        })
    }

    func sagaBarrierTransInCompensate(c *gin.Context) (interface{}, error) {
        return dtmcli.ThroughBarrierCall(sdbGet(), MustGetTrans(c), func(sdb *sql.Tx) (interface{}, error) {
            return sagaBarrierAdjustBalance(sdb, 1, -reqFrom(c).Amount)
        })
    }

The dtmcli.ThroughBarrierCall call here uses the sub-transaction barrier technique to ensure that the callback function passed as the third parameter is processed only once.

You can try calling the TransIn service multiple times; the balance is adjusted only once. Run the following command to try the new handler:

    go run app/main.go saga_barrier

4.3 handle rollback

What if, when the amount is about to be transferred to user 2, the bank finds that user 2's account is abnormal and returns a failure? We adjust the handler so that the transfer-in operation returns a failure.

    func sagaBarrierTransIn(c *gin.Context) (interface{}, error) {
        return dtmcli.ResultFailure, nil
    }

We give the sequence diagram of the transaction failure interaction.

The point here is that the forward operation of TransIn failed without doing anything. Will calling TransIn's compensation operation at this point cause a wrong reverse adjustment?

Don't worry: the sub-transaction barrier described earlier guarantees that if the TransIn error occurs before commit, the compensation is an empty operation; if the TransIn error occurs after commit, the compensation operation commits its data once; and if TransIn is still in progress, the compensation operation waits for TransIn to finally commit or roll back, and then commits the compensation or performs an empty rollback.

You can change the TransIn that returns the error to:

    func sagaBarrierTransIn(c *gin.Context) (interface{}, error) {
        dtmcli.ThroughBarrierCall(sdbGet(), MustGetTrans(c), func(sdb *sql.Tx) (interface{}, error) {
            return sagaBarrierAdjustBalance(sdb, 1, 30)
        })
        return dtmcli.ResultFailure, nil
    }

In the end, there is no problem with the balance.
