What are the solutions for distributed transactions

2025-07-01 Update From: SLTechnology News&Howtos shulou

Shulou(Shulou.com)06/01 Report--

This article surveys the common solutions for distributed transactions: the scenarios in which they arise, the underlying theory (ACID, CAP, and BASE), and the main implementation approaches.

A transaction treats all the operations involved in an activity as one indivisible unit of execution, providing an "all or nothing" guarantee. On a stand-alone system it is easy to build transaction processing that satisfies ACID, but in a distributed system the data is scattered across different machines, and handling that data transactionally is a major challenge.

Background

1. One service operates one database

Figure: one service, one database

In most of our services, one service operates one corresponding database. Transactions in this scenario are easy to guarantee: we can rely on the transaction support of the database itself. These are called local transactions.
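A local transaction can be sketched with Python's built-in sqlite3 module. This is only an illustration: the `account` table, the names, and the amounts are assumptions, not something from the original article. The connection's context manager commits when the block succeeds and rolls back when an exception escapes it.

```python
import sqlite3

def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst inside one local transaction."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?",
                (amount, src))
            row = conn.execute(
                "SELECT balance FROM account WHERE name = ?", (src,)).fetchone()
            if row is None or row[0] < 0:
                raise ValueError("insufficient funds")
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?",
                (amount, dst))
        return True
    except ValueError:
        return False

# Hypothetical data for the illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [("alice", 100), ("bob", 0)])
conn.commit()

ok = transfer(conn, "alice", "bob", 30)       # both updates are committed
failed = transfer(conn, "alice", "bob", 500)  # the first update is rolled back
balances = dict(conn.execute("SELECT name, balance FROM account"))
```

The failed transfer leaves no trace: the debit that had already been applied is undone together with the rest of the transaction.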

2. One service operates multiple databases

Figure: one service, multiple databases

As system functionality and data volume grow, we may need to split the database. Depending on how the split is made, it is either a vertical division or a horizontal division.

Vertical division

As a service gains functions, we may move some of them into a separate database, or we may access the databases of other services directly. Either way the service operates multiple databases, which we call vertical division of the database. In fact, this scenario should not exist: if the data belongs to this service, it should live in the same database; if it belongs to another service, you should call the interfaces that service provides rather than operate its database directly. This scenario is therefore not very common.

Horizontal division

The vertical scenario is uncommon, but another situation arises often, especially with large data volumes: when the data is sharded across multiple databases and tables, one service has to operate multiple databases. This is called horizontal division.
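A minimal sketch of how a service might route a record to one of several horizontally divided databases. The shard names and the choice of CRC32 as the hash are assumptions made for this illustration only.

```python
import zlib

# Hypothetical shard names; a real deployment would hold connection settings.
SHARDS = ["orders_db_0", "orders_db_1", "orders_db_2", "orders_db_3"]

def shard_for(key: str, shard_count: int) -> int:
    """Map a key to a shard index with a stable hash (CRC32)."""
    return zlib.crc32(key.encode("utf-8")) % shard_count

def database_for(user_id: str) -> str:
    """Return the database that stores this user's rows."""
    return SHARDS[shard_for(user_id, len(SHARDS))]
```

When two records that must change together map to different shards, a single local transaction can no longer cover both writes, which is what makes this a distributed transaction scenario.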

3. Microservice (SOA) cross-service invocation

Figure: microservice invocation

As mentioned above, directly operating the database of another service is not appropriate. The correct way is to call the interfaces that the other service provides.

Transactions in the "one service operates one database" scenario are called local transactions. Transactions in the "one service operates multiple databases" and "microservice (SOA) cross-service invocation" scenarios are called distributed transactions.

For local transactions, the database's own transaction support provides the guarantee. So how do we guarantee transactions in a distributed scenario?

Basics

Any discussion of transactions rests on three sets of principles: ACID, CAP, and BASE. Stand-alone systems generally follow ACID, while distributed systems generally follow CAP or BASE.

ACID

ACID names the four basic properties of a transaction. It is the common design philosophy of traditional databases and pursues a strong consistency model. ACID stands for Atomicity, Consistency, Isolation, and Durability:

A: Atomicity

All operations in a transaction either complete or none of them do; the transaction never stops partway. If an error occurs during execution, the transaction is rolled back to the state before it started, as if it had never run.

C: Consistency

Transaction consistency means that the database is in a consistent state both before and after a transaction executes. If the transaction completes successfully, all its changes are applied correctly and the system is in a valid state. If an error occurs, all its changes are rolled back automatically and the system returns to its original state.

I: Isolation

In a concurrent environment, when different transactions manipulate the same data at the same time, each transaction has its own complete data space. Changes made by one concurrent transaction must be isolated from those made by any other. When a transaction reads data that another transaction is updating, it sees the state either before or after the other transaction's modification, never an intermediate state.

D: Durability

Once a transaction has completed successfully, its updates to the database are saved permanently. Even if the system crashes, the database can be restored after restart to the state it was in when the transaction completed.

CAP

CAP is an abbreviation for Consistency, Availability, and Partition tolerance:

C (consistency): for a given client, a read returns the result of the latest write. For data distributed across nodes, if the data is updated on one node and every other node can then read the latest value, that is strong consistency. If some node cannot read the latest value, the system is inconsistent.

A (availability): every non-failed node returns a reasonable response (not an error or a timeout) within a reasonable time. Availability has two key elements: reasonable time and reasonable response. Reasonable time means that a request cannot block indefinitely and must return in bounded time. Reasonable response means that the system returns an explicit and correct result, for example 50 rather than a stale 40.

P (partition tolerance): the system continues to work when a network partition occurs. For example, in a cluster of several machines, the network between some of the machines fails, yet the cluster as a whole still works properly.

The three CAP properties cannot all be satisfied at once. A distributed system can satisfy at most two of consistency, availability, and partition tolerance.

CA without P

This combination almost never exists in a distributed system. In a distributed environment, with many hosts deployed across locations, network partitions are inevitable. If a partition occurs and we have chosen CA, then to preserve consistency the system must reject requests, which availability does not allow. A distributed system therefore cannot choose a CA architecture; it can only choose CP or AP.

The relational databases we are familiar with, such as MySQL and Oracle, are CA architectures: they guarantee availability and data consistency, but they are not distributed systems. Once a relational database has to support primary/standby replication or cluster deployment, P must also be taken into account.

CP without A

If a distributed system does not require strong availability, that is, if it is allowed to stop responding or be unavailable for a long time, it can guarantee CP and give up A. When a network failure or message loss occurs, such a system sacrifices user experience and makes users wait until all data is consistent before granting access.

Many distributed databases are typical CP designs: in extreme cases they put strong data consistency first, at the cost of availability. Examples are Redis and HBase, and ZooKeeper, which is widely used in distributed systems, also prioritizes CP. Large-scale financial services must guarantee both C and P, so they can only sacrifice A.

AP without C

To be highly available while tolerating partitions, a system has to give up consistency. Once a network problem occurs, nodes may lose contact with one another. To stay highly available, each node must answer user requests immediately and can therefore serve only its local data, which leads to globally inconsistent data.

There are many scenarios that give up strong consistency to preserve partition tolerance and availability. Many systems invest heavily in availability so that their annual availability reaches N nines. Many business systems, such as Taobao shopping and 12306 ticket booking, therefore choose availability over consistency.

You may have run into this when buying a ticket on 12306. When you place the order, the system shows that tickets are available (although there may in fact be none), and you enter the CAPTCHA and submit the order as usual. A moment later the system tells you that the order failed because there are not enough tickets left. The system keeps serving normally in terms of availability and makes some sacrifice in data consistency. This hurts the user experience somewhat but does not seriously block the user's flow.

However, saying that many websites sacrifice consistency for availability is not quite accurate. In the ticket example, what is given up is only strong consistency; the system settles for the next best thing, eventual consistency. Although the ticket inventory data may be inconsistent at the moment the order is placed, it must become consistent after a period of time.

In most large-scale Internet applications, there are many hosts deployed across locations, and clusters keep growing, so node failures and network failures are the norm. Service availability must still reach N nines, so these systems guarantee P and A and give up C (settling for eventual consistency). This affects the customer experience in places, but not seriously enough to block the user's flow.

AP gives up consistency (meaning strong consistency) in favor of partition tolerance and availability. This is the choice of many distributed system designs, and the BASE theory described next is an extension of AP.
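The difference between the CP and AP choices can be sketched with a toy replica that has lost contact with its peer. The class, the mode strings, and the values 40 and 50 (borrowed from the availability example above) are assumptions for illustration, not a model of any real database.

```python
class Replica:
    """A replica that cannot reach its peer while `partitioned` is True."""

    def __init__(self, mode, value):
        self.mode = mode            # "CP" or "AP"
        self.value = value          # last value this replica has seen
        self.partitioned = False

    def read(self):
        if self.partitioned and self.mode == "CP":
            # It cannot confirm that it holds the latest write, so it refuses.
            raise TimeoutError("unavailable during partition")
        # AP (or no partition): answer from local data, which may be stale.
        return self.value

# The latest write (50) happened on the other side of the partition;
# both replicas below still hold the old value 40.
cp = Replica("CP", 40)
ap = Replica("AP", 40)
cp.partitioned = ap.partitioned = True

try:
    cp_result = cp.read()
except TimeoutError:
    cp_result = "error"      # consistent, but not available
ap_result = ap.read()        # available, but returns the stale 40
```

The CP replica protects consistency by failing the request, while the AP replica protects availability by answering with data that may be out of date.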

BASE

BASE theory is an extension of CAP theory. Its core idea is that even when strong consistency (the consistency in CAP) cannot be achieved, an application can reach eventual consistency in a way that suits it.

BASE stands for Basically Available, Soft state, and Eventually consistent:

Basically available: when a failure occurs in a distributed system, some functions may become unavailable so that the core functions stay available.

Soft state: the system may hold intermediate states that do not affect its availability; this corresponds to the inconsistency in CAP.

Eventually consistent: after a period of time, the data on all nodes becomes consistent.

BASE uses soft state and eventual consistency to restore consistency after network delays in a distributed system. BASE is the opposite of ACID: instead of ACID's strong consistency model, it gains availability by sacrificing strong consistency, allowing data to be inconsistent for a while as long as it eventually reaches a consistent state.
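Soft state and eventual consistency can be sketched with two in-memory nodes that replicate by copying newer versions from each other. The version numbers and the "newest version wins" rule are assumptions chosen to keep the example short; real systems use more careful conflict resolution.

```python
class Node:
    def __init__(self):
        self.data = {}   # key -> (version, value)

    def write(self, key, value, version):
        self.data[key] = (version, value)

    def read(self, key):
        return self.data[key][1]

    def sync_from(self, other):
        """Adopt every entry for which the other node holds a newer version."""
        for key, (version, value) in other.data.items():
            if key not in self.data or self.data[key][0] < version:
                self.data[key] = (version, value)

a, b = Node(), Node()
a.write("stock", 10, version=1)
b.sync_from(a)                    # both nodes agree: stock = 10
a.write("stock", 9, version=2)    # a sells one ticket; b has not heard yet
stale = b.read("stock")           # soft state: b still answers 10
b.sync_from(a)                    # replication catches up
converged = (a.read("stock"), b.read("stock"))
```

Between the write and the second sync the two nodes disagree, which is the window of inconsistency that BASE accepts.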

Distributed transaction solutions

At present, distributed transaction solutions fall into two groups: those that do not intrude on business code and those that do. The non-intrusive solutions are mainly two-phase commit (2PC) based on the database XA protocol. The advantage is that business code is untouched, but the disadvantages are obvious: the database must support the XA protocol, and because of how XA works, transaction resources are held for a long time. The lock period is long and the application layer cannot intervene, so performance is very poor. It is like "wounding the enemy by seven points while hurting yourself by three", which is why this kind of solution is not popular in Internet projects.

To make up for this poor performance, many other solutions have been devised, but all of them require changes at the application layer, that is, they intrude on the business code. Examples are the well-known TCC scheme, for which mature frameworks such as ByteTCC and tcc-transaction exist, and eventual consistency based on reliable messages, such as RocketMQ transactional messages.

Intrusive solutions are a compromise with the current state of affairs, and they are inelegant to implement. A transactional call usually has to be accompanied by a series of reverse operations on the same interface: with TCC's three operations, for example, every piece of commit logic must come with matching rollback logic. Such code makes a project bloated and expensive to maintain.

2PC

In a distributed system, each node knows whether its own operation succeeded but not whether the operations on other nodes did. When a transaction spans multiple nodes, a coordinator component is introduced to collect the results from all the nodes (called participants) and to tell them whether to actually commit (for example, by writing the updated data to disk). The idea of two-phase commit can therefore be summarized as follows: the participants report the success or failure of their operations to the coordinator, and the coordinator decides, based on the feedback from all participants, whether each participant should commit or abort. The two phases are the prepare phase and the commit phase.

Prepare phase

The transaction coordinator (transaction manager) sends a Prepare message to each participant (resource manager). Each participant either returns a failure directly (for example, because a permission check failed) or executes the transaction locally, writing its local redo and undo logs without committing. At that point everything is ready and only the final signal is missing.

The detailed steps are as follows:

1) The coordinator node asks every participant node whether it can commit (the vote) and starts waiting for their responses.

2) Each participant node executes all transaction operations up to the point of the query and writes undo and redo information to its log.

3) Each participant node responds to the coordinator's query. If its transaction operations succeeded, it returns an "agree" message; if they failed, it returns an "abort" message.

Commit phase

If the coordinator receives a failure message from any participant, or times out, it sends a Rollback message to every participant; otherwise it sends a Commit message. Each participant commits or rolls back as instructed and then releases all the locks held during the transaction.

Commit

If every participant returns OK in the prepare phase, the second phase commits.

Figure: commit in the second phase

Rollback

If any participant returns NO in the prepare phase, the second phase rolls back.

Figure: rollback in the second phase
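The two phases can be sketched as a small in-memory coordinator. This is a simplified illustration, not the XA protocol itself: the `Participant` class, its `can_commit` flag, and the state names are assumptions, and real participants would write undo/redo logs and hold locks during the prepare phase.

```python
class Participant:
    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "init"

    def prepare(self):
        # Phase 1: do the work without committing, then vote.
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self):
        self.state = "committed"

    def rollback(self):
        self.state = "rolled_back"


def two_phase_commit(participants):
    # Phase 1: collect votes; an exception (e.g. a timeout) counts as "no".
    votes = []
    for p in participants:
        try:
            votes.append(p.prepare())
        except Exception:
            votes.append(False)
    decision = all(votes)
    # Phase 2: broadcast the single global decision to everyone.
    for p in participants:
        if decision:
            p.commit()
        else:
            p.rollback()
    return decision


good = [Participant("orders"), Participant("stock")]
bad = [Participant("orders"), Participant("stock", can_commit=False)]
good_result = two_phase_commit(good)
bad_result = two_phase_commit(bad)
bad_states = [p.state for p in bad]
```

One "no" vote is enough to roll back every participant, including those that had prepared successfully.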

Shortcomings of 2PC

Single point of failure: the transaction manager plays a critical role throughout the process. If it goes down, for example after the first phase completes or just as it is about to send the commit in the second phase, the resource managers stay blocked and the database becomes unusable.

Synchronous blocking: once a participant is prepared, the resources in its resource manager stay locked until the commit completes and they are released.

Data inconsistency: although the two-phase commit protocol is designed for strong consistency of distributed data, inconsistency is still possible. For example, suppose that in the second phase the coordinator sends the commit notification, but because of network problems only some participants receive it and commit, while the others stay blocked because the notification never arrives. At that point the data is inconsistent.

In general, the XA protocol is relatively simple and cheap to adopt, but its single point of failure and its inability to support high concurrency (because of synchronous blocking) remain its biggest weaknesses.

TCC

TCC resembles the 2PC two-phase commit mechanism but works at a different level. 2PC solves distributed transactions between databases at the database level, whereas TCC solves distributed transactions at the application level.

TCC mode requires each transaction participant to implement three operations, Try, Confirm, and Cancel, according to its own business scenario. In the first phase the participants execute Try; in the second phase they execute Confirm if the transaction commits, or Cancel if it rolls back:

Try interface: attempts the execution, completes all business checks (consistency), and reserves the necessary business resources (quasi-isolation).

Confirm interface: actually executes the business operation. It performs no business checks and uses only the resources reserved in the Try phase. Confirm must succeed, so a failed Confirm is retried; the Confirm operation therefore has to be idempotent.

Cancel interface: releases the business resources reserved in the Try phase. A failed Cancel is also retried until it succeeds, so it has to be idempotent as well.

A simple example: you go to a store to buy a bottle of mineral water for 2 yuan. The participants are you and the shopkeeper, the resources are your money and the shopkeeper's water, and the operations are your payment and the shopkeeper taking the money and handing you the water:

The first phase (Try): each side checks and reserves its resources. You check whether your wallet holds at least 2 yuan and, if so, freeze it; the shopkeeper checks whether there is mineral water left in the fridge and, if so, reserves one bottle.

The second phase (Confirm): if both checks pass, the transaction is confirmed: you pay 2 yuan, and the shopkeeper receives 2 yuan and gives you the bottle of mineral water.

The second phase (Cancel): if a check fails or times out, the transaction is cancelled: you unfreeze your money, and the shopkeeper's bottle of water can be sold to someone else.

The TCC execution flow from your side is shown in detail in the following figure:

Figure: example of the TCC procedure

When adopting TCC mode, the main task is to work out how to split the business model into two phases, implement the three TCC methods, and guarantee that Confirm always succeeds once Try has succeeded. Compared with AT mode (the automatic, non-intrusive mode offered by frameworks such as Seata), TCC mode is intrusive to business code, but it does not take AT mode's global row locks, so TCC performance can be much higher.
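The buyer's side of the water example can be sketched as a wallet that implements Try, Confirm, and Cancel. The class and method names and the `tx_id` bookkeeping are assumptions for illustration; they are not the API of ByteTCC, tcc-transaction, or any other framework.

```python
class Account:
    def __init__(self, balance):
        self.balance = balance   # money that is free to spend
        self.frozen = {}         # tx_id -> amount reserved by Try

    def try_reserve(self, tx_id, amount):
        """Try: run the business check and freeze the amount."""
        if tx_id in self.frozen:
            return True                    # duplicate Try, already reserved
        if self.balance < amount:
            return False                   # business check failed
        self.balance -= amount
        self.frozen[tx_id] = amount
        return True

    def confirm(self, tx_id):
        """Confirm: spend the frozen amount; repeating the call changes nothing."""
        self.frozen.pop(tx_id, None)

    def cancel(self, tx_id):
        """Cancel: return the frozen amount; repeating the call changes nothing."""
        self.balance += self.frozen.pop(tx_id, 0)


wallet = Account(10)
reserved = wallet.try_reserve("tx1", 2)   # freeze 2 yuan
wallet.confirm("tx1")
wallet.confirm("tx1")                     # retried Confirm is harmless
after_confirm = wallet.balance

wallet.try_reserve("tx2", 2)
wallet.cancel("tx2")
wallet.cancel("tx2")                      # retried Cancel is harmless
after_cancel = wallet.balance
```

Because Confirm and Cancel only act on what Try reserved under the same `tx_id`, the coordinator can retry them as often as needed.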

Local message table

The core of this scheme is to execute the tasks that need distributed processing asynchronously, driven by a message log. The message log can be stored in a local file, a database, or a message queue, and failed messages are retried automatically or manually according to business rules. Manual retry is more common in payment scenarios, where a reconciliation system handles problems after the fact.

Figure: local message table

The core of the local message table is to turn one large transaction into several small ones. Take again the example of buying a bottle of water, this time for 100 yuan.

1) When the money is deducted, the deduction service needs a local message table in its own database. The deduction and the "reduce the water inventory" message written to the local message table go into the same transaction (the database's local transaction guarantees their consistency).

2) A scheduled task polls the local message table and sends each unsent message to the commodity inventory service, telling it to reduce the water inventory. On arrival, the commodity service first writes the message to its own message table, then performs the inventory deduction, and after the deduction succeeds, updates the status in that table.

3) The commodity service reports back to the deduction service, either through a scheduled task that scans its message table or by notifying it directly, and the deduction service updates the status in its local message table.

4) To handle abnormal situations, unprocessed messages are scanned periodically and resent. When the commodity service receives a message, it first checks whether the message is a duplicate. If the message has been received before, it checks whether it has been executed: if so, it notifies the deduction service immediately; if not, it executes it again. The business logic must guarantee idempotence here, so that no extra bottle of water is deducted.

The local message table follows BASE theory and is an eventual consistency model, suited to scenarios that do not demand strong consistency. When implementing it, pay attention to making retries idempotent.
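The steps above can be sketched with two in-memory SQLite databases standing in for the deduction service and the commodity service. The table names (`outbox`, `processed`), the column layout, and the direct function call that replaces the network hop are all assumptions made for this illustration.

```python
import sqlite3

pay = sqlite3.connect(":memory:")     # database of the deduction service
shop = sqlite3.connect(":memory:")    # database of the commodity service
pay.executescript("""
    CREATE TABLE account (user TEXT PRIMARY KEY, balance INTEGER);
    CREATE TABLE outbox (id INTEGER PRIMARY KEY, body TEXT, status TEXT);
    INSERT INTO account VALUES ('you', 100);
""")
shop.executescript("""
    CREATE TABLE stock (item TEXT PRIMARY KEY, qty INTEGER);
    CREATE TABLE processed (msg_id INTEGER PRIMARY KEY);
    INSERT INTO stock VALUES ('water', 5);
""")

def pay_for_water():
    # Step 1: the deduction and the message row commit in ONE local transaction.
    with pay:
        pay.execute("UPDATE account SET balance = balance - 100 WHERE user = 'you'")
        pay.execute("INSERT INTO outbox (body, status) VALUES ('water', 'pending')")

def handle(msg_id, item):
    # Commodity side: record the message id so that a redelivery is a no-op.
    with shop:
        seen = shop.execute(
            "SELECT 1 FROM processed WHERE msg_id = ?", (msg_id,)).fetchone()
        if seen is None:
            shop.execute("INSERT INTO processed VALUES (?)", (msg_id,))
            shop.execute("UPDATE stock SET qty = qty - 1 WHERE item = ?", (item,))

def poll_outbox():
    # Scheduled task: deliver every pending message, then mark it done.
    pending = pay.execute(
        "SELECT id, body FROM outbox WHERE status = 'pending'").fetchall()
    for msg_id, item in pending:
        handle(msg_id, item)
        with pay:
            pay.execute("UPDATE outbox SET status = 'done' WHERE id = ?", (msg_id,))

pay_for_water()
poll_outbox()
handle(1, "water")   # simulated duplicate delivery of message 1
qty = shop.execute("SELECT qty FROM stock WHERE item = 'water'").fetchone()[0]
```

The duplicate delivery does not deduct a second bottle because the commodity side checks its `processed` table before touching the stock.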

MQ transaction

Distributed transactions in RocketMQ are in effect an encapsulation of the local message table: the local message table is moved inside the MQ. Here is a brief introduction to MQ transactions.

The basic process is as follows:

The first stage sends a Prepared message and obtains the address of the message.

The second stage executes the local transaction.

The third stage uses the address obtained in the first stage to access the message and update its status. Only then can the message receiver consume the message.

If the confirmation fails, the RocketMQ broker periodically scans for messages whose status has not been updated. For any unconfirmed message, the broker sends a query to the message sender to decide whether to commit it. In RocketMQ, this query is delivered to the sender through a listener.

If consumption times out, it is retried until it succeeds, so the message receiver must be idempotent. If consumption fails outright, it has to be handled manually: the probability is low, and designing a complex automated process for such a rare event costs more than it gains.
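The three stages and the broker's check-back can be sketched with an in-memory stand-in for the broker. This is not the RocketMQ client API: the `Broker` class, its method names, and the `confirm` flag that simulates a lost confirmation are assumptions made for the illustration.

```python
class Broker:
    """In-memory stand-in for a broker that supports prepared (half) messages."""

    def __init__(self):
        self.half = {}       # msg_id -> body, not yet visible to consumers
        self.visible = []    # bodies that consumers may read
        self.next_id = 0

    def send_prepared(self, body):
        self.next_id += 1
        self.half[self.next_id] = body
        return self.next_id              # plays the role of the message address

    def end_transaction(self, msg_id, commit):
        body = self.half.pop(msg_id, None)
        if body is not None and commit:
            self.visible.append(body)

    def check_back(self, listener):
        # Periodic scan: ask the sender about every message still unresolved.
        for msg_id in list(self.half):
            self.end_transaction(msg_id, listener(msg_id))


def send_in_transaction(broker, body, local_tx, confirm=True):
    msg_id = broker.send_prepared(body)      # stage 1: prepared message
    ok = local_tx()                          # stage 2: local transaction
    if confirm:                              # stage 3: may be lost in practice
        broker.end_transaction(msg_id, ok)
    return msg_id


broker = Broker()
# The local transaction succeeds, but the confirmation is lost on the way.
send_in_transaction(broker, "reduce water stock", lambda: True, confirm=False)
before_check = list(broker.visible)
# The broker asks the sender's listener, which reports the local result.
broker.check_back(lambda msg_id: True)
after_check = list(broker.visible)
```

Until the check-back resolves the prepared message, consumers cannot see it, so a lost confirmation delays delivery but does not lose or leak the message.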

Saga

Saga is a concept introduced in a database paper more than 30 years ago (1987). Its core idea is to split a long transaction into several short local transactions coordinated by a Saga transaction coordinator. If every step completes normally, the Saga completes normally. If a step fails, the compensating operations are invoked in reverse order.

Each Saga consists of a series of sub-transactions Ti, and each Ti has a corresponding compensating action Ci that undoes the effects of Ti. Every Ti is a local transaction. Compared with TCC, Saga has no "reserve" (Try) step: each Ti commits directly to the database.

A Saga can execute in one of two orders:

T1, T2, T3, ..., Tn

T1, T2, ..., Tj, Cj, ..., C2, C1, where 0 < j < n

Saga defines two recovery strategies:

Backward recovery: the second execution order above, where j is the sub-transaction in which the error occurred. It undoes all previously successful sub-transactions, so the result of the entire Saga is undone.

Forward recovery: for scenarios that must succeed. The execution order looks like T1, T2, ..., Tj (failed), Tj (retry), ..., Tn, where j is the sub-transaction in which the error occurred. No Ci is needed in this case.
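Forward recovery can be sketched as a loop that retries the failed step instead of compensating. The function name, the retry limit, and the use of `RuntimeError` to signal a failed step are assumptions for this illustration.

```python
def run_forward(steps, max_attempts=3):
    """Run each step in order, retrying a failed step instead of compensating."""
    log = []
    for name, action in steps:
        for _ in range(max_attempts):
            try:
                action()
                log.append(name)
                break
            except RuntimeError:
                log.append(f"{name} (failed)")
        else:
            # Every attempt failed; forward recovery alone cannot finish.
            raise RuntimeError(f"{name} still failing after {max_attempts} attempts")
    return log

calls = {"T2": 0}

def t2():
    # Hypothetical step that fails once with a transient error.
    calls["T2"] += 1
    if calls["T2"] == 1:
        raise RuntimeError("transient error")

trace = run_forward([("T1", lambda: None), ("T2", t2), ("T3", lambda: None)])
```

Retrying only makes sense when each step is idempotent or leaves no partial effect when it fails, which this sketch assumes.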

Note that Saga mode does not guarantee isolation: because resources are not locked, other transactions can still overwrite or affect the current transaction.

Take again the example of buying a bottle of water for 100 yuan, and define:

T1 = deduct 100 yuan

T2 = add a bottle of water to the user

T3 = reduce the stock by one bottle of water

C1 = add 100 yuan

C2 = take a bottle of water away from the user

C3 = add a bottle of water to the stock

If a step fails, we run the compensating C operations in reverse order, starting from the step where the problem occurred.
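Backward recovery for the water example can be sketched as follows. The `run_saga` function and the in-memory `state` dictionary are assumptions for illustration. The sketch also assumes that a failed step leaves no effect of its own (its local transaction rolled back), so only the steps that completed are compensated.

```python
def run_saga(steps):
    """steps: list of (name, action, compensation). Returns the execution trace."""
    trace, done = [], []
    for name, action, compensate in steps:
        try:
            action()
        except RuntimeError:
            trace.append(f"{name} failed")
            # Undo the completed steps in reverse order.
            for done_name, undo in reversed(done):
                undo()
                trace.append(f"undo {done_name}")
            return trace
        trace.append(name)
        done.append((name, compensate))
    return trace

# Hypothetical state: the user has 100 yuan and the shop is out of stock.
state = {"money": 100, "user_water": 0, "stock": 0}

def t1(): state["money"] -= 100
def c1(): state["money"] += 100
def t2(): state["user_water"] += 1
def c2(): state["user_water"] -= 1
def t3():
    if state["stock"] < 1:
        raise RuntimeError("out of stock")
    state["stock"] -= 1
def c3(): state["stock"] += 1

trace = run_saga([("T1", t1, c1), ("T2", t2, c2), ("T3", t3, c3)])
```

After the run, the money and the water are back where they started, but between T2 and its compensation the user briefly owned a bottle of water, which is exactly the window in which the missing isolation can cause trouble.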

The isolation problem mentioned above appears if T3 fails and a rollback is needed, but the user has already drunk the water (another transaction): the rollback then finds that it cannot take a bottle of water back from the user. This is the lack of isolation between transactions.

As you can see, the lack of isolation in Saga mode has a significant impact. One reference is Huawei's solution: at the business level, add a session and locking mechanism so that operations on a resource are serialized. Resources can also be isolated at the business level by freezing funds in advance, and business operations can obtain the latest updates by reading the current state in time.

These are the main solutions for distributed transactions. Several of them are likely to come up in day-to-day work.
