The cause of Database distributed transaction 07/11 Update SLTechnology News&Howtos

The cause of Database distributed transaction

2025-07-11 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Internet Technology >

Shulou(Shulou.com)06/02 Report--

This article mainly explains "the causes of database distributed transactions". The content of the explanation in this article is simple and clear, and it is easy to learn and understand. Please follow the editor's train of thought. Let's study and learn the causes of database distributed transactions.

Database distributed transaction

Distributed transaction means that the participants of the transaction, the server supporting the transaction, the resource server and the transaction manager are located on different nodes of different distributed systems. The above is the explanation of Baidu Encyclopedia. To put it simply, a large operation consists of different small operations, which are distributed on different servers and belong to different applications. Distributed transactions need to ensure that these small operations either succeed or fail. In essence, distributed transaction is to ensure the data consistency of different databases.

ACID characteristics of transactions

An atomicity: all operations in the entire transaction are either completed or not done at all, with no intermediate state. When an error occurs in the execution of a transaction, all operations are rolled back as if the entire transaction had never been executed.

C consistency: the execution of the transaction must ensure the consistency of the system. Take the transfer as an example. If A has 500 yuan and B has 300 yuan, if An is successfully transferred to B50 in a transaction, then no matter how much concurrency, no matter what happens, as long as the transaction is executed successfully, then the final An account must be 450 yuan and the B account must be 350 yuan.

I isolation: the so-called isolation means that transactions do not affect each other, and the intermediate state of one transaction is not perceived by other transactions.

D persistence: the so-called persistence means that when a single transaction is completed, the changes made by the transaction to the data are completely saved in the database, even if there is a power outage and the system goes down.

Distributed theory

When there is a bottleneck in the performance of our single database, we may partition the database. The partition here refers to the physical partition. After the partition, different libraries may be on different servers. At this time, the ACID of a single database can no longer adapt to this situation. In this ACID cluster environment, it is almost difficult to ensure that the ACID of the cluster can be achieved. Or even if the efficiency and performance can be greatly reduced, the most important thing is that it is very difficult to expand new partitions. At this time, if we pursue the ACID of the cluster, our system will become very poor. Then we need to introduce a new theoretical principle to adapt to the situation of this cluster, that is, the CAP principle or the CAP theorem. What does the CAP theorem refer to?

CAP theorem

The CAP theorem was proposed by Professor Eric Brewer of the University of California, Berkeley, who pointed out that WEB services cannot satisfy the following three attributes at the same time:

Consistency: the client knows that a series of operations will occur at the same time (effective)

Availability (Availability): each operation must end with a predictable response

Partition fault tolerance (Partition tolerance): the operation can be completed even if a single component is not available

Specifically, in a distributed system, in any database design, a Web application can only support at most two of the above attributes. Obviously, any scale-out strategy depends on data partitioning. Therefore, the designer must choose between consistency and usability.

This theorem is applicable to all distributed systems so far! Why would you say that?

At this time, some students may bring out the 2PC (two-phase commit) of the database to speak. OK, let's take a look at the two-phase commit of the database.

Students who have knowledge of database distributed transactions must know the 2PC supported by the database, also known as XA Transactions.

MySQL has been supported since version 5.5, SQL Server 2005, and Oracle 7.

Among them, XA is a two-phase commit protocol, which is divided into the following two phases:

The first phase: the transaction coordinator requires each database involved in the transaction to pre-commit (precommit) this operation and reflects whether it can be committed.

The second phase: the transaction coordinator requires each database to submit data.

Among them, if any database rejects the commit, all databases will be required to roll back that part of their information in the transaction. What is the flaw in doing so? At first glance, we can get consistency between database partitions.

If the CAP theorem is true, then it must affect usability.

If the availability of the system represents the sum of the availability of all components related to the execution of an operation. Then during the two-phase commit process, availability represents the sum of availability in each database involved. Let's assume that each database has 99.9% availability during a two-phase commit, so if two databases are involved in the two-phase commit, the result is 99.8%. According to the system availability calculation formula, assuming 43200 minutes per month, 99.9% availability is 43157 minutes, 99.8% availability is 43114 minutes, which is equivalent to an increase of 43 minutes per month downtime.

Above, it can be verified that the CAP theorem is theoretically correct. CAP, we'll see here first, and we'll talk about it later.

BASE theory

In distributed systems, what we often pursue is availability, and its important programs are higher than consistency, so how to achieve high availability? Predecessors have put forward another theory for us, that is, BASE theory, which is used to further extend the CAP theorem. BASE theory refers to:

Basically Available (basic available)

Soft state (soft state)

Eventually consistent (final consistency)

BASE theory is the result of a tradeoff between consistency and usability in CAP. The core idea of the theory is: we can not achieve strong consistency, but each application can make the system achieve final consistency (Eventual consistency) in an appropriate way according to its own business characteristics.

With the above theory, let's look at the problem of distributed transactions.

The cause of distributed transaction database sub-database and sub-table

When the data produced by a single table in a database exceeds 1000W in a year, then it is necessary to consider sub-database and sub-table. The specific principle of sub-database and sub-table is not explained here. In the future, it is free to say in detail that the original database has become multiple databases. At this time, if an operation accesses both 01 and 02 libraries, and wants to ensure data consistency, then a distributed transaction is used.

Application of SOA

The so-called SOA is the service of the business. For example, the original stand-alone supported the entire e-commerce site, now the entire site is dismantled, separated from the order center, user center, inventory center. For the order center, there is a special database to store order information, the user center also has a special database to store user information, and the inventory center also has a special database to store inventory information. At this time, if you want to operate the order and inventory at the same time, then it will involve the order database and inventory database, in order to ensure data consistency, you need to use distributed transactions.

The appearance of the above two cases is different, but the essence is the same, both because there are more databases to operate!

Distributed transaction solution

Two-phase commit based on XA protocol

XA is a distributed transaction protocol proposed by Tuxedo. XA is roughly divided into two parts: transaction manager and local resource manager. Among them, the local resource manager is often implemented by the database, such as Oracle, DB2 and other commercial databases all implement the XA interface, while the transaction manager, as the global scheduler, is responsible for the commit and rollback of local resources. The principles of XA to implement distributed transactions are as follows:

Generally speaking, the XA protocol is relatively simple, and once the commercial database implements the XA protocol, the cost of using distributed transactions is relatively low. However, XA also has a fatal disadvantage, that is, the performance is not ideal, especially in the transaction order link, the concurrency is often very high, XA can not meet the high concurrency scenario. At present, the support of XA in commercial database is ideal, but it is not ideal in mysql database. The XA implementation of mysql does not record the log of prepare phase, and switching back between master and slave leads to data inconsistency between master database and slave database. Many nosql also do not support XA, which makes the application scenario of XA very narrow.

Advantages and disadvantages

Advantages: try to ensure the strong consistency of data, which is suitable for key areas with high requirements for strong consistency of data. (in fact, there is no 100% guarantee of strong consistency.)

Disadvantages: complex implementation, sacrifice availability, have a great impact on performance, and are not suitable for high-concurrency and high-performance scenarios.

Message transaction + final consistency

The so-called message transaction is a two-phase commit based on message middleware, which is essentially a special use of message middleware. It places local transactions and sending messages in a distributed transaction to ensure that either local operations are successful and outgoing messages are successful, or both fail. Open source RocketMQ supports this feature. The specific principles are as follows:

1. System A sends a preparatory message to the message middleware

2. Message middleware saves the prepared message and returns success.

3. An executes local transactions

4. A sends a submission message to the message middleware

A message transaction is completed through the above four steps. For each of the above four steps, errors may occur. Analyze them one by one:

If there is an error in step 1, the whole transaction fails and the local operation of An is not performed.

If an error occurs in step 2, the whole transaction fails and the local operation of An is not performed.

There is an error in step 3. At this time, you need to roll back the preparation message. How to roll back? The answer is that system An implements a callback interface of message middleware, and the message middleware will continue to execute the callback interface to check whether the execution of A transaction is successful, and if it fails, roll back the prepared message.

Step 4 error, at this time A's local transaction is successful, so does the message middleware want to roll back A? The answer is no. In fact, through the callback interface, the message middleware can check that A has successfully executed. At this time, there is no need for A to send and submit messages. Message middleware can commit the message itself, thus completing the entire message transaction.

Two-phase commit based on message middleware is often used in high concurrency scenarios to split a distributed transaction into a message transaction (local operation of system A + sending message) + local operation of system B, where the operation of system B is message-driven. As long as the message transaction is successful, then the An operation must be successful and the message must be sent. At this time, B will receive the message to perform the local operation, if the local operation fails. The message is reinvested until the B operation is successful, thus realizing the distributed transaction between An and B in disguise. The principle is as follows:

Although the above scheme can complete the operation of An and B, but An and B are not strictly consistent, but ultimately consistent, we sacrifice consistency here in exchange for a significant improvement in performance. Of course, this kind of play also has risks, if B has been unsuccessful, then the consistency will be broken, whether to play or not depends on how much risk the business can bear.

Local message table (asynchronously guaranteed)

The implementation of local message table should be the most widely used in the industry, and its core idea is to split the distributed transaction into local transactions for processing. This idea comes from ebay. We can see some of the details in the following flowchart:

The basic idea is:

The message producer needs to build an additional message table and record the message delivery status. Message tables and business data are committed in a transaction, that is, they are in a database. The message is then sent to the consumer of the message via MQ. If the message fails to send, it will retry to send.

The message consumer needs to process the message and complete its own business logic. At this point, if the local transaction is successful, it indicates that the processing has been successful, and if the processing fails, the execution will be retried. If it is a failure of the business, you can send a business compensation message to the manufacturer to notify the manufacturer to roll back and other operations.

The producer and consumer scan the local message table regularly and send the unfinished message or failed message again. If there is a reliable automatic reconciliation logic, this scheme is still very practical.

This solution follows the BASE theory and adopts the ultimate consistency. The author believes that one of these solutions is more suitable for the actual business scenario, that is, there will not be a complex implementation like 2PC (when the call chain is very long, the availability of 2PC is very low), and it is not possible to confirm or rollback like TCC.

Advantages: a very classic implementation that avoids distributed transactions and achieves ultimate consistency.

Disadvantages: message tables are coupled to the business system, and if there is no encapsulated solution, there will be a lot of chores to deal with.

MQ transaction message

There are some third-party MQ that support transaction messages, such as RocketMQ, which supports transaction messages in a way similar to the two-phase commit adopted, but some mainstream MQ in the market do not support transaction messages, such as RabbitMQ and Kafka.

Take Ali's RocketMQ middleware as an example, the idea is roughly as follows:

In the first stage of Prepared message, you will get the address of the message.

The second phase executes the local transaction, and the third phase accesses the message and modifies the state through the address obtained in the first stage.

That is, within the business method, you have to submit two requests to the message queue, one to send a message and one to confirm. If the delivery of the confirmation message fails, RocketMQ will periodically scan the transaction messages in the message cluster, and when the Prepared message is found, it will confirm to the message sender, so the producer needs to implement a check interface, and RocketMQ will decide whether to roll back or continue to send the confirmation message according to the policy set by the sender. This ensures that message delivery succeeds or fails at the same time as the local transaction.

Advantages: ultimate consistency is achieved without relying on local database transactions.

Disadvantages: difficult to implement, mainstream MQ does not support, there is no .NET client, RocketMQ transaction message part of the code is not open source.

TCC programming mode

The so-called TCC programming mode is also a variation of two-phase commit. TCC provides a programming framework that divides the entire business logic into three parts: Try, Confirm and Cancel. For example, if you place an order online, the Try phase will deduct the inventory, and the Confirm phase will update the order status. If you fail to update the order, you will enter the Cancel phase and restore the inventory. In short, TCC implements two-phase commit artificially through code, and different business scenarios write different code and complexity, so this pattern can not be well reused.

The core idea of the compensation mechanism adopted by TCC is to register a corresponding confirmation and compensation (revocation) operation for each operation. It is divided into three stages:

The Try phase is mainly about testing the business system and reserving resources.

The main purpose of the Confirm phase is to confirm the submission of the business system. When the Try phase is executed successfully and the Confirm phase is started, the default Confirm phase will not go wrong. That is, as long as Try succeeds, Confirm will succeed.

The Cancel phase mainly cancels the business executed in the case of business execution error and needs to be rolled back, and reserves resources to be released.

For example, if you want to transfer money to Smith when you enter Bob, the idea is probably:

We have a local method that calls the

1. First of all, in the Try phase, you need to call the remote interface to freeze the money of Smith and Bob.

2. In the Confirm phase, the remote call transfer operation is performed, and the transfer is thawed successfully.

3. If the second step is successful, the transfer is successful. If the second step fails, the thawing method (Cancel) corresponding to the remote freeze API is called.

Advantages and disadvantages

Pros: compared with 2PC, the implementation and process are relatively simple, but the data consistency is also worse than 2PC.

Disadvantages: the shortcomings are still quite obvious, and it is possible to fail in the 3 steps of 2Jing. TCC is a compensation method in the application layer, so programmers need to write a lot of compensation code when implementing it. In some scenarios, some business processes may not be easy to define and handle with TCC.

Summary

Distributed transaction, in essence, is the unified control of multiple database transactions, which can be divided into non-control, partial control and complete control according to the intensity of control. No control means no introduction of distributed transactions, partial control is the two-phase commit of various variants, including the message transaction + final consistency and TCC mode mentioned above, while complete control is the full realization of two-phase commit. The advantage of partial control is that the concurrency and performance are good, but the disadvantage is that data consistency is weakened, while full control sacrifices performance to ensure consistency. Which way is used ultimately depends on the business scenario. As a technical staff, we must not forget that technology is for business services, do not technology for the sake of technology, for different business technology selection is also a very important ability!

Summarize the solution:

1. The final consistency of reliable messages combined with MQ message middleware.

II. TCC compensatory transaction solution

III. Best effort notification scheme

The first scheme: the final consistency of reliable messages, which needs to be implemented by the business system combined with MQ message middleware, and the successful delivery and consumption of messages need to be ensured in the process of implementation. That is, the message status of MQ needs to be controlled through the business system.

The second scheme: TCC transaction compensation, which is divided into three stages of TRYING-CONFIRMING-CANCELING. Each stage is dealt with differently.

The TRYING phase is mainly about business system detection and resource reservation.

The CONFIRMING phase is to make a business submission. After the successful execution of the TRYING phase, this phase is executed. By default, if the TRYING phase is successful, CONFIRMING is sure to succeed.

The CANCELING phase is to roll back the business. In the TRYING phase, if there is a branch transaction TRYING failure, you need to call CANCELING to release the reserved resources.

All of the above operations need to be idempotent. Idempotency can be realized by:

1. Handle with the unique key value, that is, pass in the unique key value each time, and judge whether the business is operated by the unique key value. If it has been operated, the operation will not be repeated.

2. Set the status of the business data through the state machine processing, and judge whether repeated execution is required by the business state.

The third scheme: best effort notification (notify according to the law, there is no guarantee that the data will be notified successfully, but will provide a searchable operation interface for checking) this scheme is mainly used when communicating with third-party systems. for example: call Wechat or Alipay payment result notice. This scheme is also implemented with MQ, for example, sending http requests through MQ and setting the maximum number of notifications. When the number of notifications is reached, there will be no further notification.

Thank you for your reading. the above is the content of "the causes of database distributed transactions". After the study of this article, I believe you have a deeper understanding of the causes of database distributed transactions. The specific use of the situation also needs to be verified by practice. Here is, the editor will push for you more related knowledge points of the article, welcome to follow!

Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.

*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.