Introduction to two-phase commit of distributed database transactions

2025-04-15 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)06/01 Report--

In a distributed system, the nodes are physically independent and communicate and coordinate over the network. Thanks to its local transaction mechanism, each node can guarantee that its own data operations satisfy ACID. However, a node cannot know for certain how a transaction has fared on the other nodes, so on their own two machines cannot be guaranteed to reach a consistent state. To keep data consistent across multiple machines in a distributed deployment, the writes on all nodes must either all be performed or not be performed at all. But when one machine executes its local transaction, it cannot know the result of the local transaction on another machine, so it does not know whether the transaction should be committed or rolled back. The conventional solution is therefore to introduce a "coordinator" component that schedules the execution of all distributed nodes in a unified manner.

2PC

Two-phase commit (Two-Phase Commit, 2PC) is an algorithm, used in computer networking and databases, that keeps all nodes of a distributed system consistent when they commit a transaction. It is also commonly referred to as a protocol. In a distributed system, each node knows whether its own operation succeeded or failed, but not whether the operations on other nodes did. When a transaction spans multiple nodes, preserving its ACID properties requires a coordinator component that collects the operation results of all nodes (called participants) and then tells them whether to actually commit those results (for example, by writing the updated data to disk). The idea of two-phase commit can therefore be summarized as follows: the participants notify the coordinator whether their operations succeeded or failed, and the coordinator then decides, based on the feedback from all participants, whether every participant should commit or abort.
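The decision rule summarized above can be sketched in a few lines of Python. This is only an illustration: the names `Vote`, `Decision` and `decide` are hypothetical, and a participant whose reply never arrived is simply treated as missing from the `votes` dictionary.

```python
from enum import Enum


class Vote(Enum):
    AGREE = "agree"
    ABORT = "abort"


class Decision(Enum):
    COMMIT = "commit"
    ROLLBACK = "rollback"


def decide(votes, expected_participants):
    """Return the coordinator's decision for one transaction.

    votes: dict mapping participant name -> Vote for the replies that arrived.
    expected_participants: names of every participant in the transaction.
    A participant missing from `votes` is treated as having timed out.
    """
    for name in expected_participants:
        if votes.get(name) is not Vote.AGREE:
            return Decision.ROLLBACK
    return Decision.COMMIT


print(decide({"db1": Vote.AGREE, "db2": Vote.AGREE}, ["db1", "db2"]).value)  # commit
print(decide({"db1": Vote.AGREE}, ["db1", "db2"]).value)                     # rollback
```

The point of the sketch is that a single missing or negative vote is enough to roll back the whole transaction.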

The two phases are: first, the preparation phase (voting phase), and second, the commit phase (execution phase).

Preparation phase

The transaction coordinator (transaction manager) sends a Prepare message to each participant (resource manager). Each participant either returns a failure directly (for example, because permission verification failed) or executes the transaction locally, writing its local redo and undo logs but not committing, so that everything is ready and only the final commit instruction is missing.

The preparation phase can be further divided into the following three steps:

1) The coordinator node asks all participant nodes whether they can commit (the vote) and starts waiting for a response from each participant node.

2) Each participant node executes all transaction operations up to the point at which the query was issued, and writes Undo and Redo information to its log. (Note: if this succeeds, each participant has in fact already executed the transaction operations.)

3) Each participant node responds to the coordinator node's query. If its transaction operations succeeded, it returns an "agree" message; if they failed, it returns an "abort" message.
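The participant side of these three steps might look like the following minimal sketch. The `Participant` class is hypothetical: a dictionary stands in for the local database, two lists stand in for the Undo and Redo logs, and writing to a missing key stands in for a failed operation.

```python
class Participant:
    """Hypothetical in-memory participant; a dict stands in for the database."""

    def __init__(self, name, data):
        self.name = name
        self.data = dict(data)
        self.undo_log = []   # (key, old_value) pairs, used to roll back
        self.redo_log = []   # (key, new_value) pairs, used to replay after a crash
        self.state = "idle"

    def prepare(self, writes):
        """Apply `writes` tentatively and vote; nothing is committed yet."""
        try:
            for key, new_value in writes.items():
                if key not in self.data:
                    raise KeyError(key)          # stands in for a failed operation
                self.undo_log.append((key, self.data[key]))
                self.redo_log.append((key, new_value))
                self.data[key] = new_value
        except KeyError:
            # Undo any partial work before voting to abort.
            for key, old_value in reversed(self.undo_log):
                self.data[key] = old_value
            self.undo_log.clear()
            self.redo_log.clear()
            self.state = "aborted"
            return "abort"
        self.state = "prepared"
        return "agree"


p = Participant("db1", {"balance": 100})
print(p.prepare({"balance": 70}), p.state, p.undo_log)
```

After a successful `prepare`, the new value is already in place locally, which matches the note in step 2, but the participant still waits in the "prepared" state for the coordinator's decision.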

Commit phase

If the coordinator receives a failure message from any participant or times out, it sends a Rollback message to every participant; otherwise, it sends a Commit message. Each participant then commits or rolls back as instructed and releases all lock resources used during the transaction. (Note: lock resources must be released in this final phase.)

Next, the commit phase is discussed in two cases.

When the response received by the coordinator node from every participant node is "agree":

1) The coordinator node issues a "commit" request to all participant nodes.

2) Each participant node formally completes the operation and releases the resources held during the entire transaction.

3) Each participant node sends a "done" message to the coordinator node.

4) After receiving the "done" message from all participant nodes, the coordinator node completes the transaction.

If any participant node returned "abort" in the first phase, or if the coordinator node did not receive responses from all participant nodes before the first-phase query timed out:

1) The coordinator node issues a "rollback" request to all participant nodes.

2) Each participant node performs a rollback using the previously written Undo information and releases the resources held during the entire transaction.

3) Each participant node sends a "rollback complete" message to the coordinator node.

4) After receiving the "rollback complete" message from all participant nodes, the coordinator node cancels the transaction.

Regardless of the final outcome, the second phase ends the current transaction.
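Both cases of the second phase can be sketched together in Python. This is a single-process illustration under simplifying assumptions: `Node` and `run_transaction` are hypothetical names, method calls stand in for network messages, and timeouts are not modeled.

```python
class Node:
    """Hypothetical participant that votes as configured and tracks its locks."""

    def __init__(self, name, value, will_agree=True):
        self.name = name
        self.value = value
        self.will_agree = will_agree
        self.undo = None          # holds (old_value,) while a change is pending
        self.locked = False

    def prepare(self, new_value):
        self.locked = True                # resources stay locked until phase two
        if not self.will_agree:
            return "abort"
        self.undo = (self.value,)         # Undo information
        self.value = new_value            # executed but not yet committed
        return "agree"

    def commit(self):
        self.undo = None
        self.locked = False
        return "done"

    def rollback(self):
        if self.undo is not None:
            self.value = self.undo[0]     # restore from the Undo information
            self.undo = None
        self.locked = False
        return "rollback complete"


def run_transaction(nodes, new_value):
    # Phase one: collect a vote from every participant.
    votes = [node.prepare(new_value) for node in nodes]
    # Phase two: commit only if every vote is "agree".
    if all(vote == "agree" for vote in votes):
        acks = [node.commit() for node in nodes]
        return "committed", acks
    acks = [node.rollback() for node in nodes]
    return "cancelled", acks


print(run_transaction([Node("a", 1), Node("b", 1)], 2))
print(run_transaction([Node("c", 1), Node("d", 1, will_agree=False)], 2))
```

In either branch every node releases its lock only in phase two, which is the behavior described in the note about lock resources.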

Two-phase commit does appear to provide atomicity, but unfortunately it has several disadvantages:

1. Synchronous blocking. While the protocol runs, all participating nodes block on the transaction. While participants hold shared resources, any other node that needs those resources has to wait.

2. Single point of failure. The coordinator is critical: once it fails, the participants block indefinitely. This is especially serious in the second phase, where a coordinator failure leaves all participants holding locks on transaction resources and unable to complete the transaction. (If the coordinator goes down, a new coordinator can be elected, but that does not resolve the blocking of participants caused by the original coordinator's failure.)
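The blocking described in the first two points can be illustrated with a small sketch. `PreparedParticipant` is a hypothetical class, and a `threading.Lock` stands in for the locked transaction resources; the sketch assumes the participant has no way to leave the "prepared" state without a decision from a coordinator.

```python
import threading


class PreparedParticipant:
    """Hypothetical participant that holds a lock from prepare until phase two."""

    def __init__(self):
        self.row_lock = threading.Lock()
        self.state = "idle"

    def prepare(self):
        self.row_lock.acquire()
        self.state = "prepared"
        return "agree"

    def on_decision(self, decision):
        # Only a coordinator decision moves the participant out of "prepared".
        self.state = "committed" if decision == "commit" else "rolled back"
        self.row_lock.release()

    def try_other_transaction(self):
        """Another transaction tries to take the same lock without waiting."""
        if self.row_lock.acquire(blocking=False):
            self.row_lock.release()
            return True
        return False


p = PreparedParticipant()
p.prepare()
# The coordinator fails here, so no decision arrives.
print(p.try_other_transaction())   # False: the lock is still held
p.on_decision("commit")            # a decision finally arrives
print(p.try_other_transaction())   # True
```

Until `on_decision` is called, every other transaction that needs the same resource is turned away, no matter how long the coordinator stays down.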

3. Data inconsistency. In the second phase, if a local network fault occurs after the coordinator starts sending commit requests, or the coordinator fails while sending them, only some of the participants receive the commit request. Those participants commit, while the others, which never received the request, cannot. The distributed system as a whole is then left with inconsistent data.
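A small simulation makes the partial-delivery scenario concrete. The function names and the `reachable` set are assumptions made for illustration; real systems would detect delivery through acknowledgements rather than a predefined set.

```python
def deliver_commit(participants, reachable):
    """Send "commit" to prepared participants; only `reachable` ones receive it.

    participants: dict name -> state, each starting as "prepared".
    reachable: set of names the commit request actually reaches.
    Returns a new dict with the resulting states.
    """
    result = {}
    for name, state in participants.items():
        if state == "prepared" and name in reachable:
            result[name] = "committed"
        else:
            result[name] = state
    return result


def is_consistent(states):
    """True when every participant ended in the same state."""
    return len(set(states.values())) <= 1


before = {"db1": "prepared", "db2": "prepared", "db3": "prepared"}
after = deliver_commit(before, reachable={"db1"})
print(after)
print(is_consistent(after))   # False
```

Here `db1` has committed while `db2` and `db3` are still waiting in the prepared state, which is exactly the split described above.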

4. A problem that two-phase commit cannot solve: the coordinator goes down after sending a commit message, and the only participant that received the message goes down as well. Even if a new coordinator is chosen through an election protocol, the state of the transaction is uncertain, and no one knows whether it was committed.
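The uncertainty can be seen by sketching what a newly elected coordinator might do when it polls the participants. `recover_outcome` is a hypothetical helper, and the final branch is an assumption of this sketch rather than something the protocol prescribes.

```python
def recover_outcome(states):
    """Guess the outcome of an in-doubt transaction from participant states.

    states: dict name -> "prepared", "committed", "rolled back",
            or None when the participant is unreachable.
    """
    known = [s for s in states.values() if s is not None]
    if "committed" in known:
        return "commit"
    if "rolled back" in known:
        return "rollback"
    if len(known) < len(states):
        # An unreachable participant may already have committed.
        return "unknown"
    # Assumption of this sketch: everyone is reachable and still prepared,
    # so no participant has acted on a decision yet.
    return "rollback"


# db1 received the commit message and then went down with the coordinator.
print(recover_outcome({"db1": None, "db2": "prepared", "db3": "prepared"}))  # unknown
```

With `db1` unreachable, the surviving participants give no evidence either way, so the new coordinator cannot safely choose between commit and rollback.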
