What is the principle of SequoiaDB distributed transaction implementation?

2025-07-12 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)05/31 Report--

This article explains in detail the principle behind SequoiaDB's distributed transaction implementation, starting with the basic concepts of transactions and then turning to how distributed transactions are realized.

one

Distributed transaction background

As distributed database technology has matured, industry requirements have shifted from peripheral workloads that store and read massive data sets to core transactional business. To serve core accounting transactions, a distributed database must strengthen its distributed transaction support to match that of traditional relational databases. In other words, distributed transactions must satisfy the standard definition of a transaction, the ACID properties, just as transactions in a traditional relational database do.

The data in a distributed database is spread across multiple machines and nodes, which makes distributed transactions much harder to implement. During a transaction, operations are executed at different storage locations according to the data distribution, and those locations sit on different disks of different machines across the network.

two

Basic concept of transaction

2.1 Transaction usage scenario

A banking application is the classic example of why transactions are necessary. Suppose the bank's database has two tables, the checking account table (check) and the savings account table (save). To transfer 200 yuan from LiLei's checking account to her savings account, at least three steps are needed:

1. Check that the balance of the checking account is at least 200 yuan.

2. Subtract 200 yuan from the balance of the checking account.

3. Add 200 yuan to the balance of the savings account.

All of these operations are wrapped in one transaction; if any step fails, every completed step is rolled back. A transaction is usually started with a START TRANSACTION statement and then either committed with COMMIT, which makes the changes permanent, or rolled back with ROLLBACK, which cancels them. A sample transaction in SQL looks like this:

START TRANSACTION;
SELECT balance FROM check WHERE customer_id = 10233276;
UPDATE check SET balance = balance - 200.00 WHERE customer_id = 10233276;
UPDATE save SET balance = balance + 200.00 WHERE customer_id = 10233276;
COMMIT;

This is the kind of transaction a bank must use for money transfers, although transactions in a real production environment are far more complex.
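
The transfer above can be tried out with a minimal sketch that uses Python's built-in sqlite3 module in place of a real bank database. The helper names make_bank and transfer are hypothetical, the balances are made-up integers, and the table name check is quoted because CHECK is an SQL keyword. The point is only to show the commit-or-rollback pattern, not how any particular database implements it.

```python
import sqlite3

def make_bank():
    """Create an in-memory database with the two example tables (illustrative data)."""
    # isolation_level=None lets us issue BEGIN/COMMIT/ROLLBACK ourselves.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute('CREATE TABLE "check" (customer_id INTEGER PRIMARY KEY, balance INTEGER)')
    conn.execute('CREATE TABLE save (customer_id INTEGER PRIMARY KEY, balance INTEGER)')
    conn.execute('INSERT INTO "check" VALUES (10233276, 500)')
    conn.execute('INSERT INTO save VALUES (10233276, 100)')
    return conn

def transfer(conn, customer_id, amount):
    """Move amount from checking to savings; return True if committed, False if rolled back."""
    cur = conn.cursor()
    try:
        cur.execute("BEGIN")  # plays the role of START TRANSACTION
        cur.execute('SELECT balance FROM "check" WHERE customer_id = ?', (customer_id,))
        row = cur.fetchone()
        if row is None or row[0] < amount:
            raise ValueError("insufficient funds")
        cur.execute('UPDATE "check" SET balance = balance - ? WHERE customer_id = ?',
                    (amount, customer_id))
        cur.execute('UPDATE save SET balance = balance + ? WHERE customer_id = ?',
                    (amount, customer_id))
        if cur.rowcount != 1:
            raise ValueError("savings account not found")
        cur.execute("COMMIT")
        return True
    except Exception:
        cur.execute("ROLLBACK")  # undo every step completed so far
        return False
```

If the savings row is missing, the debit that already ran on the checking table is undone by the ROLLBACK, so the checking balance is left unchanged.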

2.2 Transaction concepts and features

A transaction is a sequence of operations that access and modify data items in the database, such as a combination of SQL insert, delete, update and query statements. It is usually delimited by begin transaction and end transaction statements.

A database transaction should have the following properties:

Atomicity: the operations of a transaction either all succeed or all fail in the database.

Consistency: data integrity constraints must hold both before and after the transaction.

Isolation: when multiple users access the database concurrently, the database runs a transaction for each user, and no transaction may be disturbed by the data operations of another. That is, each transaction behaves as if no other transaction were executing concurrently in the system.

Durability: after a transaction completes successfully, its changes to the database must be permanent and must not be lost even if a system failure occurs.

Transaction isolation level

For transaction isolation, the SQL standard defines four isolation levels, each with specific rules about which changes are visible inside and outside a transaction. The four levels are described below:

READ UNCOMMITTED (read uncommitted)

At the READ UNCOMMITTED isolation level, a transaction can "see" the results of other transactions that have not yet committed. Reading uncommitted data is known as a "dirty read".

READ COMMITTED (read committed)

READ COMMITTED is the default isolation level for most database systems. It satisfies the simple definition of isolation given earlier: a transaction can only "see" changes made by transactions that have already committed, and its own changes are invisible to other transactions until it commits. This level does not guarantee repeatable reads: the same statement run twice within one transaction may return different results.

REPEATABLE READ (repeatable read)

The REPEATABLE READ isolation level solves the problems that the lower isolation levels allow, namely dirty reads and non-repeatable reads. It guarantees that rows a transaction has read look the same when the same transaction reads them again. In theory, however, it still allows another thorny problem: the phantom read. Put simply, a phantom read occurs when a transaction reads a range of rows, another transaction inserts a new row into that range, and the first transaction then reads the range again and finds a new "phantom" row. A storage engine can solve the phantom read problem with multi-version concurrency control (Multiversion Concurrency Control), as MySQL's InnoDB and Falcon engines do.

SERIALIZABLE (serializable)

SERIALIZABLE is the highest isolation level. It solves the phantom read problem by forcing transactions to execute in order so that they cannot conflict with each other. In short, SERIALIZABLE places a lock on every row it reads. At this level, timeouts and lock contention can become frequent, so applications rarely choose it. It remains an option when an application must sacrifice concurrency for the sake of data stability.
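
The visibility rules of the first three levels can be compared with a toy model. The sketch below is an illustration only: ToyStore and Reader are hypothetical names, there is a single writer, and real engines enforce these levels with locks or multi-version concurrency control rather than with plain dictionaries.

```python
class ToyStore:
    """One committed copy of the data plus one writer's uncommitted changes."""
    def __init__(self, data):
        self.committed = dict(data)
        self.pending = {}  # uncommitted writes of the (single) writer

    def write(self, key, value):
        self.pending[key] = value

    def commit(self):
        self.committed.update(self.pending)
        self.pending.clear()

class Reader:
    """A reading transaction that applies one isolation level's visibility rule."""
    def __init__(self, store, level):
        self.store = store
        self.level = level
        self.snapshot = dict(store.committed)  # state at transaction start

    def read(self, key):
        if self.level == "READ UNCOMMITTED":
            # Dirty read: uncommitted writes of other transactions are visible.
            return self.store.pending.get(key, self.store.committed[key])
        if self.level == "READ COMMITTED":
            # Only committed data, but re-reading may return a newer value.
            return self.store.committed[key]
        # REPEATABLE READ: always the value as of transaction start.
        return self.snapshot[key]
```

With a balance of 500 and a writer that sets it to 300, the READ UNCOMMITTED reader sees 300 before the commit, the READ COMMITTED reader sees 300 only after the commit, and the REPEATABLE READ reader keeps seeing 500 for the whole transaction.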

three

Distributed transaction

A distributed transaction implementation must guarantee the atomicity, consistency, isolation and durability of transactions. The basic technical ideas for providing these ACID properties are:

Atomicity, consistency and durability are provided by the two-phase commit (Two-phase Commit, 2PC) protocol.

Isolation levels are usually provided by a multi-version concurrency control mechanism. The most common way to implement multi-version concurrency control is the "snapshot isolation" (Snapshot Isolation) technique.
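
As a rough illustration of snapshot isolation, the sketch below keeps a chain of committed versions per key and lets each reader see only the versions committed before its snapshot was taken. MVCCStore and its methods are hypothetical names, and a simple counter stands in for a real timestamp or transaction ID allocator; this is not SequoiaDB's implementation.

```python
import itertools

class MVCCStore:
    """Toy multi-version store: every committed write adds a new version."""
    def __init__(self):
        self.versions = {}               # key -> [(commit_ts, value), ...] in commit order
        self.clock = itertools.count(1)  # stand-in for a global timestamp source

    def commit_write(self, key, value):
        """Record a committed write and return its commit timestamp."""
        ts = next(self.clock)
        self.versions.setdefault(key, []).append((ts, value))
        return ts

    def begin_snapshot(self):
        """Start a reading transaction; it sees only versions committed before this point."""
        return next(self.clock)

    def read(self, key, snapshot_ts):
        """Return the newest version committed before snapshot_ts, or None."""
        visible = [value for ts, value in self.versions.get(key, []) if ts < snapshot_ts]
        return visible[-1] if visible else None
```

A reader that took its snapshot before a later write keeps seeing the older version, while a snapshot taken afterwards sees the new one; neither reader blocks the writer.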

Let's first introduce these two concepts.

3.1 Two-phase commit

Two-phase commit (Two-phase Commit, 2PC) is a protocol designed to keep all nodes in a distributed system consistent when they commit a transaction.

The two-phase commit algorithm rests on the following assumptions:

In the distributed system, one node acts as the transaction coordinator and the other nodes act as transaction managers (participants), and all nodes can communicate with each other over the network.

All nodes use a write-ahead log (Write Ahead Log). Once written, the log is kept on reliable storage, so log data is not lost even if a node is damaged.

No node is damaged permanently; a damaged node can always be recovered.

The two-phase commit algorithm is described below, phase by phase.

Phase I (commit request phase)

The transaction coordinator node asks all transaction manager nodes whether the commit can be performed and then waits for a response from each of them. Each transaction manager node executes all transaction operations issued up to the point of the query and writes Undo and Redo information to its log.

Each transaction manager node then responds to the coordinator's query. If its transaction operations succeeded, it returns an agreement message; if they failed, it returns an abort message. The first phase is sometimes called the voting phase, because each transaction manager votes on whether to go ahead with the commit.
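
The voting phase can be sketched in a few lines. In this sketch, Participant and two_phase_commit are hypothetical names, a list stands in for the write-ahead log, and everything runs in one process with no network or failure handling. The decision step follows the standard 2PC rule: commit only if every vote is an agreement, otherwise roll back everywhere.

```python
class Participant:
    """A transaction manager node in the toy protocol."""
    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit  # whether the local operations succeeded
        self.log = []                 # stand-in for the write-ahead log
        self.state = "INIT"

    def prepare(self):
        """Phase I: log Undo/Redo information, then vote."""
        self.log.append("UNDO/REDO")
        self.state = "PREPARED" if self.can_commit else "ABORTED"
        return self.can_commit

    def commit(self):
        self.state = "COMMITTED"

    def rollback(self):
        self.state = "ROLLED_BACK"

def two_phase_commit(participants):
    """Coordinator: collect every vote, then broadcast the decision."""
    votes = [p.prepare() for p in participants]  # ask all nodes, no short-circuit
    if all(votes):
        for p in participants:
            p.commit()
        return "COMMIT"
    for p in participants:
        p.rollback()
    return "ABORT"
```

A single abort vote is enough to roll back every participant, which is what makes the outcome atomic across nodes.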

Phase II (commit execution phase)

If every transaction manager node returned an agreement message, the transaction coordinator node sends a commit request to all of them; each node completes the commit, releases the resources held during the transaction and acknowledges the coordinator. If any node returned an abort message or failed to respond in time, the coordinator sends a rollback request instead, and each node uses the Undo information in its log to roll the transaction back.

When implementing multi-version concurrency control, SequoiaDB relies not only on transaction locks and the in-memory old-version mechanism but also on a disk rollback segment that completes the concurrency control strategy. Memory is fast, but its capacity is relatively small and its contents are lost on power failure. To address this, the disk rollback segment mechanism persists the "old version data" held in memory to disk, so that abnormal conditions such as a power failure do not disrupt running transactions.

The rollback segment uses a system collection space named "SYSRBS". Internally it uses collections named in the format "SYSRBSXXXX", where XXXX is a cyclic number in the range 0 to 4096. The rollback segment uses the first collection (that is, SYSRBS0000) to store RBS metadata, including the current RBS collection and the last free RBS collection. At startup, SequoiaDB checks whether MVCC is supported; if so, it checks whether the "SYSRBS" collection space exists and, if not, creates it together with the SYSRBSCL0000 and SYSRBSCL0001 collections. If both the collection space and the rollback segment collections already exist, it reads the metadata from SYSRBSCL0000 and creates the next SYSRBSCLXXXX based on the current RBS collection and the last free RBS collection.

To further improve read speed, SequoiaDB combines the disk rollback segment with the in-memory old version: the most recent old version stays attached to the oldversionContainer of the record lock, while older versions live on disk. Most short transactions can therefore be served from the in-memory old version alone, with no disk read, which improves read speed. To handle failure of the primary node, multi-version control must also replicate the rollback segment that records old version data to the standby node. When the standby node is promoted to primary, the old versions can be rebuilt from the rollback segment.
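
The two-tier lookup described above might be sketched as follows. This is a hypothetical illustration, not SequoiaDB code: read_version is an invented name, each version is modeled as a (writer_tx_id, value) pair, and the visibility rule (a version is visible when its writer's transaction ID is not greater than the reader's snapshot ID) is an assumption made only so the example is concrete.

```python
def read_version(current, memory_old, disk_old, snapshot_id):
    """Return (value, source) for the newest version visible to snapshot_id.

    current    -- (writer_tx_id, value) of the live record
    memory_old -- the latest old version kept in memory, or None
    disk_old   -- older versions from the rollback segment, newest first
    """
    # Assumed rule: visible if written by a transaction no newer than the snapshot.
    if current[0] <= snapshot_id:
        return current[1], "current"
    if memory_old is not None and memory_old[0] <= snapshot_id:
        return memory_old[1], "memory"  # the common case for short transactions
    for tx_id, value in disk_old:       # fall back to disk only when needed
        if tx_id <= snapshot_id:
            return value, "disk"
    return None, "none"
```

Under these assumptions a reader whose snapshot is only slightly behind the live record is answered from memory, and only long-running readers pay for a scan of the disk rollback segment.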

An asynchronous background thread reclaims old version records and Inode memory once their transaction ID is less than the global minimum transaction ID (lowTranID). When the in-memory old version is cleaned up, the saved old version is written to the RBS. Disk cleanup starts from the last free collection (lastFreeCL) and compares each collection's maximum transaction ID (MaxGTID) in turn; if it is less than the global minimum transaction ID, the collection (that is, SYSRBSCLXXXX) can be deleted.
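
The disk cleanup rule can be illustrated with a small sketch. The function name reclaim_rbs_collections and the list-of-pairs representation are hypothetical, the cyclic numbering of collections is ignored, and stopping at the first collection that is still needed is an assumption of this sketch rather than something the description above states.

```python
def reclaim_rbs_collections(collections, last_free, low_tran_id):
    """Decide which rollback segment collections can be dropped.

    collections -- list of (name, max_gtid) pairs, oldest first
    last_free   -- index of the last free collection (scan start)
    low_tran_id -- global minimum transaction ID still active
    Returns (names_to_drop, new_last_free_index).
    """
    dropped = []
    i = last_free
    # A collection is obsolete once every transaction it holds versions for
    # is older than the oldest active transaction.
    while i < len(collections) and collections[i][1] < low_tran_id:
        dropped.append(collections[i][0])
        i += 1
    return dropped, i
```

For example, with maximum transaction IDs of 90, 120 and 200 and a global minimum of 150, the first two collections are dropped and the scan stops at the third.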

SequoiaDB implements multi-version concurrency control through a design that combines transaction locks, in-memory old versions and a disk rollback segment for rebuilding old versions. By making careful use of memory structures, this design stores old version information for both data and indexes and thereby provides fast concurrent access to multi-version data.

That concludes this overview of the principle behind SequoiaDB's distributed transaction implementation.
