How to solve the XA consistency problem of distributed transactions

Updated: 2025-02-24. Source: SLTechnology News&Howtos (Shulou.com)

Many newcomers are unsure how to solve the XA consistency problem of distributed transactions. This article explains the problem and one solution in detail.

Large business systems have many users and high concurrency, a load that a centralized (single-node) database struggles to sustain, so mainstream Internet companies often adopt distributed database architectures. Physically, these use larger numbers of low-end machines; logically, large tables must be split horizontally to support the business.

Although a distributed database can solve the performance problem, transaction consistency is hard to achieve on it.

The problem with distributed transactions: data consistency is hard to achieve

In a distributed database, an update made by one transaction is carried out by multiple independent data nodes (the local transaction on each node is a branch of the global transaction). Some branches may fail to commit while others succeed, which leaves the data inconsistent.

The industry has long had a theoretical solution to this problem, the two-phase commit protocol (2PC), which has been extended into the XA approach to distributed transactions. However, there are few examples of engineering implementations deployed at scale. Tencent Cloud's distributed database DCDB is one of them: it has been used in Tencent's internal business for many years.

(figure: two-phase commit algorithm)
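
As a minimal illustration of the two-phase commit idea, the Python sketch below models a coordinator that asks every participant to prepare and commits only if all of them vote yes. The `Participant` class and the `two_phase_commit` function are hypothetical names chosen for illustration; they are not part of DCDB or MySQL, and the sketch ignores timeouts and crashes.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Participant:
    """A toy data node; `can_prepare` simulates whether its local branch succeeds."""
    name: str
    can_prepare: bool = True
    state: str = "active"  # active -> prepared -> committed / rolled_back

    def prepare(self) -> bool:
        # Phase 1: the node votes. A "yes" vote is a promise that it can commit later.
        if self.can_prepare:
            self.state = "prepared"
            return True
        return False

    def commit(self) -> None:
        self.state = "committed"

    def rollback(self) -> None:
        self.state = "rolled_back"


def two_phase_commit(participants: List[Participant]) -> str:
    """Return "COMMIT" if every participant prepared, otherwise "ABORT"."""
    # Phase 1: collect a vote from every participant.
    votes = [p.prepare() for p in participants]
    # Phase 2: commit only on a unanimous yes; otherwise roll everyone back.
    if all(votes):
        for p in participants:
            p.commit()
        return "COMMIT"
    for p in participants:
        p.rollback()
    return "ABORT"
```

A single "no" vote is enough to roll back every branch, which is what keeps the nodes consistent with one another.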

At present, DCDB carries more than 90% of Tencent's internal trading and billing business. Sany Heavy Industry (Rootcloud), Huitong World (G7), China Literature Group (Qidian, Chuangshi Chinese Network and others), WeBank, Hetai Life Insurance, Weifutong and other companies also use the product.

Tencent Cloud launches distributed database XA with MySQL 5.7 support

Tencent Cloud DCDB is a distributed database compatible with the MySQL protocol. It is a cloud adaptation of Tencent's financial database (internal code name TDSQL). DCDB now officially supports XA distributed transactions on the MySQL 5.7 (Percona branch) protocol, and the feature has been released to developers on Tencent Cloud's public cloud and finance cloud. After applying for a DCDB instance and initializing it, connect to the instance and run the following SQL to initialize XA:

mysql> xa init;

Query OK, 0 rows affected (0.03 sec)

Note: enable strong synchronous replication before initializing XA. The statement also creates the table xa.gtid_log_t, which users must not modify in any way afterwards.

To better support distributed transactions, DCDB also adds the following SQL commands:

1) SELECT gtid(): returns the gtid (globally unique transaction identifier) of the current distributed transaction, or NULL if the transaction is not a distributed transaction.

The gtid format is:

'gateway id'-'gateway random value'-'serial number'-'timestamp'-'partition number', for example c46535fe-b6-dd-595db6b8-25

2) SELECT gtid_state("gtid"): returns the status of the given gtid. Possible results are:

A) "COMMIT": the transaction has been or will eventually be committed.

B) "ABORT": the transaction will eventually be rolled back.

C) NULL. Because transaction status is cleared after one hour, there are two possibilities:

1) the query was made more than an hour later, and the status has already been cleared;

2) the query was made within an hour, which means the transaction will eventually be rolled back.

3) Operation and maintenance commands:

xa recover: sends xa recover to the back-end SETs and aggregates the results

xa lockwait: displays the wait-for relationships among current distributed transactions (the output can be converted to a wait-for graph with the dot command)

xa show: lists the distributed transactions running on the current gateway
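
The gtid format and the meaning of each gtid_state result can be captured in a few lines of client-side Python. This is a sketch under stated assumptions: `Gtid`, `parse_gtid` and `interpret_gtid_state` are hypothetical helper names, the five fields are kept as opaque strings because the text does not say how they are encoded, and the one-hour retention figure is taken from the description above.

```python
from typing import NamedTuple, Optional


class Gtid(NamedTuple):
    """The five dash-separated fields of a gtid, kept as raw strings."""
    gateway_id: str
    gateway_random: str
    serial: str
    timestamp: str
    partition: str


def parse_gtid(gtid: str) -> Gtid:
    """Split a gtid such as 'c46535fe-b6-dd-595db6b8-25' into its fields."""
    parts = gtid.split("-")
    if len(parts) != 5:
        raise ValueError(f"expected 5 dash-separated fields, got {len(parts)}")
    return Gtid(*parts)


# The text says transaction status is cleared after one hour.
STATE_RETENTION_SECONDS = 3600


def interpret_gtid_state(state: Optional[str], seconds_since_query_target: float) -> str:
    """Map a gtid_state() result to 'committed', 'rolled_back' or 'unknown'.

    `seconds_since_query_target` is the age of the transaction as known to the
    caller (an assumption: the application records when it tried to commit).
    """
    if state == "COMMIT":
        return "committed"
    if state == "ABORT":
        return "rolled_back"
    if state is None:
        # NULL within the retention window means the transaction will be rolled back;
        # after the window the status has been cleared and nothing can be concluded.
        if seconds_since_query_target < STATE_RETENTION_SECONDS:
            return "rolled_back"
        return "unknown"
    raise ValueError(f"unexpected gtid_state result: {state!r}")
```

The "unknown" branch is the reason an application that cares about the outcome should query gtid_state well within the hour.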

Taking Python as an example, the transfer business can be coded as follows:

import pymysql

def transfer():
    # testHost, testPort, testUser, testPassword and testDatabase are
    # placeholders for your own connection settings, defined elsewhere.
    db = pymysql.connect(host=testHost, port=testPort, user=testUser,
                         password=testPassword, database=testDatabase)
    cursor = db.cursor()
    try:
        cursor.execute("begin")
        # Subtract 1 from Bob's balance
        query = "update t_user_balance set balance = balance - 1 where user='Bob' and balance > 1"
        affected = cursor.execute(query)
        if affected == 0:
            # Insufficient balance: roll back the transaction
            cursor.execute("rollback")
            return
        # Add 1 to John's balance
        query = "update t_user_balance set balance = balance + 1 where user='John'"
        cursor.execute(query)
        # For safety, it is recommended to execute 'SELECT gtid()' here to get the
        # id of the current transaction, so that its outcome can be tracked later
        # Commit the transaction
        cursor.execute("commit")
    except pymysql.err.MySQLError as e:
        # Failed: roll back the transaction
        cursor.execute("rollback")
    finally:
        db.close()
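
The comment in the example above recommends reading the gtid before committing. The sketch below shows one way that could look. It assumes a DB-API style cursor such as the one PyMySQL provides, and it assumes that `select gtid()` and `select gtid_state(...)` each return a single row with a single column; `commit_with_gtid` and `resolve_outcome` are hypothetical helper names.

```python
def commit_with_gtid(cursor):
    """Record the gtid, then commit. Returns (gtid, committed_ok).

    If the commit call raises (for example because the connection dropped),
    the outcome is unknown to the client and must be resolved with the gtid.
    """
    cursor.execute("select gtid()")
    row = cursor.fetchone()
    gtid = row[0] if row else None  # None: not a distributed transaction
    try:
        cursor.execute("commit")
    except Exception:
        return gtid, False
    return gtid, True


def resolve_outcome(cursor, gtid):
    """Ask for the final state of `gtid`, typically over a fresh connection.

    Returns "COMMIT", "ABORT" or None, as described for gtid_state above.
    """
    cursor.execute("select gtid_state(%s)", (gtid,))
    row = cursor.fetchone()
    return row[0] if row else None
```

With these helpers, a failed commit call no longer leaves the application guessing: it reconnects, calls `resolve_outcome` with the saved gtid, and retries the transfer only if the result shows the transaction was rolled back.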

The advantage of distributed transactions is that they greatly reduce the difficulty of application development. On databases that do not support XA, the business system itself must be carefully designed to handle data inconsistency within a transaction, instead of relying on the database. This demands a high level of skill from application developers: the more complex the business system, the higher the development cost and technical threshold, which is the main reason most developers shy away from distributed databases.

Key implementation details of Tencent Cloud DCDB XA

1. Introduction to DCDB architecture

The overall cluster architecture of Tencent Cloud DCDB is shown below. MySQL runs in a master/slave configuration (also known as master/standby). One group of master and slave nodes is called a SET, and a gateway (TProxy) is deployed in front of each SET; together they form one physical shard.

(how the gateway works)

Two-phase commit requires a transaction manager (TM). To simplify both disaster recovery and the architecture, Tencent Cloud DCDB implements the TM inside TProxy, and the DCDB gateway is a stateless module. With this architecture, DCDB XA supports the following:

(1) Distributed transactions are transparent to the business and compatible with single-node transaction syntax (start transaction/commit/rollback/savepoint).

(2) Each gateway accepts and processes transaction requests independently, with no coordination among gateways, and a node failure does not lose transactions.

(3) The statements of one explicit transaction may be sent to different shards.

(4) The gateway needs no persistent state and no disaster recovery of its own; the scheduling cluster can remove or add gateways at any time, so performance scales out.

(5) A single autocommit statement may write to or read from multiple shards.

The DCDB gateway also runs group by and order by in a streaming fashion, which makes these operations very efficient. It also supports equi-joins across two shards on the shardkey (the table-splitting key) and subqueries that use the shardkey.
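
To see why streaming helps, consider a gateway that receives rows from each shard already sorted. It can merge them lazily without buffering full result sets. The sketch below is only an illustration of that general technique using Python's standard library; it is not DCDB's implementation, and the `(user, balance)` row shape and the function names are assumptions.

```python
import heapq
import itertools
from typing import Iterable, Iterator, Tuple

Row = Tuple[str, int]  # hypothetical row shape: (user, balance)


def merge_ordered(shard_streams: Iterable[Iterable[Row]]) -> Iterator[Row]:
    """Streaming ORDER BY balance: each shard stream is already sorted by balance."""
    return heapq.merge(*shard_streams, key=lambda row: row[1])


def streaming_group_sum(shard_streams: Iterable[Iterable[Row]]) -> Iterator[Row]:
    """Streaming GROUP BY user with SUM(balance).

    Each shard stream must already be sorted by user, so equal keys arrive
    next to each other after the merge and can be aggregated one group at a time.
    """
    merged = heapq.merge(*shard_streams, key=lambda row: row[0])
    for user, rows in itertools.groupby(merged, key=lambda row: row[0]):
        yield user, sum(balance for _, balance in rows)
```

Both functions hold only one row per shard in memory at a time, which is the property that makes streaming execution cheap for large results.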

In the future, Tencent Cloud also plans to support advanced features such as distributed JOIN, Spark SQL and second-level partitioning, and to become compatible with more advanced MySQL syntax.

3. Strong synchronization and XA

Tencent Cloud DCDB uses strong synchronous replication by default, meaning the master and slave nodes hold identical data. XA transactions follow the same logic: the commit is acknowledged to the business only after a slave has confirmed that the data is synchronized. Based on strong synchronization, DCDB XA easily handles the following two failure cases.

(1) When the master node fails, acknowledged transaction data is not lost: the slave with the latest data and binlog is elected as the new master, and it holds the data of every transaction whose commit was acknowledged to the user.

(2) When the original master node recovers and rejoins the cluster, unacknowledged transactions are flashed back automatically. The recovered node runs as a slave. It may hold extra committed transactions, namely those that were never confirmed by strong synchronization and whose data the original slave therefore does not have; these transactions are flashed back. Although they may have been committed inside MySQL on the original master, strong synchronization means the node never returned a commit acknowledgement to the client, so they still count as unfinished transactions, and flashing them back does not violate the ACID properties of the database. Note that flashback generates inverse operations from the binlog, which is different from a database rollback; flashback can also handle DDL operations.
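
The idea of flashback as an inverse operation can be sketched with a simplified row-event model. This is an assumption-laden illustration, not DCDB's mechanism: the text only says that flashback generates inverse operations from the binlog, and the `RowEvent` type, its fields and the function names below are hypothetical.

```python
from typing import List, NamedTuple, Optional


class RowEvent(NamedTuple):
    """A simplified row change: the row image before and after the change."""
    kind: str                 # "insert", "update" or "delete"
    before: Optional[dict]    # None for inserts
    after: Optional[dict]     # None for deletes


def invert(event: RowEvent) -> RowEvent:
    """Return the event that undoes `event`."""
    if event.kind == "insert":
        return RowEvent("delete", before=event.after, after=None)
    if event.kind == "delete":
        return RowEvent("insert", before=None, after=event.before)
    if event.kind == "update":
        return RowEvent("update", before=event.after, after=event.before)
    raise ValueError(f"unknown event kind: {event.kind!r}")


def flashback(events: List[RowEvent]) -> List[RowEvent]:
    """Undo a sequence of changes: invert each event, newest first."""
    return [invert(e) for e in reversed(events)]
```

Applying the inverted events in reverse order returns the rows to their state before the unacknowledged transactions ran, even though MySQL had already committed them locally.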

Strong synchronization in Tencent Cloud DCDB is a capability developed in-house for Tencent's financial-grade database. Its performance is much higher than that of the official semi-synchronous replication and nearly equal to that of asynchronous replication. DCDB has run inside Tencent for many years without a single data error caused by a master-slave switchover or failure. It has also sustained the massive concurrency of Tencent's large-scale operational events, such as red packets and major game promotions. The main reason for this performance is that strong synchronization uses an asynchronous submit/wait model that does not tie up database worker threads.

4. Concurrency control and isolation level

The key to balancing data consistency against performance in distributed transactions is isolation control. XA supports isolation levels up to serializable, at which phantom reads cannot occur. To set the serializable level on all physical shards of DCDB (and the MySQL instances they host), run SET global tx_isolation='serializable'. You can also trade isolation for performance by lowering the level: in theory, Read Uncommitted performs best, but it allows dirty reads and phantom reads.

(1) When the gateway executes an insert/update/delete statement of a transaction, it records which SETs the statement modified.

(2) When a SET first takes part in the transaction, the gateway sends XA START to open a transaction branch on that SET. (Note: when a transaction starts, the gateway cannot yet know which commit mode it will use, so a branch is always opened with xa start.)

(3) At commit, if the transaction affected at most one SET (≤ 1), the gateway performs a one-phase commit directly (xa commit one phase).

(4) If the transaction affected two or more SETs (≥ 2), the gateway commits in two phases:

1) the gateway first sends xa prepare 'gtid' to every participating SET (two or more);

2) each SET executes xa prepare and replies OK to confirm success;

3) after receiving all confirmations, the gateway writes the commit log entry for the XA transaction and then sends xa commit 'gtid' to the participating SETs;

4) if any SET returns an error, or if writing the commit log fails, the gateway sends xa rollback 'gtid' to the SETs involved so that the global transaction is rolled back.
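
The commit steps above can be sketched as a small Python simulation. Everything here is illustrative: `FakeSet` stands in for a back-end SET and only records the statements it receives, `write_commit_log` is a caller-supplied hypothetical callback, and the `xa end` statement is taken from standard MySQL XA syntax (a branch must be ended before it can be prepared or committed), not from the description above.

```python
from typing import Callable, List


class FakeSet:
    """Stand-in for a back-end SET that records the XA statements sent to it."""

    def __init__(self, name: str, fail_prepare: bool = False):
        self.name = name
        self.fail_prepare = fail_prepare
        self.commands: List[str] = []

    def execute(self, sql: str) -> None:
        self.commands.append(sql)
        if self.fail_prepare and sql.startswith("xa prepare"):
            raise RuntimeError(f"{self.name}: prepare failed")


def gateway_commit(gtid: str, touched_sets: List[FakeSet],
                   write_commit_log: Callable[[str], None]) -> str:
    """Commit a transaction whose branches were opened with xa start on `touched_sets`."""
    # End every branch first (standard MySQL XA requirement).
    for s in touched_sets:
        s.execute(f"xa end '{gtid}'")
    # Step (3): at most one SET affected, so a one-phase commit is enough.
    if len(touched_sets) <= 1:
        for s in touched_sets:
            s.execute(f"xa commit '{gtid}' one phase")
        return "COMMIT"
    # Step (4): two or more SETs, so prepare everywhere, then log the decision.
    try:
        for s in touched_sets:
            s.execute(f"xa prepare '{gtid}'")
        write_commit_log(gtid)
    except Exception:
        # A failed prepare or a failed commit-log write rolls back the global transaction.
        for s in touched_sets:
            s.execute(f"xa rollback '{gtid}'")
        return "ABORT"
    for s in touched_sets:
        s.execute(f"xa commit '{gtid}'")
    return "COMMIT"
```

In this sketch the commit-log write is the point of no return: before it succeeds, any error leads to a rollback, and after it succeeds the gateway only has to deliver xa commit to each SET.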

Tencent Cloud DCDB stores the commit log in the SETs and writes it in batches: a gateway background thread aggregates the distributed transactions being committed and writes them to each SET over separate connections and transactions. The commit log entry of a given transaction is written to only one SET, so this overhead does not noticeably increase commit latency or reduce TPS. Moreover, thanks to the strong synchronization and disaster recovery features of DCDB, once the commit log entry of an XA transaction has been written successfully, the data has also reached the slave.

The vast majority of XA transactions execute normally, but a few exceptions can still affect the stability of the whole cluster. Tencent Cloud therefore designed an agent (a monitoring module) that, after a failure, continues to finish prepared transactions on the local MySQL: the agent parses the commit log and handles local transactions still in the prepared state accordingly. If the commit log holds no commit decision for a transaction, the agent rolls back the prepared local transaction once it has timed out.
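
The agent's decision rule described above can be written down compactly. The sketch below is a simplified model under stated assumptions: `resolve_prepared` is a hypothetical function name, prepared branches are represented as a mapping from gtid to the time they were prepared, the commit log is reduced to a set of gtids that have a commit decision, and the timeout value is left to the caller because the text does not specify one.

```python
from typing import Dict, Set


def resolve_prepared(prepared: Dict[str, float], commit_log: Set[str],
                     now: float, timeout_seconds: float) -> Dict[str, str]:
    """Decide what to do with each local branch still in the prepared state.

    prepared:   gtid -> time (epoch seconds) at which the branch was prepared
    commit_log: gtids for which a commit decision was recorded
    Returns gtid -> "commit", "rollback" or "wait".
    """
    actions = {}
    for gtid, prepared_at in prepared.items():
        if gtid in commit_log:
            # A commit decision exists, so the branch must be committed.
            actions[gtid] = "commit"
        elif now - prepared_at >= timeout_seconds:
            # No decision and the branch has timed out: roll it back.
            actions[gtid] = "rollback"
        else:
            # No decision yet, but the gateway may still be working on it.
            actions[gtid] = "wait"
    return actions
```

For example, with a 30-second timeout at time 100, a branch prepared at time 0 with a logged decision is committed, one prepared at time 0 without a decision is rolled back, and one prepared at time 95 is left alone.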

Although XA has been implemented since MySQL 5.5 and 5.6, those versions fall short of 5.7 in performance, so on the public cloud Tencent Cloud supports XA only on version 5.7.17. Tencent Cloud has made many optimizations and bug fixes in MySQL, Percona, MariaDB and other branches (some of which have been submitted to the community as patches or open-source fixes). It will continue to develop new features and fix bugs in order to provide better distributed database support for the enterprises that need it.
