Tencent TDSQL put forward three "database questions". What are the key points of database technology in the future?

2025-02-25 Update From: SLTechnology News&Howtos


Shulou(Shulou.com)06/03 Report--

About the author:

Li Haixiang, known online as "That Sea Blue", is a technical expert on Tencent's financial-cloud database team and holds a Master of Engineering from the School of Information, Renmin University of China. He is the author of "The Art of Database Transaction Processing: Transaction Management and Concurrent Access Control", "The Art of the Database Query Optimizer: Principle Analysis and SQL Performance Optimization", and "Big Data Management".

Preface

From October 11 to 13, 2019, the CCF Database Technical Committee held China's largest annual database academic event, the 36th CCF China Database Academic Conference (NDBC 2019). The Tencent TDSQL team was invited to give a technical report entitled "TDSQL's Thinking and Practice on Future Distributed Database Technology R&D" at the Database Industry-University-Research Cooperation Forum, discussing the fundamental problems and technical directions of domestic databases with core researchers and engineers from Chinese academia and industry, in order to promote the independent, controllable development of database technology in China.

At the same meeting, Li Haixiang, Tencent Cloud database technical expert, expert engineer on the financial-grade distributed database TDSQL team, and enterprise tutor for the Master of Engineering program at the School of Information, Renmin University of China, was elected a member of the Database Technical Committee of the China Computer Federation (CCF). Going forward, Tencent will increase its investment to advance industry-university-research cooperation on databases in China and to promote the independent, controllable development of database technology.

As a technology company committed to fundamental technology R&D, Tencent brought to NDBC 2019 the TDSQL team's reflections and practical experience on distributed database R&D, hoping to spark further discussion. The sharing covered three questions:

1) Efficiency and correctness of distributed transactions: how can the processing efficiency of a distributed transactional cluster be improved while guaranteeing double consistency (transaction consistency and distributed consistency)?

2) How do new hardware and AI technologies affect database architecture in the cloud environment?

3) Can the modules of a database be decoupled, to reduce R&D complexity and shorten the training cycle of R&D talent?

The development of the Tencent TDSQL database

TDSQL is a financial-grade distributed database built by Tencent; internally it supports nearly 90% of Tencent's finance, transaction and billing businesses. In 2007, as the company's business took off again, the team launched a 7×24 high-availability service project to guarantee high availability for company-level sensitive businesses such as Tencent billing, with zero loss of core data and zero errors in core transactions. This project was the predecessor of TDSQL.

However, the technical scheme chosen at the time deeply coupled the technical layer with the business layer. As a result, Tencent's technical team started a project to develop a financial-grade database in which the database itself handles high availability, data consistency and horizontal scalability, while the business system only needs to focus on business logic. In 2012, TDSQL, a standardized financial-grade distributed relational database product, was completed and widely deployed within Tencent.

Over more than a decade of R&D and evolution, TDSQL has been continuously optimized for high availability and distribution while steadily improving performance, and now features a globally deployable architecture, horizontal scalability, enterprise-grade security and more.

For example, in 2018 TDSQL implemented an original algorithm that comprehensively solves read consistency, unifying the consistency of distributed transactions with the consistency of the distributed system. In the same year, TDSQL also released two tools for cloud database operation and maintenance, long a headache for the industry: the "Red Rabbit" operation and management platform and the "Bian Que" intelligent DBA diagnosis system.

Since 2014, TDSQL has been available to the public through the Tencent financial cloud platform. TDSQL now provides public-cloud and private-cloud database services to more than 500 institutions, with customers spanning billing, third-party payment, banking, insurance, Internet finance, Internet of Things, "Internet+" and government services, helping them migrate from international databases to autonomous, controllable distributed databases.

This year, Tencent Cloud TDSQL helped Zhangjiagang Bank migrate the bank's traditional core system from a centralized database to a distributed one. This was the first time a domestic bank adopted a domestic distributed database for a traditional core business system, breaking the field's long-standing dependence on foreign databases.

TDSQL distributed transaction processing technology: efficient double consistency of distributed transactions

First, let's share TDSQL's exploration and practice in achieving "double consistency" (transaction consistency and distributed consistency) while improving the processing efficiency of a distributed transactional cluster.

As is well known, a database is a highly concurrent system in which every operation is constrained by transaction semantics. Those semantics are captured by the four ACID properties of a transaction: atomicity (A), consistency (C), isolation (I) and durability (D). Transaction processing is the core technology of a database system; to guarantee ACID, a database employs a variety of complex techniques, at whose heart lies the concurrent access control algorithm.
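The atomicity half of ACID can be illustrated with a toy in-memory transaction (a sketch of the general idea, not TDSQL code; all names here are invented for illustration):

```python
# Toy illustration of atomicity: either every write in a transaction
# takes effect, or none does.

class MiniTxn:
    """Buffers writes privately and applies them atomically on commit."""

    def __init__(self, store):
        self.store = store          # shared dict acting as the database
        self.writes = {}            # private write buffer (uncommitted state)

    def write(self, key, value):
        self.writes[key] = value    # invisible to others until commit

    def read(self, key):
        # Read-your-own-writes, else fall back to committed state.
        return self.writes.get(key, self.store.get(key))

    def commit(self):
        self.store.update(self.writes)   # all writes become visible at once

    def rollback(self):
        self.writes.clear()              # discard everything: no partial effect


store = {"alice": 100, "bob": 0}

txn = MiniTxn(store)
txn.write("alice", txn.read("alice") - 30)
txn.write("bob", txn.read("bob") + 30)
txn.rollback()                            # simulate a failure before commit
assert store == {"alice": 100, "bob": 0}  # atomicity: no half-done transfer

txn2 = MiniTxn(store)
txn2.write("alice", txn2.read("alice") - 30)
txn2.write("bob", txn2.read("bob") + 30)
txn2.commit()
assert store == {"alice": 70, "bob": 30}
```

A real engine must of course also make the commit durable and isolate concurrent readers, which is exactly where the concurrent access control algorithms below come in.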

Transaction processing technology has two original goals: data correctness and high concurrency. TDSQL's distributed transaction processing model has gone through two generations: the first used 2PL+MVCC, the second uses OCC+2PL+MVCC. Integrating 2PL into OCC addresses OCC's inefficiency under high conflict; integrating MVCC into OCC eliminates the mutual blocking between reads and writes, further improving performance. Adaptive OCC switches automatically and dynamically between OCC and 2PL, making the distributed transaction processing mechanism more intelligent.
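The validation step that lets OCC detect conflicts only at commit time can be sketched as follows (a minimal illustration of textbook OCC, not TDSQL's implementation; `OCCStore` and its fields are invented names):

```python
class OCCStore:
    """Toy optimistic concurrency control: validate read versions at commit."""

    def __init__(self):
        self.data = {}      # key -> committed value
        self.version = {}   # key -> committed version number

    def begin(self):
        return {"reads": {}, "writes": {}}

    def read(self, txn, key):
        txn["reads"][key] = self.version.get(key, 0)   # remember version seen
        return txn["writes"].get(key, self.data.get(key))

    def write(self, txn, key, value):
        txn["writes"][key] = value                     # buffer locally, no locks

    def commit(self, txn):
        # Validation phase: abort if anything this txn read has since changed.
        for key, seen in txn["reads"].items():
            if self.version.get(key, 0) != seen:
                return False                           # conflict -> rollback
        # Write phase: install buffered writes and bump versions.
        for key, value in txn["writes"].items():
            self.data[key] = value
            self.version[key] = self.version.get(key, 0) + 1
        return True


db = OCCStore()
db.data["x"], db.version["x"] = 1, 1

t1, t2 = db.begin(), db.begin()
db.write(t1, "x", db.read(t1, "x") + 1)
db.write(t2, "x", db.read(t2, "x") + 10)
assert db.commit(t1) is True    # first committer wins
assert db.commit(t2) is False   # t2's read of "x" is now stale -> abort
```

Under low contention most validations pass and no locks are held during execution, which is where OCC's efficiency comes from; under high contention many transactions fail validation and roll back, which is precisely the weakness that integrating 2PL addresses.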

But this alone does not show how TDSQL improves distributed transaction efficiency. Architecturally, TDSQL is decentralized, with no single bottleneck like a centralized single master handling all distributed transactions. The amount of distributed transaction control information passed between transaction coordinators is minimized, and the conflict granularity of the distributed concurrent access control algorithm is kept at the level of individual data items, which raises transaction concurrency and thus efficiency. Many other optimizations also contribute to making TDSQL's distributed transaction processing more efficient.

And we keep exploring, as in figure 1: how can "double consistency" (transaction consistency and distributed consistency) be achieved in a distributed context while the processing efficiency of a distributed transactional cluster is improved?

Figure 1 Problems in implementing distributed transactions

This is a hard problem for the whole industry. Google's Spanner achieves double consistency, but its transaction processing efficiency is very low. In its deep study of distributed transaction processing, TDSQL not only solved global consistency (shared at the DTCC 2019 conference: global read consistency of distributed databases) but also proposed a "unified causality model" that delivers the correctness of double consistency and does so efficiently.

In TDSQL's view, the correctness of double consistency is comparatively easy to achieve (though still a hard problem); what is truly difficult is effectively improving the performance of a distributed transactional database.

So what factors limit the performance of distributed transactional databases?

For example, in figure 2, some researchers believe network bandwidth limits performance. Others identify two factors: "latency" itself, and the fact that latency prolongs the transaction life cycle; a longer life cycle raises the probability of conflicts between concurrent transactions, which causes rollbacks and degrades performance.
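The second argument can be made concrete with a back-of-envelope model (our own illustration, not from the report): by Little's law the number of in-flight transactions is throughput × latency, so longer latency means more transactions overlapping in time, and more overlap means a higher chance that two of them touch the same data item.

```python
# Back-of-envelope conflict model. If each transaction touches `touched`
# of `items` data items uniformly at random, the chance that two given
# transactions collide is roughly 1 - C(items - touched, touched) / C(items,
# touched), and a transaction conflicts with at least one of its n-1
# concurrent peers with probability 1 - (1 - p_pair) ** (n - 1).

from math import comb

def conflict_probability(throughput, latency_s, items=10_000, touched=10):
    n_inflight = max(1, round(throughput * latency_s))   # Little's law
    p_pair = 1 - comb(items - touched, touched) / comb(items, touched)
    return 1 - (1 - p_pair) ** (n_inflight - 1)

# Same 10k TPS workload: stretching latency from 1 ms to 20 ms
# multiplies the in-flight transactions and inflates the conflict rate.
low = conflict_probability(10_000, 0.001)    # ~10 concurrent transactions
high = conflict_probability(10_000, 0.020)   # ~200 concurrent transactions
assert high > low
```

The model is crude (uniform access, no skew), but it captures why shaving network latency pays off twice: shorter transactions and fewer conflict-induced rollbacks.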

Figure 2 Bottlenecks of distributed transactions

In addition, correctness and performance are both shaped by the core technology of transaction processing: the concurrent access control algorithm.

As shown in figure 3, experiments indicate that the OCC algorithm is more efficient in a transactional database, outperforming the 2PL algorithm by up to 170 times in a multi-core environment. Under highly contended workloads, however, OCC's rollback rate rises, so its shortcomings are equally obvious.

Figure 3 Advantages and disadvantages of concurrent access control algorithms

However, when some researchers compared a variety of concurrent access control algorithms, as in figure 4, they found the traditional OCC algorithm more effective than many well-known improved OCC algorithms (such as the well-known TicToc, adaptive OCC and so on). This shows that even when different systems implement the same algorithm, the measured results differ considerably (for example, TicToc's own test results show its improved OCC algorithm beating traditional OCC). So we keep asking: when different experiments reach different conclusions, what factors really determine the efficiency of distributed transactions?

Figure 4 Comparison of multiple concurrent access control algorithms

Going further, as shown in figure 5, other researchers have shown that adaptive OCC (OCC+2PL) performs better (the middle subgraph of figure 5). Combining figures 3, 4 and 5, we find that results from different researchers cannot be inferred from one another: they only show the general trend between algorithms (for example, OCC outperforming 2PL) and cannot pinpoint exactly where the differences lie.

Figure 5 Comparison of multiple concurrent access control algorithms

By contrast, figure 6 shows Tencent's hotspot-update experiment on MySQL: under high concurrency with intense competition for the same data item, it is not the 2PL algorithm itself that hurts MySQL's performance but the CPU consumed by the deadlock detection algorithm used to resolve deadlocks, which drives MySQL's transaction throughput close to zero. After disabling deadlock detection and using system locks (non-transaction locks) to serialize access to the hot data item, MySQL's transaction throughput rose by roughly ten thousand times.
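Why deadlock detection becomes the bottleneck can be sketched with a toy wait-for graph (our illustration, not MySQL's actual implementation): every new lock waiter triggers a search of the graph, so a hot row that queues many waiters generates constant detection work even though no deadlock ever exists.

```python
# Toy wait-for-graph deadlock detection. A transaction is deadlocked if
# it transitively waits on itself.

def has_cycle(wait_for, start):
    """DFS over the wait-for graph: does `start` transitively wait on itself?"""
    stack, seen = [start], set()
    while stack:
        txn = stack.pop()
        for blocker in wait_for.get(txn, ()):
            if blocker == start:
                return True
            if blocker not in seen:
                seen.add(blocker)
                stack.append(blocker)
    return False

# A genuine deadlock: 1 waits on 2, 2 waits on 1.
assert has_cycle({1: [2], 2: [1]}, 1) is True

# The hotspot case: 999 transactions all queue behind txn 0 on one hot row.
# There is no deadlock, but every arrival still pays for a graph search,
# and under contention that CPU cost dwarfs the useful transaction work.
wait_for = {i: [0] for i in range(1, 1000)}
checks = sum(1 for t in wait_for if not has_cycle(wait_for, t))
assert checks == 999   # every waiter searched; none found a real deadlock
```

MySQL exposes a switch of exactly this kind, `innodb_deadlock_detect`: when it is off, InnoDB stops searching the wait-for graph and relies on lock-wait timeouts to break any real deadlocks.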

Figure 6 Real problems in a real system

This shows that results for the same algorithm implemented in different research systems are only of reference value; implementing and measuring it in a real system, such as MySQL or PostgreSQL, is far more instructive in practice.

Architecture and decoupling of distributed databases

In the course of developing a distributed transactional database, the TDSQL team thinks not only about distributed transaction processing (all the technology behind ACID) but also about important issues such as test verification, architecture extension and module decoupling:

How do new hardware and AI technologies affect the architecture of the database in the cloud environment?

Can the modules of the database be decoupled to reduce the complexity of R & D and shorten the training cycle of R & D talents?

Technologies such as new hardware and AI have a profound impact on the architecture of traditional databases, reflected in how these new technologies are integrated:

First, the database may gain many new modules. As in the left subgraph of figure 7, AI-based database tuning expands the database system, adding many new components.

Second, traditional database modules change. As in the left subgraph of figure 8, a transaction optimization model based on AI technology has been proposed for parallel transactional database systems. The model uses stored procedures (similar to H-Store and VoltDB) to provide the transactions to be executed to the database engine in advance, then uses AI technology (a Markov model) to analyze the stored procedures, determine the semantics of the transactions they represent, and work out which transactions conflict when executed concurrently, yielding a transaction execution model with a fixed structure. On the right of the lower-left subgraph of figure 8 is the transaction scheduling diagram derived from analyzing the NewOrder procedure of the TPC-C model.

When multiple clients issue SQL statements that execute concurrent transactions represented by stored procedures, the transaction scheduling mode can be inferred from this model. This is a typical example of AI technology changing the concurrent access control module of transaction processing.
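The scheduling idea can be sketched as follows (a toy of ours that replaces the Markov-model analysis with precomputed read/write footprints; the procedure names and table sets are invented):

```python
# Stored procedures are analyzed ahead of time to find which ones conflict;
# the scheduler then serializes only transactions whose procedures touch
# overlapping data, letting everything else run concurrently.

# Pretend static analysis produced each procedure's data footprint:
FOOTPRINT = {
    "new_order": {"stock", "orders"},
    "payment":   {"customer", "history"},
    "delivery":  {"orders", "customer"},
}

def conflicts(proc_a, proc_b):
    """Two procedures conflict if their footprints overlap."""
    return bool(FOOTPRINT[proc_a] & FOOTPRINT[proc_b])

def schedule(batch):
    """Greedily group non-conflicting procedures into parallel rounds."""
    rounds = []
    for proc in batch:
        for group in rounds:
            if not any(conflicts(proc, other) for other in group):
                group.append(proc)   # safe to run alongside this round
                break
        else:
            rounds.append([proc])    # conflicts everywhere: new round
    return rounds

rounds = schedule(["new_order", "payment", "delivery", "new_order"])
# "new_order" and "payment" share no tables, so they share a round;
# "delivery" overlaps both "orders" and "customer", so it is serialized.
assert rounds == [["new_order", "payment"], ["delivery"], ["new_order"]]
```

The real model goes further, using the Markov analysis to predict which branch of a procedure will execute, but the payoff is the same: conflict decisions move from run time to analysis time.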

The upper-right subgraph of figure 8 shows the impact of RDMA on transaction processing. It presents four models; model "d" uses RDMA to affect transaction processing in two ways: the control flow of transaction processing, and the data flow generated during transaction execution. Distributed transaction efficiency is constrained not only by the large data flow but also by the control flow, whose data volume is comparatively small, so introducing RDMA to relieve the network bandwidth bottleneck is necessary.

Figure 7 The database architecture has changed

Figure 8 Modules in the database have changed

A traditional database system is extremely complex: highly cohesive from the outside and highly coupled on the inside. As new technologies emerge and reshape database architecture, that complexity rises to a still higher level, and the growth cycle of R&D talent lengthens accordingly. One question we therefore keep asking is: technically, how can the many modules inside a database be decoupled? With high coupling, an engineer must master several related modules before working effectively; with good decoupling, mastering a single module is enough, so the training cycle shortens and software quality improves as well.

Decoupling the modules of a database architecture is therefore a genuine technical problem. Decoupling can be carried out at many levels and between many modules, and each decoupling technique has its own strengths.

As shown in the upper-right subgraph of figure 7, the storage-compute separation proposed by AWS Aurora decouples the storage and computing modules. Microsoft's Deuteronomy system produced a series of related work between 2008 and 2016. Deuteronomy's initial approach implements transactions on top of the storage layer, with the underlying storage using a KV model: the storage layer only needs to provide atomic, idempotent KV operations, and the upper layer can then easily implement concurrent access control and recovery for transactions. Percolator, Spanner/F1, CockroachDB and TiDB later developed along the same line: the bottom layer is a KV storage engine such as Bigtable/Spanner or RocksDB, with a transaction layer wrapped on top. In KV stores such as RocksDB, however, the concurrency control of KV records remains tightly coupled with storage.
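The Deuteronomy-style layering can be sketched as follows (our toy, not the actual system; it shows only how atomic, idempotent KV puts let the upper layer replay a commit safely, and omits concurrency control entirely):

```python
class KVStore:
    """Storage layer: atomic, idempotent single-record puts keyed by txn id."""

    def __init__(self):
        self.records = {}   # key -> (txn_id, value)

    def put(self, key, txn_id, value):
        # Idempotent: replaying the same (key, txn_id) write is harmless.
        if self.records.get(key, (None,))[0] != txn_id:
            self.records[key] = (txn_id, value)

    def get(self, key):
        rec = self.records.get(key)
        return rec[1] if rec else None


class TxnComponent:
    """Transaction layer: tags each commit's writes with a transaction id."""

    def __init__(self, kv):
        self.kv, self.next_id = kv, 1

    def run(self, writes):
        txn_id, self.next_id = self.next_id, self.next_id + 1
        for key, value in writes.items():
            self.kv.put(key, txn_id, value)
        # Recovery after a crash can simply replay the commit: the storage
        # layer's idempotency guarantees no write is applied twice.
        for key, value in writes.items():
            self.kv.put(key, txn_id, value)
        return txn_id


kv = KVStore()
tc = TxnComponent(kv)
tc.run({"a": 1, "b": 2})
assert kv.get("a") == 1 and kv.get("b") == 2
```

The appeal of this split is that the storage layer's contract is tiny (atomicity and idempotency of single KV operations), so the transaction component above it can evolve, or be swapped out, independently.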

Decoupling the storage and computing modules in turn promotes decoupling among their sub-modules. As shown in figure 9, how should the transaction and storage layers be decoupled? Some researchers lift the transaction function up into the client (left subgraph of figure 9); others push it down into a middleware layer (middle subgraph of figure 9). Both differ from traditional server-side transaction processing (right subgraph of figure 9).

Figure 9 Transaction and storage layer decoupling

Beyond that, decoupling work is everywhere. Figure 10 shows decoupling between algorithms and data structures: the left subgraph is a design that decouples the database's persistence layer from its in-memory data, and the right subgraph decouples the index's data structure from the physical storage layer.

The left subgraph of figure 10 corresponds to the VLDB 2018 paper "FineLine: Log-structured Transactional Storage and Recovery", which proposes FineLine, a transactional storage and recovery mechanism that abandons the traditional WAL and keeps all persistent data in a single data structure, aiming to decouple the database's persistence from its in-memory data storage.

FineLine never flushes in-memory data pages to the DB; it only persists in-memory log information to an indexed log, and the latest state of any record is read back from the indexed log via a fetch operation. By decoupling the in-memory data structure from its persistent representation as far as possible, it eliminates much of the overhead of a traditional disk-based RDBMS. A further benefit of this single persistent store is very cheap recovery after a system failure: because the indexed log is written with atomic operations, the latest committed records can be read directly from it on restart, and no-steal policies, undo operations and checkpoints are all unnecessary.
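The core of the mechanism can be sketched as follows (our toy reading of the paper's idea, not FineLine's code; a real implementation persists the log and its index on durable storage):

```python
# Single-structure persistence: committed writes go only into an indexed
# log, and `fetch` materializes a record's latest state straight from it.

class IndexedLog:
    def __init__(self):
        self.log = []        # append-only list of (key, value) entries
        self.index = {}      # key -> position of its latest log entry

    def append_commit(self, writes):
        # A commit appends all of a transaction's writes to the log.
        for key, value in writes.items():
            self.index[key] = len(self.log)
            self.log.append((key, value))

    def fetch(self, key):
        # Rebuild the latest committed state directly from the log: after
        # a restart there is nothing to undo and no checkpoint to load.
        pos = self.index.get(key)
        return self.log[pos][1] if pos is not None else None


ilog = IndexedLog()
ilog.append_commit({"x": 1, "y": 2})
ilog.append_commit({"x": 5})      # a later transaction overwrites x
assert ilog.fetch("x") == 5       # latest committed state wins
assert ilog.fetch("y") == 2
```

Because only committed entries ever reach the log, recovery reduces to rebuilding (or reloading) the index, with no undo pass over half-written pages.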

Figure 10 Decoupling between computing and data structures

Inside a database, the decoupling between modules is closely tied to how module granularity is divided and how the system is implemented. Figure 11 compares the decoupling relationships of several mainstream databases, in the hope of prompting further thought.

Figure 11 Comparison of decoupling in mainstream databases

Conclusion

As one of the core basic technologies, the database is a great mountain we must cross in this era of independent, controllable development. The road may be long, but it will not walk itself. After more than a decade of R&D and evolution, we have at least reached many important milestones. Domestic databases still need to improve in technology, talent and industrial ecology; as industry-university-research cooperation deepens and technology merges further with traditional industries, this will continue to advance the independent, controllable development of databases.
