Network Security Internet Technology Development Database Servers Mobile Phone Android Software Apple Software Computer Software News IT Information

In addition to Weibo, there is also WeChat

Please pay attention

WeChat public account

Shulou

Key Technical points of decrypting Tencent Database with MySQL Kernel Daniel

2025-02-24 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Database >

Share

Shulou(Shulou.com)06/01 Report--

Guest of this article: Lai Zheng, expert engineer of database team of Tencent TEG Infrastructure Department, responsible for the research and development of Tencent TXSQL database kernel, database system development veteran, focused on database kernel development for more than 10 years, successively worked in Dameng, Teradata, Peking University founder and MySQL InnoDB storage engine team, and is the main developer of Dameng database kernel, founder XML database and GIS support of InnoDB, encryption function. And obtained a number of patents in the database field.

This is a speech delivered by Lai Zheng, an expert engineer in the database team of Tencent TEG Infrastructure Department, at the database technology salon jointly organized by Tencent Cloud and 3306 π.

The sharing time today is about 40 minutes. Explain in detail the new features and features of Tencent Cloud database kernel TXSQL. It mainly includes three parts:

The first part is the introduction of Tencent Cloud CDB, which is the abbreviation of cloud database service Cloud DataBase Service.

The second part mainly introduces the new features and new optimizations of the new version of CDB.

The third part mainly introduces the enterprise characteristics of CDB.

Video of Lai Zheng's speech (consistent with this article)

First of all, let's introduce our Tencent Cloud CDB. As you all know, Tencent Cloud is the leading cloud service provider in China. For example, some internal businesses, such as qq and Wechat, run on Tencent Cloud, but not all of them. Other non-cloud businesses will gradually migrate to Tencent Cloud later.

These database services running on Tencent Cloud are all called TencentDB or CDB. CDB provides a very complete database service, which allows users to do some independent and controllable management, and is compatible and optimized for many applications. Tencent Cloud database service also supports a variety of database engines. For example, we are going to talk about MySQL database services today, as well as other data engines such as Redis, MongoDB, and so on. At present, the amount of data in Tencent Cloud database is so large that it has already exceeded the P level in terms of MySQL database service.

Next, let's take a look at our existing customers, including pinduoduo, Mogujie, WeBank, WeChat Pay, Sohu Changyou and so on. The core of Tencent Cloud database service, that is, the main core of TencentDB or CDB just introduced, is also called TXSQL, also known as TengXunMySQL or TencentDB For MySQL. The kernel of this database is a MySQL branch developed by Tencent itself. It is based on the official MySQL version and is at the core of our entire Tencent Cloud database service. As can be seen from this architecture diagram, the first upper layer is the access cluster Access clusters, then the middle is the TXSQLInstance cluster database instance cluster, and the lower layer is the storage cluster.

Next, let's talk about the evolution of Tencent Cloud database kernel TXSQL. The earliest TXSQL started with MySQL5.1. Because there were only limited resources at that time, only Bugfix was done. When I arrived at TXSQL5.5, I did some Features needed byOSS in addition to Bugfix. What is OSS? In fact, it is our Tencent Cloud management and control platform or cloud operating system, which supports or manages the databases on Tencent Cloud. Then we went to TXSQL5.6, and we did more things, such as what is needed for DBA management, the features needed for commercialization, and optimizations for reading and writing, and so on.

The latest version available on Tencent Cloud is version 5.7. In addition to what I just mentioned, 5.7 also provides some enterprise-level features, such as encryption, auditing, and thread pooling. Version 8.0 is now under development. Our team has basically completed version 8.0 now and should release it in the near future. We will add more new features in version 8.0, not just those just introduced. In fact, as we all know, many features have been added officially, such as the optimization of the database dictionary, the optimization of redo log, and the reconfiguration of the optimizer. 8.0 can be said to be a recent landmark version of MySQL, so the mainstream MySQL database services will be moved to 8.0 in the future. Tencent's 8.0 version provides not only the official version of these features, but also some of our own features, which I believe you will be interested in. Because I participated in the planning of version 8.0, I think this version should bring you a surprise. Please look forward to it.

Above, we have briefly introduced some information about the entire database service of Tencent Cloud, and now talk about the new features provided in the recent version of TXSQL.

Delete large tables asynchronously

This feature has been available in previous versions of TXSQL, and some optimizations have been made in the new version. It is mainly to solve a problem, I believe we have more or less encountered this problem, that is, when we drop database or drop table, when this table is particularly large, because the original MySQL is to delete that IDB file directly, and when this IDB file is particularly large, it will produce an IO spike, this spike is unacceptable to some customers, because some business may be very affected. In order to solve this problem, we have done this function of asynchronously deleting large tables. How does it do it? Do drop table, drop database will delete the file through the background thread to do, the background thread will truncate every time 128m, and then until the deletion of the IBD file.

In our latest version, we support dynamic settings for the parameters of this function, for example, you can configure the directory where temporary files are stored, and you can also turn this function on or off dynamically, set global innodb_async_truncate_work_enabled = ON/OFF, add drop table to delete asynchronously, and the stock still uses the old logic.

2. CATS transaction scheduling algorithm

CATS (Contention-Aware Transaction Scheduling) hotspot aware transaction scheduling

We know that the original transaction scheduling of MySQL is FCFS (First Come First Served) on a first-come-first-served basis, that is, it depends on who comes first and who executes first. what is the new transaction scheduling algorithm? It determines which transaction should be called based on your current hot spot or blocking situation. Let me cite a simple example: it is like a highway where everyone goes. There are motorcycles, cars and buses on the road. When there is a traffic jam, which kind of car should we let go first? should we let the bus go first? Because there are many people on the bus, the more buses go by, the more people go by. This is the basic principle of this algorithm.

As shown in the figure, T1 will be executed first, because it blocks four transactions, while T2 has only three. So the judgment is based on the number of blocked transactions, not the number of locks, because you will encounter a situation in which the transaction gets a lot of row locks, but these row locks are not requested by others, so although it has taken the row lock, there is no other transaction waiting for it, so there is no need to let it execute first. MySQL optimization is sometimes quite interesting, found that some very simple optimization points can achieve a great performance improvement.

Let's take a look at the results of the official published test:

In the first chart, the performance has nearly tripled with the increase in the number of concurrency. However, it is important to note that this new transaction scheduling algorithm is not omnipotent and can only be applied to certain scenarios. What are the scenes? It is a scenario in which a certain part of the data is particularly hot, for example, there are 20 tables, each with 10 million rows of data, and relatively speaking, it will only make some changes and relatively focus on the records of one or two tables, in this case, this transaction algorithm will be more effective. In the new version of TXSQL, we have added a new parameter to allow users to set the method of transaction scheduling:

Innodb_trx_schedule_algorithm= {AUTO, FCFS,CATS} defaults to AUTO

When AUTO exceeds the predetermined value, it invokes the transaction scheduling algorithm. Now we have more than 32 threads using this new scheduling algorithm while waiting.

III. Optimization of replication of tables without primary keys

I don't know if you have encountered this situation, that is, we found that the primary backup delay was very serious. later, it was found that there was a table that did not have a primary key and did not have any unique indexes. whether it is right to add a primary key or create a unique index, so you can obviously see that the primary and standby delay is decreasing, this is because for tables without primary keys. When the slave does binlog playback, it will scan the full table. When the table has a large amount of data, the scanning speed of the full table will be very slow, and the natural delay between the master and slave will be very serious. In view of this situation, we propose our own scheme, which automatically creates a hidden column and a unique index on this column if the primary key is not created, so that we can use this unique index to find the data that needs update when doing playback, so as to reduce the delay between master and slave.

We also have two parameters to control this function:

1. Parameter switch cdb_hidden_key_col, the default is off, and the master switch for adding hidden columns and hidden indexes in keyless tables

2. Parameter switch cdb_show_ipk_info. Default is off. Whether to hide the hidden columns added by the system is visible to the user.

Cdb_hidden_key_col is off by default, because we are doing some grayscale tests, which is temporarily off, and later it may be turned on, and when opened, a hidden column and index will be added to the non-primary key table.

The parameter cdb_show_ipk_info is whether the user can see the hidden column and index. When this parameter is turned on, you can see the hidden column through show create table, as well as its secondary index.

IV. Extension of GTID replication function

This extension has two parts:

The first part is that replication supports create as select,create/droptemporary table in gtid mode. The original MySQL official does not support the replication of such statements in GTID mode. Some customers have mentioned this requirement to us, so we have provided this feature in the new version of TXSQL. This function is also enabled with corresponding settings:

Parameter switch cdb_more_gtid_feature_supported. Default is off.

When gtid_mode=ON and enforce_gtid_consistency=ON replication support the following usage:

Create table T2 select * from t

Begin

Create temporary table xx (id int)

Insert into xx select * from T2

Insert into T1 select * from xx

Commit

The second part is to allow gtid5.7 to establish a master-slave relationship with non-gtid5.7, and non-gtid5.7 transactions are synchronized to the gtid5.7 instance to generate anonymous transactions. There is also a parameter to set:

Parameter switch cdb_allow_gtid_rep_non. Default is off.

Note: please use it during DTS migration. The task is complete. Please turn off this parameter.

These are the features added in our new version. Let's take a look at the new features that we will add in the next version.

one。 The second plus field (instant add column) function will be added soon

I believe you may be very interested in this function, because our current network users do the most DDL operation in addition to creating tables and adding fields. What kind of problem will we encounter when adding fields? It turns out that before the MySQL8.0 version, adding fields needed to copy data. The table structure of adding a field has changed. You need to copy from the original old table to another table, and you need to insert one entry at a time, especially when the amount of data in the table is very large. So how to solve this problem? our Tencent Games group has provided a solution. The original data is left there. All we have to do is mark the new data. So the solution is to mark the new data so that all copies of the data can be avoided. This function has been officially merged to 8.0, and we have also put this feature into TXSQL version 5.7, because we think this feature is very useful for 5.7 users. When adding a column, you only need to specify that your algorithm is instant. To add fields in a few milliseconds, you only need to modify the information of the data dictionary and do not need to copy the data. The whole feature will be released in the next version 5.7.

two。 Will soon join to fix the problem that event scheduled tasks are not executed on time.

Next, let's take a look at an interesting bug fix. This is the official version of bug that we just found. I don't know if you are using event. We have a customer who uses event a lot. A lucky draw is carried out at about 12:00 every night, mainly by setting the event to execute MySQL at a certain time to distribute some gold coins to some users. But the customer feedback said that I clearly set the time to execute the event at 12:10 in the evening, why does it sometimes stop doing it? sometimes it will be delayed by one day, sometimes it will be delayed by two days, and sometimes it won't even be executed all the time. Why? It's strange, and because many of their operations like this are done using event, they encounter this kind of problem very frequently. At the beginning, we were also confused, in theory, we should not, ah, we have seen the source code, it is very simple, is to set a regular execution of something, at that point in time will be implemented. Later, after a long period of tracking, we found the problem because there is a problem with the deletion algorithm of the small top heap (Priority Queue) in the MySQL source code. Let's take a look at the following figure:

The initial heap looks like this. Everyone should know that the small top pile, the smallest number is at the top, that is, at the root node. What is the event execution order of MySQL? every time, take the event of this root node to execute, because this event is the one closest to my time, and then take it out and wait until the set time to execute. Take one out, and then put the following most recent event on the root node. Let's see what happens when we delete an element (that is, drop an event). For example, to delete 7 in the first figure, the action of deleting 7 corresponds to an operation such as drop event in MySQL. When deleting the node 7, the last added element 3 will be placed in this position of 7, and a replacement will be made to look like figure 2. Then sort it, sort it down from the place where it is deleted, and make sure that the node 3 must be smaller than the nodes below it. But we found that there is a problem, for example, the deletion is not 7 this node, delete is 10 this node, what will happen? Replacing 3 to 10 will be the case in figure 3. According to its original algorithm, it adjusts its child nodes, but because 3 has no child nodes, it does not adjust, and then the heap becomes the case in figure 3. Can you see that it has violated the principle of the small top heap, that is, the parent node is always smaller than the child node, because the child node 7 in figure 3 is the parent node, and 7 is larger than this 3, so it is wrong. There will be problems in the subsequent sorting process, when the next node is taken, the 7 node will be taken first instead of the 3 node. This results in the execution of another node instead of execution at the point in time. It's a very hidden problem, which we spent a long time tracking before we finally solved it.

Next, let's talk about some of the enterprise features of the TXSQL kernel.

1. TXSQL thread pool

Everyone may have used the thread pool, if you have a lot of concurrency, you should use the thread pool, otherwise the performance of the database will decline rapidly as the number of concurrency increases. TXSQL also provides thread pool support, and the TXSQL thread pool is optimized to avoid requests to starve to death. When a low-priority user waits for a long time and reaches a predetermined value, we will raise its priority and let it execute first.

II. TXSQL audit

The audit function is not normally used, but when you find a problem, you will find how good it would be to have such a function. When you find out that I don't know who deleted the table or drop databases, an audit can track who did it and when. The general architecture diagram of the audit of TXSQL is shown in the figure:

You may be worried that turning on the audit has an impact on performance, but in fact it has very little impact on performance. Why? Because we put it in the background, that is, a backstage thread of Flush thread is constantly browsing a file like Auditfile, recording the current operation in a file and then saving it into a timing database CTSDB through Audit Agent, so it has very little impact on the original service of the entire database. As far as we test, open the audit and do not open the audit, there is only a performance loss of less than 5%.

III. TXSQL data encryption

Now, people pay more and more attention to the security of their own data, no matter customer data or other data do not want to be obtained by others. Our data encryption function will be aimed at all the data off the disk, your ibd file, as long as we are encrypted. This encryption is based on the official TDE (transparent data encryption) function. Tencent Cloud's encryption is based on the official TDE and integrates Tencent's secret key management plug-in KMS. The encrypted key is obtained through the KMS Plugin plug-in, so you don't have to worry about getting the key from someone else.

Recommended reading

Tencent's Deep thinking and practice on distributed Database Technology

3 DBA and 1 impossible task

Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.

Views: 0

*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.

Share To

Database

Wechat

© 2024 shulou.com SLNews company. All rights reserved.

12
Report