2025-04-05 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/01 Report--
The Tencent Cloud Database online tech salon on domestic databases is in full swing. Lin Xiaobin (Ding Qi) gave his first talk of 2020 on March 5. If you missed it, don't worry: here is a full replay of the live video, along with a text write-up compiled by the Tencent Cloud Database team for those who prefer reading. It is packed with practical material, so you are sure to get something out of it.
Follow the "Tencent Cloud Database" official account and reply "0305 Dingqi" to download the live video and the shared slides.
Live broadcast review (text and images)
Hello everyone! I am Lin Xiaobin from Tencent Cloud Database; in the community I go by the name Ding Qi, which is how many of you know me. Today I want to take this opportunity to talk about database fundamentals and the technical evolution of Tencent's self-developed databases. I assume everyone listening is already familiar with databases, so I will move through the introductory material fairly quickly and focus on my understanding of typical database architectures. I also know there is already a lot of MySQL material out there, so much of today's content may be familiar to many of you, but if even one or two points feel new, I will consider this talk a success.
First, the basic concept. At its core, a database exists to store things: saving data and fetching it back are its fundamental functions.
Here is an example that exposes my age: when I was in college at the beginning of the 21st century, I used Access to build information management systems for teachers' course websites and for outside companies. At the time I thought that as long as I knew how to build tables and indexes I had mastered databases, and that databases were simple. Of course it later turned out I was naive; only when my job brought me into contact with industrial-grade databases such as Oracle and MySQL did I learn what a real database is.
Throughout their development, databases have always faced several recurring challenges: reliability, availability, security, performance, cost, and so on. Given today's time constraints we cannot cover them all, so we will focus on just one: performance.
So where do performance problems come from?
If the data volume is tiny and the number of accesses is also tiny, an Access database is actually enough; for something even smaller, an Excel spreadsheet might do, right? The real performance challenges come mainly from three sources:
1. Large data volume. When the amount of data is very large, both storing and fetching it affect performance.
2. High concurrency. For example, many requests, or many clients accessing the system at the same time.
3. Read/write patterns. What do I mean by read/write patterns? Even for the same data, different queries impose different demands. Take the post list of an old-style forum: displaying posts in order is relatively simple; searching within them is a little harder; doing statistics and analysis on them is harder still. Different read patterns put different kinds of pressure on the database, and the more complex the operation, say search or analysis, the further its cost diverges from a plain row-by-row read. These are the database performance problems we really have to solve.
At this point we should look at the basic architecture of MySQL, which happens to have an advantage in handling different read/write patterns. The diagram below should be familiar. MySQL is split into two layers. The upper box is the server layer, which handles client connections and contains the parser, optimizer, and executor. Below it sits the storage engine layer, which can hold many kinds of engines. This is a distinctive feature of MySQL compared with other databases: it can plug in different engines, and each engine can define its own data access patterns and index structures to serve different query requirements.
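As a quick illustration of the pluggable-engine design just described (the table and schema names here are made up for the example), the engine is a per-table choice made in the DDL, and the server can tell you what it supports:

```sql
-- List the engines this server build supports
SHOW ENGINES;

-- The storage engine is chosen per table at creation time
CREATE TABLE orders (
  id     BIGINT PRIMARY KEY,
  amount DECIMAL(10, 2)
) ENGINE = InnoDB;

-- Check which engine an existing table uses
SELECT table_name, engine
  FROM information_schema.tables
 WHERE table_schema = 'test';
```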
There are many common MySQL engines. MyISAM, a native MySQL engine, was the default storage engine before MySQL 5.5, but it supports neither transactions nor crash safety, meaning data can be lost if the host loses power, so it was used less and less over time.
InnoDB is now the mainstream relational storage engine. What problems does it solve? It supports the ACID properties of transactions, and it supports crash recovery, which is why it became the dominant MySQL engine.
The Memory engine also used to see a lot of use. When early versions of InnoDB did not perform that well, people sometimes replaced InnoDB with the Memory engine in the hope of going faster. Of course, we all know Memory tables lose their data on restart, but sometimes they were used simply as a cache service, and some companies' DBAs liked Memory for that. With InnoDB's improvements, however, the Memory engine is rarely necessary any more: if the data set is small, InnoDB can cache essentially all of it anyway.
What about write performance? You might object that InnoDB is disk-based and has to flush to disk on writes. But on one hand SSDs have spread quickly, and on the other hand, a very important reason, Memory handles concurrency poorly. It looks fast for single-threaded reads and writes, but if two threads update the same table together they have to queue, because Memory lacks the row locks that InnoDB provides. So for typical usage scenarios Memory is actually worse than InnoDB, and it gradually fell out of use; Tencent Cloud CDB, for example, does not recommend, or even allow, users to create Memory tables. Veteran players may also know Infobright, a columnar storage engine that can serve as an engine for OLAP workloads.
What is Blackhole, the black-hole engine? It keeps only the table structure: data written to it is simply not stored and is gone immediately. Yet it does not report an error; it tells you the write succeeded, which is what makes it look like a black hole. We will cover use cases for this kind of engine shortly.
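The behavior just described can be seen in a couple of statements (the table and column names are hypothetical):

```sql
-- BLACKHOLE keeps the structure but discards every row written to it
CREATE TABLE t_hole (id INT, msg VARCHAR(64)) ENGINE = BLACKHOLE;

INSERT INTO t_hole VALUES (1, 'hello');  -- reports success...
SELECT * FROM t_hole;                    -- ...but returns an empty result set
```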
Federated is a remote-link engine: you create a table locally, but the data does not actually live there. You define where the data source is, and queries against the local table go to that other place to fetch it.
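A minimal sketch of what that looks like, assuming a remote table `t_remote` in database `db1` on a host at `192.0.2.10` with credentials `app_user`/`app_pass` (all of these are illustrative; the FEDERATED engine must also be enabled in your build):

```sql
CREATE TABLE t_remote (
  id   INT PRIMARY KEY,
  name VARCHAR(32)
) ENGINE = FEDERATED
  CONNECTION = 'mysql://app_user:app_pass@192.0.2.10:3306/db1/t_remote';
-- Queries against this local table are forwarded to the remote server
```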
There are also the RocksDB and TokuDB engines, which are transaction engines based on LSM trees and fractal trees respectively rather than on B+trees. Their most visible trait is a compression ratio higher than InnoDB's, mainly because they compress larger blocks: the larger the unit of data being compressed, the better the compression tends to be. TokuDB, for instance, compresses in 4 MB blocks, hence its high compression ratio. The trade-off, of course, is somewhat more CPU spent on reads.
These are the more common engines; you could certainly list others, but these are the ones most used in the community.
So what advantage does this multi-engine architecture give MySQL? As I just said, when a business needs both TP-style transactions and some reporting and analysis, switching to another database such as PostgreSQL or Oracle makes it hard to get both capabilities at once. With MySQL you do not have to switch: I can create table A as InnoDB and table B as Infobright, and keep the data synchronized between them. OLTP queries then go to the InnoDB table and AP queries go to the Infobright table; architecturally, at least, this works. Some of you may find hybrid engines strange and rarely seen, but in fact this is what MySQL itself does naturally. We all use InnoDB as the default engine, yet in 5.6, 5.7 and earlier versions, MySQL's own system library used MyISAM: the user table and the other grant tables were stored in MyISAM. So MySQL itself practices the hybrid-engine approach, though of course it has to be done very carefully.
For example, consider the consistency of cross-engine transactions. We know InnoDB supports transactions and MyISAM does not. Suppose a database has two tables: T1 is a MyISAM table and T2 is an InnoDB table. Used separately, there is no problem. But say we start a transaction with BEGIN, insert a row into T1, insert a row into T2, and then execute a ROLLBACK. Since nothing was committed after BEGIN and the sequence ended with a rollback, we would expect both inserts to be undone; that is the atomicity of a transaction, either all succeed or all fail.
But because MyISAM does not support transactions, the INSERT into T1 is persisted the moment it executes. So after this sequence completes, the row in T1 is still there while the row in T2 is gone, which violates the atomicity of the transaction. This is not a bug in the engine; it is a mistake in how we wrote the statements. It makes one thing clear: transactional behavior is implemented inside the engine, so engines that support it give it to you, and engines that do not simply ignore it. I have used mixed-engine databases in real applications: tables with different engines in the same database are fine, but whenever you rely on an engine feature, transactions, savepoints, full-text search and so on, you must check whether every engine involved supports it; if not, that feature cannot be used, or the engines cannot be mixed for it.
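The broken-atomicity sequence above can be reproduced directly (table names follow the example in the text):

```sql
CREATE TABLE t1 (id INT) ENGINE = MyISAM;
CREATE TABLE t2 (id INT) ENGINE = InnoDB;

BEGIN;
INSERT INTO t1 VALUES (1);   -- MyISAM: persisted immediately, ignores the transaction
INSERT INTO t2 VALUES (1);   -- InnoDB: held inside the transaction
ROLLBACK;                    -- MySQL also emits a warning about the non-transactional table

SELECT COUNT(*) FROM t1;     -- 1: the MyISAM insert survived the rollback
SELECT COUNT(*) FROM t2;     -- 0: only the InnoDB insert was undone
```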
You may say this scenario does not come up much. Then look at the next figure, which puts InnoDB and Infobright together. MySQL supports this natively, but there is a problem: it does not look very professional. Your AP requests running on Infobright work, but after all, the two engines share the same server layer, so won't they compete with each other for CPU? Indeed they will. So how do we split them apart? See the figure below.
You can build two separate MySQL instances: the master instance uses the InnoDB engine, and the slave instance uses the Infobright engine. MySQL's master-slave replication works across indexes and across engines, so if there is an InnoDB table T1 on the master, you can manually convert T1 on the slave into an Infobright table (or another engine), and replication still works: inserts, deletes and updates on the master flow into the same-named Infobright table on the slave. AP-style queries then go straight to the slave, where the data is stored in columns, which is fast enough for OLAP requests. That is one use.
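A sketch of the slave-side conversion step, assuming replication is already set up normally; `BRIGHTHOUSE` is used here as Infobright's engine name, which you should verify against your actual build:

```sql
-- Run on the slave only
STOP SLAVE;
ALTER TABLE t1 ENGINE = BRIGHTHOUSE;  -- convert the replicated copy to the columnar engine
START SLAVE;
-- Changes replicated from the InnoDB t1 on the master are now applied to
-- the columnar t1 here, and AP queries can be sent to this slave
```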
There are many extended versions of this pattern, for example the channel in the middle. In the figure it is MySQL's native master-slave replication, but you could instead put a data transfer component (DTS) in the middle that pulls logs from the master and applies them on the slave; the idea is the same. What problem does this solve? The master and the slave hold the same data but serve different query logic, and the slave needs to provide a capability that InnoDB does not, so it uses a different engine. This scenario still comes up occasionally.
Here is another scenario: what is a black-hole engine actually good for?
Go back to the left side of the diagram. The left side is the same: the business writes TP traffic into MySQL, which holds InnoDB tables, and replicates to a slave whose tables are changed to the Blackhole engine. As we just said, Blackhole keeps only the table structure and discards the data, so why send anything there at all? For the binlog. MySQL's mechanism is that when data and log are written on the master, synchronizing to a slave requires shipping the binlog to it. What does the slave do with the log? Two things: first it stores the log locally, then it applies it, generating its own log in turn. If the tables are Blackhole, the apply step becomes a no-op: the log arrives, execution is empty, there is no data, but the log is still there. So this node can simply serve as binlog storage, and you no longer need to copy binlogs off the master yourself; copying from the master is not real-time, while this way you receive the binlog in real time. Of course, this only shows that MySQL natively supports such a mechanism; there is plenty of excellent community software that does the same thing by simulating the behavior of a binlog server: it requests binlogs from the master, does not apply them, and keeps them locally. The point is that if you want to build a simple binlog server, a Blackhole-engine slave will do, which is quite practical.
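A minimal sketch of such a "binlog server" slave, with illustrative host, user, and file names; this works most cleanly with statement-based replication, since row events behave differently against Blackhole tables:

```sql
-- 1. Point the slave at the master as usual
CHANGE MASTER TO
  MASTER_HOST     = '192.0.2.10',
  MASTER_USER     = 'repl',
  MASTER_PASSWORD = 'repl_pass',
  MASTER_LOG_FILE = 'binlog.000001',
  MASTER_LOG_POS  = 4;

-- 2. Convert the tables to BLACKHOLE so applying the relay log is a no-op
ALTER TABLE t1 ENGINE = BLACKHOLE;

START SLAVE;
-- With log_slave_updates enabled, this node keeps producing a complete
-- binlog stream while storing essentially no table data
```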
Some of you will say this looks a little clumsy, so why not just use those community solutions? But if we extend the idea further, other scenarios appear, for example distributed ones. Distributed setups usually need multiple nodes because there is an election, right? Say we want a high-availability cluster spanning data centers and cities: there might be five nodes in city A and four in city B, nine nodes running happily, and when one dies an election picks a successor. What is the problem? Cost, because every node holds a full copy of the data. Is there a way to save? People have built exactly this scheme before: those nine nodes do not all really need the data; some of them only need to participate in the election, voting without wanting to be elected. Such a node can use the Blackhole engine: it synchronizes the replication stream so it can interact with the others and vote, while marking itself as not electable. You still pay for nine nodes, but only five or four of them need real storage; the rest are simulated by binlog servers. This framework has been implemented in practice.
MySQL has been used at scale in China for twelve or thirteen years, and all kinds of architectures have appeared along the way. Here is another scenario: you delete data by mistake and have to restore. Suppose the database holds a business data table A and a business log table B, and as usual the log table is much larger than the data table, because it records every transaction flow. To restore the database, you restore yesterday's full backup and then replay the binlog to catch up to the point in time you want. But you are in a hurry, and yesterday happened to be a high-pressure day with a huge number of updates to the log table, so after the full restore you find yourself spending most of the replay time waiting for the log table's updates to be applied. If you know you do not need the log table for now and just want the data table back first, what can you do? There are many approaches, but here is how Blackhole helps: empty the log table or move it aside, create an empty table with the same name using the Blackhole engine, and then start replaying the binlog. The replay now races through the log-table updates, because the engine sees it is Blackhole and skips the work entirely, so you quickly reach the goal of restoring the data table. So supporting multiple engines is genuinely interesting: these are not mainstream applications, but stand-alone MySQL supports the capability.
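The log-table swap described above might look like this during the recovery (`biz_log` is an illustrative name for table B):

```sql
-- Park the big log table and put a same-named BLACKHOLE table in its place
RENAME TABLE biz_log TO biz_log_bak;
CREATE TABLE biz_log LIKE biz_log_bak;
ALTER TABLE biz_log ENGINE = BLACKHOLE;

-- Now replay the binlog: updates to biz_log are discarded almost for free,
-- so the data table you actually need catches up much faster
```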
Now let's talk about MySQL's high-availability architecture. Everyone knows the basic recipe: one master with two standbys, or at least one master and one standby. You write to the master, and when the master dies you switch to the standby; cutting over from the left node to the right node gives you HA. Only if A and B fail together are you out of luck, and that probability is much lower: if one machine failing is a 1-in-1,000 event, two failing together is roughly a 1-in-1,000,000 event, which is already very small, so this is how it is generally done. The mainstream implementation configures A and B as a mutual master-standby pair so that switchover is faster: you only have to redirect the client from the left to the right. This is the classic MySQL high-availability architecture, but it is not today's focus; just note that when I say "a database node" later, it implicitly means a master-standby pair.
Back to performance. How do we solve read performance? Add machines! A DBA has two core skills: the first is restarting things, the second is adding machines. (laughs)
So what does adding machines look like? One master, many slaves. In the diagram, besides A and A' acting as master and standby for high availability, the nodes B, C and D take no business writes directly; they only synchronize data from node A. When you write to A, the write is replicated to A' for high availability, and the log is also applied on B, C and D, after which we consider their data identical to A's. Clients can then read from B, C and D, so if A cannot withstand the read pressure, you divert read requests to them. It is a fairly simple solution. Tencent Cloud MySQL has a feature called an RO group for this. Suppose I have three read-only instances: creating a fourth would normally force every client to change its configuration. Most cloud services provide this capability: you put the read-only instances into an RO group that shares a single access point, the same domain name or the same IP. Then you write to one address and read from one address, and the RO group handles query round-robin and traffic sharing for you.
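On each read-only node, the typical protection looks something like this (a sketch; the exact policy varies by deployment, and `super_read_only` requires MySQL 5.7+):

```sql
SET GLOBAL read_only = ON;         -- reject writes from normal client accounts
SET GLOBAL super_read_only = ON;   -- also reject writes from SUPER accounts
-- Replication from the master still applies its changes, so the node keeps
-- serving fresh data while all client writes are forced to go to the master
```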
But some will say even that is troublesome: can't writes and reads share the same IP? Otherwise I still have to know where to write and where to read, since there are two IPs, one for writing and one for reading. Is there a lazier way to add machines, a fully transparent way? Yes: add a proxy in between.
With this middle layer added, the underlying architecture stays the same; the difference is that you talk to the proxy, and the proxy does the routing: write operations go to the master, read operations go to the read nodes below. Essentially it is the same design, but you are spared the management and no longer need to think about scaling.
Tencent's TDSQL, for example, supports this read/write-splitting mode. If you want another node, you do not have to do anything yourself: you submit the request for a new node, and internally the system clones an E node for you, restores the full data, catches up on the logs, establishes the master-slave relationship with A, and so on; when it finishes, you have a new node. That solves our read performance problem. What about write performance? The two modes we just saw share one trait: you can read from many nodes, but write to only one. What if A cannot keep up with writes? Sharding: splitting databases and tables. Write scalability ultimately relies on distributed sharding. TDSQL also supports a sharded mode: writes come in and get routed, with the routing rule agreed with the database in advance, for example a table sharded by taking the user ID modulo the shard count. That is the common sharding pattern.
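The user-ID-modulo routing just mentioned can be sketched like this; the table names and the shard count of 4 are illustrative, and in a real deployment the proxy or application performs the routing:

```sql
-- Four physical shards of one logical user table
CREATE TABLE user_0 (user_id BIGINT PRIMARY KEY, name VARCHAR(32));
CREATE TABLE user_1 LIKE user_0;
CREATE TABLE user_2 LIKE user_0;
CREATE TABLE user_3 LIKE user_0;

-- Routing rule: shard = user_id % 4
-- user_id = 10  ->  10 % 4 = 2  ->  user_2
INSERT INTO user_2 (user_id, name) VALUES (10, 'alice');
SELECT * FROM user_2 WHERE user_id = 10;
```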
There are newer patterns too, such as compute-storage separation, but the idea is similar: spread the read or write pressure across horizontally added nodes, and thereby raise the throughput of the whole system. Still, MySQL is not omnipotent: however many schemes you devise, some scenarios remain unsupported.
OLAP, for example. As we just said, Infobright works, but it is not used much any more; dedicated systems handle OLAP now. We will come back to an example later.
Or graphs. Relational databases like MySQL are, as we all know, standard two-dimensional tables. Describing a graph in them forces recursive queries, which has an inherent performance problem. You can do it, since every relationship can ultimately be expressed in two-dimensional tables; it is just that this kind of relationship is handled poorly and slowly.
Or time series. Can you use an InnoDB table to store time-series data? Of course: inserts append at one end, reads scan forward, deletes trim from the other end. It looks like a first-in, first-out queue, and it works, but it is wasteful: a time-series scenario only needs FIFO behavior, while InnoDB provides full random read/write access to all the data in between, capability you pay for but never use. That waste shows up as mediocre performance even in the simplest time-series workloads.
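The FIFO pattern described above, with an illustrative schema:

```sql
CREATE TABLE ts_data (
  id BIGINT AUTO_INCREMENT PRIMARY KEY,
  ts DATETIME,
  v  DOUBLE
) ENGINE = InnoDB;

INSERT INTO ts_data (ts, v) VALUES (NOW(), 1.0);  -- append at the tail
DELETE FROM ts_data ORDER BY id LIMIT 1000;       -- expire from the head
-- InnoDB can do this, but its B+tree buys random access you never use;
-- a purpose-built time-series engine makes the same pattern much cheaper
```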
And search. We all know the early MyISAM engine supported some full-text search, and later, to help replace MyISAM, the official team added full-text search to InnoDB as well. But as far as I know, no company that seriously needs full-text search runs it on InnoDB at scale; they build real full-text search engines such as ES. The same goes for OLAP, which has its own dedicated systems. So MySQL still has scenarios it does not fit, and the core reason is data structure. As I mentioned in "MySQL Practice: 45 Lectures", when choosing a database, the key is to examine the data structure first. Take a columnar requirement: each query touches a hundred million rows but reads only one column. There are many such scenarios. Can MySQL do it? Yes, but MySQL stores by row, so fetching the first column of a hundred million rows forces it to read every entire row, wasting enormous IO and unnecessary CPU. A purpose-built column store keeps that first column in its own file and reads it directly, cutting a great deal of IO and CPU consumption. That is the advantage the data structure brings.
MySQL can nominally support this too, by bolting suitable engines onto it, but if we stay with mainstream InnoDB it is a poor fit. Likewise search: I clearly want an inverted index, but you hand me a relational database; at bottom it is a data-structure problem. And I think that is fine: MySQL should have scenarios it does not fit, because a database that tried to swallow every scenario would be trouble. Take HTAP as an example. We showed earlier that you can pair MySQL with an Infobright instance, but that is not the most professional setup; after all, it is still a single-machine engine. The more mainstream approach is to replace it with a product that specializes in AP and is genuinely strong at it. The TP side stays as you wrote it, the data transfer channel stays maintained, but the kernel underneath is replaced by a product built specifically for that scenario, such as Tencent Cloud TBase, which supports it natively and solves the problem.
To summarize: although we only discussed performance today, performance is just one of the complex problems in database development, and most of the time not even the most serious one, is it? Security and reliability, those are the hard parts. Moreover, every solution is a compromise. Read/write splitting looks beautiful but does not solve the write problem; sharding looks even more beautiful, but the proxy in the middle then has to handle SQL compatibility, because the proxy now performs many data operations itself and runs into syntax-compatibility problems. Databases will face even more challenges ahead. In my view, intelligent operations is one of the key directions; nearly every system now at least aims for intelligent assistance, and that alone is worthwhile. Business volume keeps growing: when we supported Tencent Meeting during the epidemic, for instance, our intelligent capabilities reduced the DBAs' workload and improved online service quality, freeing the DBAs to do more valuable things.
Q&A
Q1: What do you think of the Weimob data-deletion incident?
A1: I think Weimob was simply unlucky this time, and the incident is quite serious. In the services Tencent runs, we regularly meet customers who need to roll back and recover mistakenly deleted data; another well-known Internet company recently had a database wiped by mistake as well. What then? If you use a cloud database service and this happens, recovery is a routine operation: take yesterday's backup, then download the logs and replay them. Some will say: you put everything on the cloud, so if I am a malicious actor who gets in and deletes both the data and the backups, aren't you in the same position? In fact the cloud considers more angles. Tencent Cloud assumes that even if you genuinely hold permissions over the production database, so you may be able to drop the database, you cannot delete the backups. There are two kinds of backups: regular scheduled backups and backups the user triggers manually. Scheduled backups are not deletable: if you configure one backup per day retained for seven days, you can delete your data, but the seven days of retained backups cannot be deleted, so if disaster really strikes we can guarantee recovery. Do our internal engineers have such authority? No, they don't either. Duties are separated: the engineers who can administer the production servers cannot access the backups. Honestly, by now each company's understanding of database capabilities does not differ much, especially given how talent circulates, so the top database people at different companies are at similar levels. One real benefit of the cloud is that it has been through every kind of unexpected incident, and through continuous iteration it has built solutions that cover the vast majority of scenarios.
Q2: TBase is an HTAP database based on PG; why not consider adding OLAP capability to TDSQL?
A2: We will consider it for TDSQL. As I just said, TDSQL's architecture already supports read/write splitting and sharding through the proxy middle layer, and it can already be considered fully compatible with the MySQL protocol. If you want AP, nodes can in fact be changed into AP nodes; it is on our roadmap, it just has not been productized and put on the cloud yet.
Q3: What mature, reliable solutions exist for MySQL sharding?
A3: The most mature approach I see is to shard along one dimension, chosen according to your query pattern. Say you run online education and have a student table: you shard by student ID or by the student's city, and basically that is fine; I do not think it is very complicated. The problem you then have to solve is cross-shard queries: if you shard by city and a query needs to count all nine-year-old students, you must fan the query out to every shard, which is expensive and slow. There are solutions: once the business is sharded, for statistical queries that do not follow the sharding key, you can maintain a large aggregation database that gathers the data back and builds whatever indexes it needs. Implementations differ in detail, but in broad strokes this architecture has not produced any new tricks after all these years of development.
Q4: In TDSQL's sharded mode, how good is the MySQL syntax support?
A4: If you use the distributed sharded mode, I strongly suggest running a business regression test before going live. At present it seems more than 90% of customers can use it directly. If your statements contain heavy GROUP BY over large data sets, they are still compatible, but performance will be somewhat worse, or at least different from running locally, and that is where testing matters. Overall, in our experience so far, the compatibility gaps are fairly small.
Q5: Why do companies seldom implement distributed transactions and data resharding on top of middleware-based sharding solutions?
A5: Actually we do, a lot. Several big companies I know are working on distributed transactions, because it cannot be bypassed. Middleware sharding is one direction; distributed transactions across the underlying nodes is the other, and that direction does carry a bigger theoretical challenge. Its advantage is that the proxy layer can be kept very light, even solving proxy compatibility directly and completely, while the hard work is done underneath. That is essentially what TDSQL 3.0, currently under development, is doing; the TDSQL 2.0 we run online today is the standard sharding scheme.
Q6: Can you give some guidance on diagnosing and debugging MySQL?
A6: It depends on what problem you run into. If it is something simple like indexing, that's easy. If a statement is correct but slow, diagnosis tends to be an operational task. For self-service tuning of MySQL you can start with basic diagnostics on the slow query log; Percona Toolkit, for example, has tools you can run directly against the log to get conclusions. There is, however, a pitfall in slow-log analysis: a statement may execute in 200ms before peak hours, and with the slow-query threshold set to 1s you will conclude there is no problem, yet under heavy load it falls over. A system that records the execution of every statement and diagnoses from that is certainly better than the slow log alone. Tencent Cloud MySQL, for example, supports audit logs: once you enable them, all of that information is captured. You can also use DBbrain to diagnose directly and see whether the database currently has a problem; DBbrain will generally tell you what the problem is and how to index for it, so you can understand why the change is recommended and how it works.
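The core trick behind slow-log analysis tools like pt-query-digest is to "fingerprint" statements so that variants differing only in their literal values aggregate under one entry. A toy sketch, assuming the regexes below as simplifications of what real tools do:

```python
import re
from collections import defaultdict

def fingerprint(sql: str) -> str:
    """Collapse literals so statements that differ only in parameters
    group under one fingerprint (a simplification of pt-query-digest)."""
    s = sql.strip().lower()
    s = re.sub(r"'[^']*'", "?", s)   # replace string literals
    s = re.sub(r"\b\d+\b", "?", s)   # replace numeric literals
    return s

def digest(entries):
    """entries: iterable of (sql_text, exec_seconds).
    Returns {fingerprint: {"count": n, "total_time": seconds}}."""
    stats = defaultdict(lambda: {"count": 0, "total_time": 0.0})
    for sql, seconds in entries:
        fp = fingerprint(sql)
        stats[fp]["count"] += 1
        stats[fp]["total_time"] += seconds
    return dict(stats)
```

Sorting the digest by total time rather than per-execution time is what surfaces the "fast but very frequent" statements that a slow-query threshold alone would hide.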
Q7: Can you give examples of the industries and scenarios TDSQL is applicable to?
A7: MySQL's high-availability architecture applies to many scenarios, and the underlying durability configuration differs by scenario. If TDSQL serves a game, it may be fine to run in async mode rather than full synchronization. But when TDSQL goes into the core system of a bank, as with Zhangjiagang Bank, can it be used? We have a configuration that enforces strong synchronization with at least one master and two slaves: after a write, if neither of the two replica nodes acknowledges it, the write does not commit and an error is returned; at least one replica must report success before the data is considered written. That is why TDSQL is positioned as a financial-grade database. Can you still use TDSQL for ordinary read-write splitting? Yes, no problem; it is up to you. And if you don't want writes to stop when both slave nodes are down, you can set it to degrade to asynchronous mode. There are many applicable scenarios; it depends less on the industry than on how critical the data is.
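The commit rules described above, async for latency-tolerant workloads versus strong sync requiring at least one of two replica acks for financial workloads, can be summarized in a few lines. The policy names here are illustrative labels, not actual TDSQL configuration values:

```python
def commit_allowed(replica_acks: int, policy: str) -> bool:
    """Whether a transaction on the master may commit, given how many of
    its two replicas acknowledged the write. Policy names are illustrative
    labels for the behaviors described in the talk, not real TDSQL settings."""
    if policy == "async":        # e.g. a game: never wait for replicas
        return True
    if policy == "strong-sync":  # financial: need >= 1 of 2 replica acks
        return replica_acks >= 1
    raise ValueError(f"unknown policy: {policy}")
```

Under "strong-sync", a write that receives zero acknowledgments is rejected with an error, which is exactly the guarantee a bank core system needs.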
Q8: How do we determine that a MySQL instance currently needs sharding? Can we see it from any parameters?
A8: Generally speaking, when do we need sharding? The first case is obvious: the data volume is too large. If a machine holds only 10T and the database is almost full, sharding is the easy answer. If space is not yet exhausted but performance cannot keep up, there are several indicators, such as write RT and read RT. Another is the read hit rate. InnoDB uses a B+tree structure, so when the tree has few levels, that is, when most of the index pages are in memory, every query completes in memory. As the data grows that becomes impossible: with 1T of data and 100GB of memory, only about a tenth of the data fits in memory, and as the data set gets larger still, the hit rate drops. Where can you see this? The buffer pool hit rate appears in the output of SHOW ENGINE INNODB STATUS. For normal online (OLTP) business the hit rate should be above 99%, say 99.2% or 99.3%. If it falls to 97% or 95%, you may need to act, perhaps before you have even noticed: the business team comes to you asking why their requests are slow. Why does this happen? InnoDB is organized as a B+tree: to access a row you start from the root and walk down several levels to the leaf node. SSDs are fast, but if I/O is under heavy pressure a single I/O can cost on the order of 10ms; when the tree is tall and index traversal has to touch disk, performance slows down. So the memory hit rate is an important reference indicator, and at the core it is still slow queries, that is, business feedback.
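The buffer-pool hit rate discussed above can be computed from two counters MySQL exposes: `Innodb_buffer_pool_read_requests` (logical page reads) and `Innodb_buffer_pool_reads` (reads that missed the pool and went to disk). A small helper, with a hypothetical alarm rule based on the talk's 99% rule of thumb:

```python
def buffer_pool_hit_rate(read_requests: int, disk_reads: int) -> float:
    """Fraction of logical page reads served from the buffer pool.
    read_requests: Innodb_buffer_pool_read_requests (logical reads)
    disk_reads:    Innodb_buffer_pool_reads (missed the pool, hit disk)"""
    if read_requests == 0:
        return 1.0
    return 1.0 - disk_reads / read_requests

def needs_attention(read_requests: int, disk_reads: int,
                    threshold: float = 0.99) -> bool:
    """Hypothetical check: flag the instance when the hit rate drops
    below the ~99% that a healthy online OLTP workload should sustain."""
    return buffer_pool_hit_rate(read_requests, disk_reads) < threshold
```

For example, 1000 logical reads with 30 disk reads gives a 97% hit rate, which by the speaker's rule of thumb is already worth investigating.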
Q9: What do you think are the current pain points and difficulties of cloud databases?
A9: Let me split this into two parts, users first. As a user of a cloud database, I think the first pain point is ease of use. If I am a professional DBA, can the product let me bring my full capabilities to bear? Tencent Cloud's database console already shows a lot of monitoring information, but can I have more? Can the system give me stronger diagnostic capabilities so that I can locate problems faster, save my energy for the business-architecture and data-architecture problems inside my own company, and turn myself from a DBA into a data architect?
The other part is the cloud provider's side. Tencent Cloud now has a great many customers, and every customer asks why their database is slow; the dozen or so DBAs on our team would go crazy handling it all. The platform itself should have this capability, so problems are not solvable only by DBAs: with a good set of tools, our front-line architects, after-sales staff, and even customer-service colleagues can resolve most issues. With that diagnosability we free up the back end, make the front line's job easier, and give customers the tools to diagnose problems and exercise their own professional skills.
Q10: Besides the cloud database platform's own backups, will the platform automatically back up data when I build my own database on a CVM instance?
A10: If you build a database on a cloud host, the virtual machine itself has snapshot backups, but a snapshot is only a point-in-time image. Say I accidentally delete data at five o'clock this afternoon: the backup taken at midnight is not enough on its own; I also need the binlog in between, and with only a disk-level image you may not be able to roll forward precisely to five o'clock. If you use the managed database service you won't have this problem, because we naturally back up both at the instance level and at the log level.
Q11: TDSQL is considering adding AP capability, so what is the difference between TBase's and TDSQL's positioning?
A11: How much AP capability TDSQL adds depends on how big a problem you want it to solve. An analogy: suppose I want to add search capability to TDSQL. Besides InnoDB's full-text search, I could also plug in a SphinxSE engine, and with Sphinx attached it runs fine. But if you ask it to power, say, the search box on Tencent's home page, it will certainly not hold up; that needs a dedicated search engine. Likewise, adding AP capability to a TP system can only serve analysis that sits close to the TP workload and doesn't need that much computation. This scenario is actually very common, because its advantage is that the data forms a closed loop: once data enters TDSQL it doesn't have to leave; you write here, the analysis runs here, and you take the results away directly. That is convenient for users, but such a solution should be positioned as a small AP system. There are many scenarios like this: reports for your boss, daily summaries, operations-analysis systems. Adding AP capability to relational TP data works well for those; but if you want to build it into a super analytics engine, that is not what we are here to do.
Q12: Will the proxy become a single point of failure?
A12: No. In fact there can be many of them; the proxy is the least likely component to become a single point. For example, our TDSQL deploys three proxies by default, and Tencent Cloud Redis deploys five. Proxies are easy to scale out because they are stateless: if one goes down, I can simply start another on any machine, so the proxy does not easily become a single point.
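Because proxies are stateless, a client can treat them as an interchangeable pool. A toy round-robin picker that skips dead proxies illustrates why losing one node does not interrupt service (class and names are illustrative, not TDSQL's client logic):

```python
import itertools

class ProxyPool:
    """Stateless proxy tier: any live proxy can serve any request,
    so losing one proxy is not a single point of failure."""
    def __init__(self, addresses):
        self.addresses = list(addresses)
        self._cycle = itertools.cycle(self.addresses)
        self.down = set()

    def mark_down(self, addr):
        self.down.add(addr)

    def mark_up(self, addr):
        self.down.discard(addr)

    def pick(self):
        """Round-robin over live proxies; raise only if none are left."""
        for _ in range(len(self.addresses)):
            addr = next(self._cycle)
            if addr not in self.down:
                return addr
        raise RuntimeError("all proxies are down")
```

Restarting a failed proxy is just `mark_up` plus starting the process again; no state needs to be recovered.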
Q13: TDSQL accesses the database through the gateway in one-master, one-slave mode. During a query stress test, CPU utilization on the primary node is very high while the standby node's is very low. What might be the cause?
A13: With only one master and one slave, first check your routing rules; you may be sending all requests to the master. TDSQL is monitored, so you can see how many queries land on the master node and on the slave node respectively. If all queries hit the master and none hit the slave, you probably need to adjust your routing configuration.
Q14: How do you resolve data inconsistency in a MySQL dual-master model?
A14: Every time this comes up I make a particular point of it. In the diagram we showed, one side is always read-only: node B is read-only in state 1, and node A is read-only in state 2. That is the common practice. But are there genuine multi-writer schemes? There are several. This topology can also take writes on both sides, that is, clients write to both A and B. With current, traditional MySQL capabilities, it is then up to the business side to guarantee that the writes do not conflict: the left side writes one city's data, the right side writes another city's, and the two sides replicate to each other without conflict. There are also MGR and InnoDB Cluster, which perform conflict detection between transactions themselves; as far as I know they are not used much in China, or at least not with both nodes taking normal writes, because even when conflicts can be avoided, writing on both sides is expensive: every transaction has to ask the other node whether it may commit, committing only on approval and aborting otherwise, which slows things down. The architectures that genuinely multi-write are, as far as I know, used by some companies only during an A-to-B switchover. Look at the architecture in the diagram: if switching from A to B means a window in which A stops taking writes and B only takes requests after replication catches up, then supporting multi-write lets me accept writes on both A and B simultaneously for a very short period while draining A's remaining updates. The benefit is that the business never observes a pause in writes. But it is only used for that short window; sustained multi-writing naturally runs up against the laws of physics.
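The conflict-avoidance pattern described above, where each master "owns" a disjoint portion of the data (here split by city) so the two write streams can never touch the same rows, can be sketched as follows. The ownership map and names are illustrative assumptions:

```python
# Illustrative dual-master conflict avoidance: each master owns a
# disjoint set of cities, so bidirectional replication never has to
# merge two direct writes to the same rows.
OWNER = {
    "shenzhen": "A", "chengdu": "A",   # master A's partition
    "beijing": "B", "shanghai": "B",   # master B's partition
}

def route_write(city: str) -> str:
    """Return which master a row for this city must be written to."""
    try:
        return OWNER[city]
    except KeyError:
        raise ValueError(f"no owner configured for city: {city}")

def write(city: str, row: dict, masters: dict) -> None:
    """Append the row only on the owning master; the other side
    receives it via replication, never via a conflicting direct write."""
    masters[route_write(city)].append((city, row))
```

The guarantee here is purely a business-side discipline, which matches the talk's point that plain MySQL leaves conflict avoidance to the application rather than detecting conflicts itself.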
Q15: Do you recommend MySQL 8.0 now?
A15: Recommended. We have been testing 8.0 in Tencent Cloud's test environment for more than a year, and its performance is considerably better than 5.7's. We expect to launch a MySQL 8.0 version in the first half of this year.
Q16: Our business used to run on Oracle, which has strong single-node capability. Can it be replaced by TDSQL or TBase now?
A16: I think so. Replacing Oracle with a MySQL-based system has already been proven feasible.
Q17: Master-slave replication is asynchronous; how do you ensure consistency of the data you read?
A17: This is a big topic and we really don't have time for it today; there is an article about it in "MySQL 45 Lectures". There are several approaches: deliberately delay the read, route sensitive queries to the master, and so on. Some cases can also be solved with GTID.
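One of the GTID-based techniques alluded to is "read your writes": the client remembers the GTID of its last write and only reads from a replica that has already applied it (real MySQL exposes `WAIT_FOR_EXECUTED_GTID_SET()` for this). A simulation sketch; these classes are stand-ins for illustration, not a client library:

```python
class Primary:
    """Stand-in for the master: each committed write returns a GTID-like
    token the client can carry to its subsequent reads."""
    def __init__(self):
        self.seq = 0

    def write(self, row) -> str:
        self.seq += 1
        return f"src:{self.seq}"  # token identifying this transaction

class Replica:
    """Stand-in for an asynchronous replica."""
    def __init__(self):
        self.applied = set()

    def replicate(self, gtid: str) -> None:
        self.applied.add(gtid)

    def can_serve(self, required_gtid: str) -> bool:
        # Real MySQL analog: SELECT WAIT_FOR_EXECUTED_GTID_SET('<set>', timeout)
        return required_gtid in self.applied
```

The client-side rule is then: after `gtid = primary.write(row)`, read from a given replica only once `replica.can_serve(gtid)` is true, otherwise fall back to the master or wait.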
Q18: How do I get started with the MySQL source code?
A18: If you really want to get into the MySQL source code, you can do this: first download and build a copy of the source, then enable the debug log and execute a simple statement. The debug log lists every function called, so you can follow those functions into the source to see where things happen. Then set yourself a small exercise. For example, executing SELECT 1 should of course return 1; now give yourself a requirement, say "write a bug so that it returns 2", and see whether you can do it. And of course, if you have a solid development background, you can consider joining our team; this is what we do every day.
Q19: With read-write splitting, the master-slave synchronization delay is very large and we read stale data; isn't that harmful? We have tried some tuning parameters. Is it time to shard?
A19: If the slave has the same configuration as the master, the master handles both reads and writes, and the slave lags even with no query pressure on it, the parallelism is probably insufficient; you may not have enabled parallel replication, so try turning it on. If instead the slave lags because heavy query load drives its CPU too high, add more slaves. Sharding is of course also a solution: you need it when the master's write volume is so large that even with parallel replication enabled the slave cannot keep up. That topic is more involved; you can search for articles on it.