Earlier, we introduced a MySQL architecture based on MyCAT that eliminates single points of failure at every layer: the MySQL databases, the MyCAT cluster, HAProxy (or LVS), and the MyCAT daemon. In terms of high availability and load balancing, it is ready for production.
However, as data volumes and the number of databases keep growing, management becomes the problem, especially when you are the only DBA on the project team; the pressure is real. Without a tool to manage the whole cluster, efficiency stays very low. At present there are 16 MySQL instances behind our MyCAT (masters only; there are another 16 slaves). Each instance is divided into about 10 sub-databases, roughly 160 in total. Because a single logical table can reach billions or even tens of billions of rows, the large tables in each sub-database still hold tens of millions of rows.
Below are some of the problems we have encountered; the only way to improve my own efficiency was to build a tooling platform.
Some of the platform's functions are already in use; others are still under development and being improved. Here are the main features:
(Note: the platform builds on a lot of open source software; we did not want to reinvent the wheel, and most features are practical tools driven by real problems. For example, the platform extends the open source project Yearning, though with major changes, and the back end uses open source software such as inception, MHA, and gh-ost. Most importantly, we want to manage the cluster through MyCAT rather than manage each database individually.)
I. Review of SQL statements
What we as DBAs need to focus on is helping developers write efficient and correct SQL.
The open source tool inception (which now appears to be closed source) provides SQL auditing for a single database, but it does not address MyCAT's distributed structure, nor does it check efficiency. Its SQL conformance checks, however, are good, and we still use them.
In the past, with a single database, developers could generally write correct SQL; it might not always be efficient, but it worked. Now that sharding is transparent to developers, correctness itself can become a problem.
We check query SQL on three points:
1. Whether there is a cross-shard join. Although MyCAT supports cross-shard joins (and only between 2 tables), they must be written differently, with annotations added in front, and performance is also a concern. So we do not allow them.
2. How many shards the SQL statement runs in. When there are few shards, a statement executing in multiple databases may not be a problem, but once there are many shards and a large volume of such queries, it puts great pressure on MyCAT itself, degrading MyCAT's processing capacity, delaying returned data, and so on. For example, analysis of one of our queries by account number showed it would execute in 139 sub-databases, a clear risk.
3. Efficiency of the SQL statement in the backend databases. We usually use EXPLAIN to check the execution plan. When there are many shards in the back end, might some of them have index problems? (A sketch of checks 2 and 3 follows this list.)
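To make checks 2 and 3 concrete, here is a minimal sketch, not our platform code. It assumes MyCAT 1.6 behavior, where an EXPLAIN sent to the MyCAT service port returns one row per routed data node, and a hypothetical shard list; hosts, credentials, and the shared sub-database name are placeholders:

```python
# audit_sketch.py - minimal versions of checks 2 and 3.
# Assumption: MyCAT (1.6) answers "EXPLAIN <sql>" on its service port
# with one (DATA_NODE, SQL) row per shard the statement routes to.
import pymysql

MYCAT = dict(host="mycat.example", port=8066, user="audit", password="...")
SHARDS = [dict(host="db%d.example" % i, port=3306, user="audit", password="...")
          for i in range(1, 17)]
MAX_ROUTED = 10  # check 2: flag statements that fan out too widely

def routed_shard_count(sql):
    """Check 2: ask MyCAT how many data nodes the statement routes to."""
    conn = pymysql.connect(**MYCAT)
    try:
        cur = conn.cursor()
        cur.execute("EXPLAIN " + sql)
        return len(cur.fetchall())
    finally:
        conn.close()

def explain_on_all_shards(sql, database):
    """Check 3: run EXPLAIN on every backend shard, flag full table scans.
    For simplicity this assumes each instance hosts a sub-database with
    the same name, which is not true in a real sharded layout."""
    problems = []
    for shard in SHARDS:
        conn = pymysql.connect(database=database, **shard)
        try:
            cur = conn.cursor(pymysql.cursors.DictCursor)
            cur.execute("EXPLAIN " + sql)
            for row in cur.fetchall():
                if row.get("type") == "ALL":      # no index used on this shard
                    problems.append((shard["host"], row.get("table")))
        finally:
            conn.close()
    return problems

sql = "SELECT * FROM t_order WHERE account_no = '...'"
n = routed_shard_count(sql)
print("routes to %d shards%s" % (n, " - too many!" if n > MAX_ROUTED else ""))
for host, table in explain_on_all_shards(sql, "orders_00"):
    print("full table scan on %s.%s" % (host, table))
```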
These checks are all provided by our SQL audit function. With it, developers can largely evaluate their SQL themselves, though in the end I still review every statement before it takes effect.
We classify SQL statements into risk levels 1-10 and assign permissions by level. For example, level 1 can be reviewed by general ops staff, while level 2 and above requires DBA review.
II. DDL operation
Suppose you want to create a new table. That depends on how the table is to be sharded: a global table? A sharded table? Unsharded? And so on.
If it is a global table or a sharded table, the first step is to determine which sub-databases to create it in; without tools this is not easy. We wrote scripts to help, but they were never very convenient. Adding the table afterwards to the configuration file of every MyCAT node in the cluster is also inconvenient.
Table creation must follow standards. For example, tools such as pt-osc and gh-ost require a primary key (or unique index), which developers often forget. Comments, character sets, and so on are even more easily ignored. So we are especially grateful to inception for helping us enforce these standards.
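A sketch of the table-creation fan-out; the shard list, credentials, and the ten-sub-database naming scheme are hypothetical placeholders, while the DDL itself illustrates the norms inception checks for us: an explicit primary key, column and table comments, and an explicit character set:

```python
# create_table_sketch.py - fan out a norm-compliant CREATE TABLE to all shards.
import pymysql

SHARDS = [dict(host="db%d.example" % i, port=3306, user="ddl", password="...")
          for i in range(1, 17)]

# The norms inception enforces: primary key, comments, explicit charset.
DDL = """
CREATE TABLE t_order (
  id         BIGINT UNSIGNED NOT NULL AUTO_INCREMENT COMMENT 'surrogate pk',
  account_no VARCHAR(32)     NOT NULL COMMENT 'sharding key',
  amount     DECIMAL(16,2)   NOT NULL COMMENT 'order amount',
  created_at DATETIME        NOT NULL COMMENT 'creation time',
  PRIMARY KEY (id),
  KEY idx_account (account_no)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COMMENT='order table'
"""

for shard in SHARDS:
    conn = pymysql.connect(**shard)
    try:
        cur = conn.cursor()
        for n in range(10):                      # 10 sub-databases per instance
            cur.execute("USE orders_%02d" % n)
            cur.execute(DDL)
    finally:
        conn.close()
```

The remaining chore, adding the new table to the configuration of every MyCAT node, is exactly what the rollout automation in section IV handles.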
As for DDL on existing tables, that is even more painful. At our table sizes, even split into more than 100 shard tables, each single table still holds tens of millions of rows. We all know that running ALTER TABLE directly on a large MySQL table can have terrible consequences. So we also need the ability to batch-process DDL statements.
The popular way to do this now is through open source tools such as pt-osc or gh-ost. gh-ost is less intrusive and more controllable, so we adopted it. As shown on the page, once you set the schedule, it basically executes automatically.
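What the scheduler runs per sub-database is essentially a gh-ost invocation like the sketch below. The flags are real gh-ost options; the hosts, credentials, and sub-database naming are placeholders:

```python
# ghost_batch_sketch.py - run one gh-ost migration per sub-database, serially.
import subprocess

SHARDS = ["db%d.example" % i for i in range(1, 17)]
ALTER = "ADD COLUMN remark VARCHAR(255) NOT NULL DEFAULT '' COMMENT 'remark'"

for host in SHARDS:
    for n in range(10):                          # 10 sub-databases per instance
        cmd = [
            "gh-ost",
            "--host", host,
            "--user", "ghost", "--password", "...",
            "--database", "orders_%02d" % n,
            "--table", "t_order",
            "--alter", ALTER,
            "--allow-on-master",                 # we connect to the master here
            "--max-load", "Threads_running=30",  # back off under load
            "--chunk-size", "1000",
            "--execute",                         # without this it is a dry run
        ]
        subprocess.run(cmd, check=True)          # stop the batch on first failure
```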
III. High availability: MHA monitoring and operation
As described above, we built MySQL high availability with MHA. Once there are many databases, monitoring and operating everything by hand no longer works. Zabbix and Nagios can cover some of the monitoring. For operations, the command line is fine in the back room, but it becomes inconvenient with many instances. So we also need a platform for MHA monitoring and operations. With it, when a manual switchover is needed we can trigger it directly from the platform, so that even ordinary ops staff can perform a switchover as long as they are authorized.
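Underneath the platform's buttons are thin wrappers around MHA's own command-line tools, masterha_check_status and masterha_master_switch. A minimal sketch, with hypothetical config paths and hosts:

```python
# mha_ops_sketch.py - thin wrappers around MHA's command-line tools.
import subprocess

def mha_status(conf):
    """Return (alive, detail) for the MHA manager of one cluster.
    masterha_check_status exits 0 when the manager is running."""
    r = subprocess.run(["masterha_check_status", "--conf", conf],
                       capture_output=True, text=True)
    return r.returncode == 0, r.stdout.strip()

def mha_manual_switch(conf, new_master):
    """Planned switchover while the current master is still alive."""
    subprocess.run(["masterha_master_switch",
                    "--conf", conf,
                    "--master_state=alive",
                    "--new_master_host=" + new_master,
                    "--orig_master_is_new_slave",
                    "--interactive=0"], check=True)

alive, detail = mha_status("/etc/mha/app1.cnf")   # hypothetical cluster config
print(detail)
if alive:
    mha_manual_switch("/etc/mha/app1.cnf", "db2.example")
```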
IV. Propagating MyCAT configuration changes
HAProxy load-balances the MyCAT cluster, so a database request may land on any MyCAT node; the configuration of all MyCAT nodes must therefore be kept consistent. With a single MyCAT this is not an issue, but with multiple MyCAT nodes it becomes a risk point.
MyCAT is configured through files; there is no communication between MyCAT nodes, and thus no synchronized locking or waiting. (It is said that version 2.0 will introduce MyCAT clustering, which may remove this problem.)
Ideal mode of operation: MyCAT configuration is managed centrally. After a unified configuration is published, each MyCAT stops admitting new requests, loads the new configuration, and once the new configuration takes effect, is unlocked again.
Realistic mode of operation, during off-peak hours:
1. Leave only one MyCAT enabled in the HAProxy configuration and comment out all the others.
2. Change and reload the configuration of the commented-out MyCAT nodes, then uncomment them, comment out the one left behind, and reload the HAProxy configuration.
3. Change the configuration of that last MyCAT.
4. Finally, re-enable all MyCAT nodes.
Done by hand, this sequence is error-prone, so we turned it into an automated, scheduled program.
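A sketch of that scheduled rollout. It assumes HAProxy's admin socket is enabled (its disable server/enable server commands are real) and that MyCAT accepts the reload @@config management command on port 9066; the socket path, backend name, node names, and credentials are placeholders, and copying the new schema.xml/rule.xml to each node is left out:

```python
# mycat_rollout_sketch.py - drain MyCAT nodes one by one behind HAProxy,
# reload their configuration, then put them back into rotation.
import socket
import pymysql

HAPROXY_SOCK = "/var/run/haproxy.sock"           # hypothetical admin socket
BACKEND = "mycat_back"                           # backend name in haproxy.cfg
MYCATS = ["mycat1", "mycat2", "mycat3"]          # server names in haproxy.cfg

def haproxy_cmd(cmd):
    """Send one command to HAProxy's admin socket and return the reply."""
    s = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    s.connect(HAPROXY_SOCK)
    s.sendall((cmd + "\n").encode())
    out = s.recv(4096).decode()
    s.close()
    return out

def reload_mycat(node):
    """Ask MyCAT (management port 9066) to load the already-copied config."""
    conn = pymysql.connect(host=node, port=9066, user="mycat", password="...")
    try:
        conn.cursor().execute("reload @@config")
    finally:
        conn.close()

# Update every node except the first while the first keeps serving, then swap.
for node in MYCATS[1:]:
    haproxy_cmd("disable server %s/%s" % (BACKEND, node))   # drain traffic
    reload_mycat(node)
    haproxy_cmd("enable server %s/%s" % (BACKEND, node))    # back into rotation

haproxy_cmd("disable server %s/%s" % (BACKEND, MYCATS[0])) # now the last one
reload_mycat(MYCATS[0])
haproxy_cmd("enable server %s/%s" % (BACKEND, MYCATS[0]))
```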
Suggestion: if MyCAT provided a configuration version number, it would be easier for us to detect (monitor) whether each node's configuration is current. With a timestamp as the version number, we could compare whether the configuration is consistent across MyCAT nodes.
V. MySQL instance migration and scheduling management
Should we run many databases under one MySQL instance, or one database per instance?
At our current scale, the largest logical database is split into more than 100 sub-databases; others are single databases; more than 150 databases in total. The larger logical tables typically hold billions to tens of billions of rows.
With this many shards comes a question: how many MySQL instances should we have?
Start from the status quo: we currently run 15 MySQL servers, with about 10 databases per instance and roughly 1 TB of data per instance.
Migrating, backing up, and restoring a single MySQL instance of about 1 TB is quite difficult (and no, having a slave does not mean you can skip backups; that is nonsense).
The reality is that each of our MySQL instances is very large and there is no shared storage, so we cannot reschedule them quickly (the network is only gigabit, and migrating that much data takes too long).
If a large MySQL instance were split into 10 small instances of about 100 GB each, things would be much more flexible.
Assuming each MySQL instance could be quickly rescheduled to whatever server needs it (as long as the move takes less than the tolerable outage window), in theory every server could run at 90% resource utilization. When one server dies, its 10 MySQL instances get scheduled onto 10 other servers. It sounds great, but under a four-nines availability requirement it is impossible without shared storage.
If instead each database gets its own MySQL instance, the problem is that we suddenly have more than 150 instances (plus one to two times that number of slaves), and a lone DBA immediately feels like going crazy.
So I insist on placing multiple databases on one MySQL instance, even though the large per-instance data volume is inconvenient.
Once the platform's scheduling management makes 100 instances no harder to run than 10, we can subdivide further. We have not finished this part yet.
VI. Backup and recovery management, and the challenges after sharding
Day-to-day backup and recovery is an important DBA task, but whether the automated backup scripts actually succeeded, and whether the backups are usable, are open questions.
So we need backup and recovery management. (The backup view currently shows the last successful backup time on the MHA screen; other features are under development.)
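Behind the "last successful backup time" view, the runner can be as simple as the sketch below, using Percona XtraBackup's --backup/--target-dir options; the hosts, target paths, and JSON bookkeeping file are placeholders, not our actual implementation:

```python
# backup_runner_sketch.py - run xtrabackup per instance and record success,
# so the platform can show the last successful backup time per instance.
import datetime
import json
import subprocess

SHARDS = [dict(host="db%d.example" % i, port=3306) for i in range(1, 17)]
STATE_FILE = "/var/lib/backup/last_success.json"   # hypothetical bookkeeping

def backup_instance(host, port):
    target = "/backup/%s/%s" % (host, datetime.date.today().strftime("%Y%m%d"))
    r = subprocess.run(["xtrabackup", "--backup",
                        "--host", host, "--port", str(port),
                        "--user", "backup", "--password", "...",
                        "--target-dir", target])
    return r.returncode == 0                       # success only if exit code 0

results = {}
for shard in SHARDS:
    if backup_instance(shard["host"], shard["port"]):
        results[shard["host"]] = datetime.datetime.now().isoformat()
with open(STATE_FILE, "w") as f:
    json.dump(results, f)                          # the platform reads this file
```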
Finally, I want to point out that sharding the data creates big problems for backup and recovery.
Having more databases after sharding does not matter much by itself, as long as we have a management platform.
The biggest challenge is the consistency of backup time among multiple database servers.
One of the most important requirements of a transactional (OLTP) database is time consistency of the data.
After sharding, we can guarantee consistency within a single database, but consistency across multiple databases is hard to ensure.
For example, user U1's data lives in database D1 and user U2's data in database D2, and there is a transaction spanning U1 and U2 (a money transfer, say). If D1 crashes at that moment, we must keep D1 and D2 consistent at recovery time. That is hard.
In fact, we have not been able to do so.
What can be done is to keep each user's data in a single database as much as possible, and use the order table plus some redundant logs to verify and repair.
These requirements need to be considered at the very beginning of the design.
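What we can sketch is a partial mitigation rather than a solution: restore each shard's full backup, then use mysqlbinlog's real --stop-datetime option to replay every shard's binlog up to one common timestamp. Hosts, paths, and binlog file names below are hypothetical, and cross-shard transactions in flight at the cut-off still need the verification described above:

```python
# pitr_align_sketch.py - replay each shard's binlog up to one common timestamp.
# This bounds cross-shard divergence to transactions in flight at that instant;
# the order table and redundant logs are still needed to verify and repair.
import subprocess

RECOVER_TO = "2019-06-01 03:00:00"               # common cut-off for all shards
SHARDS = ["db%d.example" % i for i in range(1, 17)]

for host in SHARDS:
    # Assumes the full backup is already restored on this shard and the
    # binlog files written since that backup were collected locally.
    binlogs = ["/restore/%s/mysql-bin.000101" % host]   # placeholder file list
    dump = subprocess.run(
        ["mysqlbinlog", "--stop-datetime", RECOVER_TO] + binlogs,
        capture_output=True, check=True)
    # Pipe the replay SQL into the restored instance for this shard.
    subprocess.run(["mysql", "-h", host, "-u", "restore", "-p..."],
                   input=dump.stdout, check=True)
```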
VII. Finding and killing slow queries on the spot
This function is simple, but it is very useful.
Because there are so many shards, when a slow query is hurting performance online we need to find it and kill it immediately. In the past we did this by hand; no matter how practiced we were with scripts, it took a long time.
Now one click on the page checks and kills, which is very convenient.
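The one-click check-and-kill is essentially the following sketch run against every instance; the threshold, hosts, credentials, and protected-user list are placeholders:

```python
# slow_kill_sketch.py - find and kill long-running queries on every instance.
import pymysql

SHARDS = [dict(host="db%d.example" % i, port=3306, user="ops", password="...")
          for i in range(1, 17)]
MAX_SECONDS = 30                                 # anything slower is a candidate
PROTECTED_USERS = {"repl", "backup"}             # never kill these sessions

for shard in SHARDS:
    conn = pymysql.connect(**shard)
    try:
        cur = conn.cursor(pymysql.cursors.DictCursor)
        cur.execute("""SELECT ID AS id, USER AS user, TIME AS time, INFO AS info
                         FROM information_schema.PROCESSLIST
                        WHERE COMMAND = 'Query' AND TIME > %s""", (MAX_SECONDS,))
        for p in cur.fetchall():
            if p["user"] in PROTECTED_USERS or not p["info"]:
                continue
            print("killing %s on %s: %s" % (p["id"], shard["host"], p["info"][:80]))
            cur.execute("KILL %d" % p["id"])     # KILL does not take a bind param
    finally:
        conn.close()
```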
VIII. Data flashback
There are now open source tools that can roll back data-modifying statements, and they are easy to use.
If a statement was executed over a direct connection to MySQL, we can usually pin down exactly what ran via the connection's thread id.
But for SQL that went through the MyCAT middleware, it is hard to recover the exact statement that reached the back end.
In that case we can only pull a rough range from each MySQL by execution time, database, table name, and so on, and then confirm manually. (This function is not finished yet.)
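A sketch of that rough-range search, using mysqlbinlog's --start-datetime/--stop-datetime/--database filters and row-event decoding; the hosts, time window, and binlog file names are placeholders. The actual rollback statements can then be generated with a flashback tool such as binlog2sql:

```python
# flashback_locate_sketch.py - narrow down candidate binlog events per shard
# by time window and database, for manual confirmation before flashback.
import subprocess

WINDOW = ("2019-06-01 10:00:00", "2019-06-01 10:05:00")   # suspected time range
DATABASE = "orders_03"                            # sub-database to search
SHARDS = ["db%d.example" % i for i in range(1, 17)]

for host in SHARDS:
    # Assumes the binlog files covering the window were fetched to /binlogs/.
    out = subprocess.run(
        ["mysqlbinlog",
         "--start-datetime", WINDOW[0],
         "--stop-datetime", WINDOW[1],
         "--database", DATABASE,
         "--base64-output=decode-rows", "-v",     # show row events as pseudo-SQL
         "/binlogs/%s/mysql-bin.000205" % host],  # placeholder binlog file
        capture_output=True, text=True, check=True)
    hits = [l for l in out.stdout.splitlines() if "t_order" in l]
    if hits:
        print(host, len(hits), "candidate rows touching t_order")
```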
There are many other functions we need; quite a few are already on the roadmap. If you are interested, I welcome your suggestions.