In addition to Weibo, there is also WeChat
Please pay attention
WeChat public account
Shulou
2025-01-18 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Database >
Share
Shulou(Shulou.com)06/01 Report--
Introduction
Distributed database has entered a stage of comprehensive and rapid development, which keeps pace with the times and is inseparable from human needs, because the rapid development of the information age leads to more and more data and transaction volume. First of all, this phenomenon leads to storage bottlenecks, because MySQL database, in essence, is still a stand-alone version of the database, and as long as it is a stand-alone database, one of the problems it is bound to encounter is storage, because storage is a hard requirement, and if there is not enough CPU and memory, it is just poor performance and will not directly negate the solution or architecture.
In fact, every company or individual has been working hard to solve the storage problem. There are about three aspects of the solution:
Increase the disk
This method should be the most direct and simple solution, because there is insufficient disk space, of course, adding disks is a handy way to cure the disease. For example, it is now 800G, which can be increased to 2T, which is no problem. If it has now reached 2T, of course, it can still be increased to 5T, but in fact, DBA may have to be sweaty at this time. How to operate and maintain MySQL instances with such a large amount of data? If the data is broken, how to recover it? What about the time cost? The amount of data of 5T is already very frightening. It is estimated that in the major companies in the industry, no DBA wants their own MySQL instances of operation and maintenance to reach this level, right? In fact, I personally think that this is already an unacceptable quantity, and the most suitable one had better be kept below 1T. We have to find a way to go beyond this. Of course, the reason why this amount of data should not reach this size, some people may consider that this is a performance problem, but it is not true, or the performance problem is very small, because InnoDB uses the storage scheme of B+ tree, and in the worst case, small tables do not find data is to find 3 layers, that is, 3 pages of IO, while large tables need 4 pages of IO, which has little impact.
Data compression
In order to reduce the amount of disk space occupied by the data, we usually compress the data, which can be done with a single statement, which is a way supported natively by InnoDB. In general, the data occupation will be reduced to 1/3 to half of the original, and the effect is still obvious enough. However, the performance of the data processed in this way will be degraded, and the business with high response requirements will be reduced. It may need to be carefully considered, but this approach may still address the symptoms rather than the root of the problem. while the amount of data continues to grow, it will still face the same problem after a period of time. In this case, it can no longer be achieved in this way.
Data fragmentation
The solution of data sharding is now widely used in the industry, which has gone beyond MySQL itself, including HBase and Redis. It should be said that this solution is the most scalable and can be called unlimited expansion, while the above two schemes are not scalable at all. Therefore, this scheme has become the mainstream in the industry, and this scheme can be called distributed storage, and specific implementations emerge one after another. Of course, there are excellent distributed solutions and some "pseudo" distributed solutions.
Distributed solution requirements
Expansibility
In fact, the most important thing to use distribution is scalability. If there is not enough space, it is convenient and easy to expand the number of nodes and balance the data. This process should not affect the business use and be transparent to the business.
Support transaction
For the distributed database, for the business itself, the use of the aspect is not much different from the stand-alone, that is, it is transparent to the business, because the use of MySQL supports transactions, then after MySQL is transformed into distributed, the transaction characteristics can not be less, so on the whole, it is necessary to support distributed transactions.
There are no restrictions on SQL statements
The diversity of business requirements leads to a wide range of SQL requirements. For the transparency of the business, if some SQL statements are not supported, then the problem is that, on the one hand, it limits the function and performance of the business program, on the other hand, it leads to the bundling of the business program and the "distributed database".
The performance is good enough
In fact, the use of distributed database is basically for businesses with relatively low performance requirements. Even so, the higher the performance, the more people will choose such a distributed database.
Transparency of metadata change
Metadata changes exist in any database. In the case of a single point, we have a variety of friendly ways to implement table changes, but in a distributed environment, this operation may become a problem. Because data fragmentation leads to changes in metadata that require multiple modifications, many problems come, such as atomicity, data visibility, correctness and so on, so this is the most basic problem.
High availability of the underlying database
It is said that the economic base determines the superstructure, and it is the same in the distributed database. If the underlying database is unstable, or the data replication is delayed, or there is a problem of data inconsistency, then the access correctness of the upper application cannot be guaranteed. So the most basic thing for the underlying database is to ensure data consistency (high availability).
Popular distributed database solutions
Middleware sub-library sub-table (pseudo-distributed)
In the MySQL world, a long-standing topic is: which middleware implementation of sub-database and sub-table scheme is better?
Of course, different people have different understandings of the same problem, and they all have two-sided characteristics. Some people say yes, while others say no. Let's first take a look at how this scheme is realized.
The implementation architecture of MyCat is probably shown in the above figure. In fact, if you only look at the figure, such an architecture is really perfect. Automatic fragmentation, automatic aggregation and distribution balance are all achieved very well. But this is not the case.
Through questions and answers, we can identify the core questions of this approach step by step:
How does MyCat know the principle of data fragmentation, or how does he determine the routing path?
This question can not be seen in the figure. How does MyCat define or determine how a SQL statement is executed, or how does it decide where to fetch data and write to it? To solve such a problem, it needs to be stored somewhere, and it does so by using schema.xml configuration files. Then there will be many problems. After modifying the rules of sub-database and sub-table, how to ensure that the configuration and data are updated synchronously? Even if you do not use the schema.xml configuration file, but use the database storage, there is no way to guarantee the consistency between the change of configuration rules and the migration of data in the data node, which is bound to have an impact on the business.
If you need to expand the node, and you need to do rebalance, how to do it?
Many users basically re-prepare a set of clusters, or manually export and import data to the target node, then manually modify the schema.xml configuration file, and then do a reload operation, so as to achieve rerouting, but this will also lead to the above results. And this process, need to deal with a lot of data, after the completion of a variety of checks, and take up twice the space, so that as long as a DBA once, there may be an impulse to resign.
What is the global table?
MyCat supports a so-called global table, which is used to solve the problem of cross-node data aggregation. The implementation method is to create such a global table on each shard. Its definition is not very modified, and the query table with more frequent queries can be defined as a global table. In this way, as long as this table is used on each shard node, operations such as local query join can be realized, which can solve some problems. But the question is, if there are more fragments (if there are 100), how to ensure data consistency? What is the impact of XA transactions on so many nodes? If there are inconsistencies or access errors, the problem is the wrong data result, which is definitely not what the business wants to see, is it? This is not the most important thing, a database cluster, what is the reason for such a special treatment?
What on earth did MyCat do?
As a middle tier, the job should be to receive SQL requests from the client, then through syntax analysis, according to the read-write principle, then determine a read-write node in a cluster, and then wait for the return of the result set. The middle layer does not need to care about the result set itself, it just needs to send the result set (or exception) back to the client. MyCat does much more than this. After syntax analysis, we do semantic analysis, get the corresponding database table structure, judge the distribution routing rules of the table, find the data in the statement and the columns involved, and then decide which shard to route to. If the routing rules are misconfigured during distribution, or the program miscalculates, it will lead to unpredictable problems in the result of the whole statement. Excuse me, is this the first half of what a middle layer should do? You even have to care about the table structure, primary key, data and other information involved in the statement, which is actually something the database has to do, but it is actually meddling. Again, can you do these things better than a professional database? Let's take a look at the second half. After receiving the result set, in order to deal with the aggregate calculation of some result sets, MyCat needs to parse the result set (binary MySQL protocol flow data) that has already been encapsulated by each node, and then calculate it through the need (this calculation may be very slow, and not all of it can be done). After the calculation is completed, it is packaged into MySQL protocol flow data and transmitted to the client. I would like to ask such a middle layer, has done so many things, how to ensure performance? And these aggregate computing Order By, Group By processing, itself is a matter of the database, in fact, it is still meddling.
Is it a bit difficult to achieve distribution through the transformation of SQL statements?
MyCat, the middle tier, represents a kind of way to claim the use of distributed databases, but this method is actually implemented by writing on SQL statements, getting SQL statements from the client and giving SQL statements to the back-end database, but these two SQL statements have been transformed, of course, this method can only be like this, because the back-end database only receives SQL statements, try to ask, a complex SQL statement query operation Through SQL transformation or rewriting, we can realize the distributed query to different shard databases. Just think about it. Although SQL statements are general and flexible, the logic of extensibility or rewriting is a bit complicated, right? Of course, some people may say, we have a background, ah, at a big deal, just change this statement to the library name and table name, and then leave the rest unchanged to each node to execute, is it all right? yes, you're right.
How to modify the data of different nodes in the same transaction?
This question is what we usually call distributed transactions. What on earth is going on? MyCat is faced with MySQL Server, that is to say, MyCat can only use SQL statements to communicate with MySQL Server, which is limited to the function of MySQL's SQL statements. How many people use MySQL XA itself in distributed implementation? if MyCat implements a cross-node data update, what else can it use without MySQL XA? There is no choice but to rely on one that is not used by many people, and there may be a lot of problems, including performance, the function of Bug, so how to ensure the reliability of the upper MyCat and even the application? Of course, based on these problems, some solutions choose not to use XA, and if some nodes fail, they choose to ignore it and not solve it. Of course, it is also possible to compromise-there is no need for a bottom line.
What is the architecture of the MyCat back-end database, and how to ensure stable, reliable and high availability?
According to some articles, you can choose master-slave replication before, now you can choose Galera Cluster, or you can choose updated MGR. Of course, we have to say that from front to back, it may indeed ensure better reliability, but there is a big problem is that the threshold of Galera is relatively high, and if you encounter problems, few people can solve them. Qunar can be called one of the few companies in the world that use Galera more and use it better), and then MGR itself will have to wait, and it will take a long time. This problem still has to go back to master-slave replication. This is an old problem. It is difficult to guarantee the consistency of master-slave replication. If MyCat uses read-write separation strategy to hit the read from the top, and this happens to be delayed. The consequence of this may be that the calculation result of the whole application is wrong, of course, it can be said that there is a delay check, then the question is, if the delay check, is there another parameter that can be configured? If the delay is more than 100 seconds, check the main database? Yes, but isn't 100 seconds a delay? That can be set to 0, and the 0 you see, do you think it's really 0? In fact, it is still the inferiority of master-slave replication. So the question is still back to the starting point, the economic base determines the superstructure, the foundation is not good, what to do at the upper level?
In the case of too many fragments, how to ensure the minimum loss of performance?
I don't know if MyCat has optimized this question, for example, 10 shards. If the execution of a statement involves these ten shards, then after rewriting the statements on each shard, the corresponding statements will be executed on these ten shards respectively. Is the execution serial or parallel? Serial, the performance is bound to decline more than 10 times, so to do better, then parallel, but the parallel approach is to MyCat this connection, create 10 threads, to deal with the implementation of these ten nodes, then more such connections, MyCat will have a great impact on the system, performance is still not good. Of course, it can also be said that connection pooling is done here, yes, it is possible, but does MyCat do this? What about the performance of this? If there is a timeout, the entire visit fails.
If there is a problem with the configuration file or configuration library, what will happen to the entire cluster?
As mentioned earlier, MyCat uses the sub-database and sub-table policy configured by schema.xml, which is a configuration file. MyCat itself is highly available. If multiple sets are configured, how to solve their synchronization problem? If there is no synchronization (or there is a problem with synchronization, or there is a delay, etc.), if a MyCat fails and the business switches to another MyCat, the situation is failure. Malfunction. Because the data are all messed up. The possible problem is that the wrong location is written, which in turn causes the data of the entire cluster to be written out. It feels worse than being deleted. For the same problem, I feel that MyCat may optimize this point and maybe store it centrally in a database instead, so that centralized management does not need synchronization. It is a good idea, but it is tantamount to putting eggs in a basket. Where will the business go if something goes wrong with this configuration library?
How does DDL work?
This problem may be a matter of concern to everyone. MyCat divides the data into unrelated shard MySQL nodes, so many table change strategies on a single point cannot be used, and DDL is a thing that must be done at the same time, so how is it guaranteed in distribution? According to my research, it seems that people who use MyCat now do it through the method of "starting to update the table structure on each node at the same time". Of course, they have to choose the middle of the night, of course, I personally think it is feasible, because after all, it has been used, and there is no better way to solve this problem. Of course, let's talk about the consequences. If we can't make seamless atomic changes, the impact on the business will not be the slightest bit. There may be a lot of problems that SQL will report that do not exist. If there is a large difference in the completion time between one statement and another statement, the two subtractive time is the failure time.
According to my research, MyCat also realizes the function of automatic failover, is this reliable?
We are now talking about a distributed architecture solution, and this problem is about a MyCat that finds that the back-end database is not connected and will switch automatically, which is very obvious. We want to be distributed. Is the problem considered by a "one" MyCat node really what he thinks it is? In other words, a node cannot guarantee that what he judges or what he sees is true, then there is misswitching in this case, and if other middle-tier nodes are not aware of this situation, or do not receive the switching message in time, there is the problem of multi-point writing, which is terrible. Isn't that a contradiction? What we want is distribution, and as a result, there is an "arbitrary" link, and the reliability has declined a lot. With regard to the problem of distributed monitoring switching, I am deeply impressed by the monitoring of Galera Cluster by mysql-sentinel, so I have to talk about it here.
How is backup done on MyCat? Can you restore a snapshot?
Speaking of backup, as a database user, there should be no one is not clear, no one will think that he is not important, right? Of course, important is important, but this kind of thing is an important or non-urgent work, but without it, we can't even pass the internal audit of the company. maybe every one of us won't want to use him.
Back to the point, what about such an important backup work on MyCat? People who understand some basic principles may know that Xtrabackup and mysqldump can also be used, but after the backup, they may still feel uncertain, because such a tool can only back up one MySQL node. If it is divided into 10 shards (10 MySQL nodes), it can be done by backing up ten times. But it should be noted that ten backup sets are produced after ten backups, not one backup set. There is no relationship between these ten backup sets, so I may ask a more extreme question at this time:
If one day (of course, we do not want to have such a day), the whole computer room is dead, of course, if "lucky" is that there is a backup, then in this case, how to restore a consistent and complete cluster snapshot of available data?
At this point, someone may quickly say that all ten backup sets have been restored and started. But the problem is that these ten do not matter, the backup time is not completed at the same time, the same table of ten slices, the latest data some are 8: 00, some are 9: 00, or some are even yesterday. Can the watch recovered in this way work? Is the query result based on such a table reliable? as one can imagine.
Of course, some people may say that our data does not need consistent snapshots, or even just need to back up metadata routing tables or configuration files, then there is no problem. If MyCat is only used to store Zabbix monitoring data, or log data, you can lose unwanted data, worthless data, then I think there is nothing wrong with it.
Some people may also say that our computer room will not hang up, or our storage is originally cross-server room, and if we hang up, we can just cut it to another computer room. Then there will be a problem again. If there is a replication delay and part of the data is lost during the switch, there will be problems for the whole cluster, because as long as there is a problem with one shard of data, there will be a problem with the whole cluster. Or another problem is, suppose your computer room really won't hang up, but the demand we often encounter is, if I want to take the data from a certain time a few days ago, then I still need to back up and restore a snapshot. What else can you say at this time? can you tell me how to do it?
Some people may have doubts, he will say that we are logical backups, so that the backup is a consistent snapshot of the entire library or table, doesn't that solve the problem? I quite agree with this student that he is absolutely right that the consistency problem can be solved perfectly, but anyone who is familiar with a little bit of MySQL knows how slow a logical backup tool like mysqldump is. Now that distributed storage is used, it must be that the amount of data is very large, so are you still using such logical backups at this time? What do you want, huh? Even if the backup is complete, have you ever tried to recover the logical data? Have you calculated how long it will take to recover a few gigabytes of data? I have a headache when I think about it. There is no turning back.
If you are already using MyCat and find that it is really too risky, how can I drop it?
This question is very good, indicating that you are already thinking about how to ensure the reliability of data and avoid risks as a person in charge of data. The risk of MyCat is indeed high, but if you have been on the "thief ship" and want to get off the boat, I may want to ask (do Zhuge Liang afterwards), why not think about it when you go on this architecture? The company's data is money, you want to go up, want to go down, toss back and forth, can it appreciate? In case the data is written out of order, no one will pay you money at this time, so you might as well go to Shangyun.
But now that we're on, let's talk about how to drop it. At present, I feel that there is only one reliable and surest way, and the steps are as follows:
Stop the business first and stop all the writing business (you don't have to find the time in the middle of the night, because you can't finish it in 12 hours).
Use the mysqldump mentioned above to make a logical backup, export all the libraries and generate .sql files.
Then find a reliable MySQL architecture (a real distributed database, or if the disk is large enough, you can not use distributed, use Share Nothing solution, maybe what you need is not distributed, just fooled), and import .sql files.
Then migrate the read business to the new database architecture.
Migrate the write business to the new database architecture, and then start them, which should be able to provide normal database services at this time.
We can pay attention to this process. How long will it take from step 1 to step 5? Of course, this is the hard time, is to move data, logical backup and recovery are so slow, I think the unit of time can be counted in days.
At this time, the person in charge can think about what I am trying to do with MyCat. The business exemption card has been hung for several days, just to make up for a wrong decision that year and regret it.
Of course, some people may have other ways, but because the fragments are independent of each other, there is still the problem of "consistent snapshot" mentioned above without stopping the business.
Finally, I would like to say that we must be cautious about the company's data and maintain a responsible attitude at all times.
Is the configuration of MyCat complex? Is it easy to get started?
We only need to look at one picture. Can you imagine this is its configuration file? After reading it, I guess you don't have any desire to use it. Many people evaluate it like this after using it:
The configuration file is as follows:
It turned out to be a XML file. What did the product manager think at that time? And then you don't want to do an optimization?
Last question, what do you do well in sub-database and sub-table now?
What else? None. This is a road of no return. Because to put it bluntly, he is a pseudo-distributed scheme, the foundation is not good, the upper layer can not do well, so he is always making up for all kinds of holes, walking very tired, tiring others and himself. Now you can go back and think about why the middleware products made by some powerful and well-known companies did not do these things, such as ProxySQL, Maxscale, MySQL Router and so on. Why? Are their skills not good? Or is there no such need? I still think that there are needs, people to people, business and business needs, is the same, but the solution may be different, they may have thought that this is the wrong path, so they will not choose to take, and MyCat this kind of solution, may have to go back to think about the future.
The practice of the Internet in dealing with large-scale online access to data
The idea of decoupling permeates every aspect of the Internet technology stack. Why do you do this? I think it should be that no one wants to drag their feet, nor do they want to move the whole body. At the MySQL database level, after using a heavyweight middle tier, you will find that Unification looks good, but such a lead is likely to move the whole body, which is actually not a good thing.
MySQL database has sprung up and been used on a large scale in the Internet field, and not a few of them are used in key business, such as accounting, order, billing and so on. In large Internet companies, the use of MySQL must be sub-database and sub-table, through a variety of vertical and horizontal segmentation, turn a database into a pile of databases, that is, the so-called database cluster. However, it is rare to see that when using MySQL, we will set up a heavyweight so-called distributed middle layer, which leads to tight coupling. Contrary to the efficient intermodal transportation of the Internet, the database cluster of the Internet should be physically discrete, and each instance can be controlled and migrated freely, that is, the so-called decoupling.
The benefit of decoupling allows you to deal with each individual instance or cluster freely, making it easy to deal with the variables brought about by the business according to the actual situation, the upgrade of the upgrade, the reduction of the capacity, define different maintenance levels for each business or database of each business, and control it flexibly and randomly.
The benefits of decoupling can improve the absolute performance of the database. The shorter the path of data from business to disk, or from disk to business, the more efficient it will be. Many people use MySQL to distribute SQL with a simple middle tier, which features clear function, simple structure, flexibility and high efficiency, and generally does not lose too much performance, just like the ProxySQL of Maxscale,Percona produced by MySQL Router,MariaDB produced by MySQL, and the Arkproxy of the normalized polar cloud boat in China. Their behavior has pointed out a direction for choosing to use the middle tier to achieve the data architecture.
The benefit of decoupling can make your database only do what the database is good at, it can ensure your data security storage, it can ensure your data access efficiently, it can ensure your data concurrent processing, it can ensure your data access flexibly, isn't that enough?
To sum up, we are once again convinced of the truth that MySQL is simple and efficient, and popular because it is efficient, so don't go astray by listening to the truth.
Of course, if you do not want to do sub-database and sub-table in the business layer to adapt to the architecture of MySQL database, but want to provide business services through a distributed database that is transparent to business, I recommend a real distributed database solution, which can solve the problems of strong storage expansion, distributed computing, transparency to business reading and writing, and friendly failover, which is their advantage and their original intention.
A truly distributed solution
In fact, there is no need to say too much about the true distributed solution, just to meet the requirements described above. And at present, there are more mature programs, the more representative products are Google's Spanner&F1, and domestic database SequoiaDB, TiDB and so on. With regard to the giant sequoia database, I wrote an article before. Interested students can take a look at "[original launch] the practice of SequoiaDB, an open source distributed database compatible with MySQL, in Qunar."
In contrast, this distributed database does not invade the business, MySQL data realizes cloud storage characteristics, is 100% compatible with MySQL, has very good scalability, and naturally supports distributed transactions, data nodes and routing nodes with very small latency, and ensures strong data consistency through consistency algorithms, all of which are based on a correct starting point to build high-rise buildings. It is bound to push the pseudo-distributed database solution based on MyCat into the abyss that no one cares about, until it is eliminated and wiped out.
Summary
MyCat users are actually quite many, now in the understanding of the industry market, I can understand them, because there is a demand, but there is really no solution, choose to use, it is helpless, after all, he is open source, scolded, there is no complaint, because free, some use what can be said? I also recommend you to try it out. Only when you know the pain, will you feel the beauty of the new plan.
The main purpose of the series of questions about MyCat described in this article is to make some conclusions in order to make the students in the industry not to continue to dig the pit. The content is limited to my current understanding and understanding of MyCat. If there are any mistakes or deficiencies, private messages are welcome to correct or add, thank you.
[introduction by the author]
Wang Zhufeng: director of Qunar Database, Executive Director of Open Source Database Committee of China computer Industry Association. Good at database development, database management and maintenance, has been committed to the research and exploration of MySQL database source code, has a deep understanding of the principle and implementation of the database. Used to work in Dameng database, engaged in database kernel development for many years, and then moved to Renren, serving as a senior database engineer. Currently, Qunar is responsible for MySQL source research and operation and maintenance, database management and automated operation and maintenance platform design, development and practice. It is the author of Inception open source project and "MySQL Operation and maintenance Internal reference". It is also one of the few Oracle ACE in MySQL direction in China.
Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.
Views: 0
*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.
Continue with the installation of the previous hadoop.First, install zookooper1. Decompress zookoope
"Every 5-10 years, there's a rare product, a really special, very unusual product that's the most un
© 2024 shulou.com SLNews company. All rights reserved.