Network Security Internet Technology Development Database Servers Mobile Phone Android Software Apple Software Computer Software News IT Information

In addition to Weibo, there is also WeChat

Please pay attention

WeChat public account

Shulou

What should be paid attention to in the sub-database and sub-table of the database

2025-03-31 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Database >

Share

Shulou(Shulou.com)05/31 Report--

This article mainly explains "what should be paid attention to in the sub-database and sub-table of the database". The content of the explanation in the article is simple and clear, and it is easy to learn and understand. let's study and learn "what do you need to pay attention to in the sub-database and sub-table of the database"?

Cut into the hierarchy

Below, the scope is defined in JAVA and MySQL. First of all, let's take a look at the level of sub-library and sub-table.

① coding layer

Create multiple data sources in the same project, use if else, and route directly in the code according to the conditions. There are abstract classes for dynamically switching data sources in Spring, see

AbstractRoutingDataSource .

If the project is not very large, you can divide the library quickly in this way. But the disadvantages are also obvious, requiring a lot of code to take care of each branch. When it comes to cross-database query, aggregation, the need to loop the results and merge the scenario, the workload is huge.

If the project fission, most of this kind of code can not be shared, mostly through copy sharing. In the long run, the code will not be the code.

② frame layer

This situation is suitable for a company with a unified ORM framework, but in many cases it is not realistic. The main purpose is to modify or enhance the function of the existing ORM framework and add some custom primitives or hint to SQL.

By implementing some interceptors (such as Mybatis's Interceptor interface) and adding some custom parsing to control the flow of data, the effect is good, but it will change some of the existing programming experience.

In many cases, the source code of the framework should be modified and is not recommended.

③ driver layer

Based on the shortcomings of cutting in the coding layer and the framework layer, the real database middleware should at least start from the driver layer. What do you mean? It's all about rewriting a JDBC driver, maintaining a route list in memory, and then forwarding the request to the real database connection.

Such as TDDL, ShardingJDBC, etc., are cut in this layer.

Including the Failover protocol of Mysql Connector/J

(specifically refers to "load balancing", "replication", "farbic", etc.)

It is also modified directly on the driver.

The flow of requests generally looks like this:

④ proxy layer

The database middleware of the agent layer disguises itself as a database and accepts links from the business side. Then load the request from the business side, parse or forward it to the real database.

Such as MySQL Router, MyCat, etc., are cut in this layer.

The flow of requests generally looks like this:

⑤ implementation layer

SQL special version supports, for example, Mysql cluster itself supports various features, mariadb galera cluster supports peer-to-peer dual hosts, Greenplum supports sharding, and so on.

The need to change storage, which is usually a solution, is not discussed.

Technology will eventually converge, and it is feasible to choose either one. However, the final selection is affected by many factors, such as developer familiarity, community activity, company fit, official maintenance, expansibility, and the company's existing database products. Choose or develop a suitable one, and your friends will be much happier.

Comparison between driver layer and Agent layer

Through the above level description, it is obvious that we choose or develop middleware and focus on the driver layer and the agent layer. In these two layers, database connections and routing can be more controlled and managed in more detail. But the difference is also obvious.

Characteristics of drive layer

Only JAVA and rich DB are supported.

Driver layer middleware supports only one development language, Java, but supports all back-end relational databases. If your development language is fixed and your back-end data sources are rich, this solution is recommended.

More occupied database connections

Driver layer middleware maintains a lot of database connections. For example, for a table divided into 10 libraries, the Connection in each java maintains 10 database connections. If there are too many projects, there will be a connection explosion (let's calculate that if there are 6 instances of each project, the total number of connections in the connection pool with minIdle equal to 5 is 10 / 6 / 5 / 3 = 900). Databases such as Postgres, where each connection corresponds to a process, can be stressful.

Data aggregation is performed in the business instance

Data aggregation, such as count sum, is done through multiple queries and then aggregated in the memory of the business instance.

The routing table exists in the memory of the business instance, and can be updated by polling or passive notification.

Centralized management

The configuration management of all clusters is centralized in one place, and DBA can complete the relevant operations with a small burden on operation and maintenance.

Typical implementation

Characteristics of agent layer

Heterogeneous support, limited support for DB

Proxy layer middleware is just the opposite. Only one back-end relational database is supported, but multiple development languages are supported. This solution is recommended if your system is heterogeneous and all have the same SLA requirements.

Heavy burden on operation and maintenance

The proxy layer needs to maintain a limited number of database connections (except for sticky connections like MySQL Router). But as a stand-alone service, you have to consider both separate deployment and high availability, which will add a lot of additional nodes, not to mention companies that use shadow nodes.

In addition, the proxy layer is the only entrance to the request, and the stability requirement is extremely high. Once a memory-intensive aggregate query crashes the node, it is a catastrophic accident.

Typical implementation

Common ground

Space is limited, not too much discussion. Visit the middleware promotion page, you can see a long Feature list, that is, a whitelist; you can also see a long list of restrictions, that is, a blacklist. Limit how you play, after enhancing the distributed ability, the sub-library sub-table itself is a castrated database.

Use restriction

Make sure that the data in the database is balanced and split as much as possible. For example, if the user database is not evenly divided by province, it will be more uniform if you choose the model by userid.

No deep paging without a split key, all the data before the number of pages fetched by the library will be sorted in memory. It is easy to cause memory overflow.

Reducing the subquery of the subquery will cause SQL parsing disorder and parsing errors, and minimize the subquery of SQL.

The principle of transaction minimization is to minimize the scope of libraries involved in stand-alone transactions, that is, to minimize boast library operations and divide the libraries / tables of similar operations together.

According to the principle of data balance, split the data of the database as evenly as possible, for example, dividing the user database by province is not uniform, and taking the model according to userid will be more uniform.

Special functions such as distinct, having, union, in, or, etc., are generally not supported. Or be supported, after use will increase the risk, need to be modified.

Product

It is recommended to focus on MyCat and ShardingJDBC. In addition, there are a large number of other middleware, do not move if you are not familiar with it.

Database middleware is difficult to maintain, and you will find a large number of half-dead projects.

The following list, in no particular order, has only HA functions, not split functions:

Atlas 、 Kingshard 、 DBProxy 、 mysql router 、 MaxScale 、 58 Oceanus 、 ArkProxy 、 Ctrip DAL 、 Tsharding 、 Youtube vitess 、 NetEase DDB 、 Heisenberg 、 proxysql 、 Mango 、 DDAL 、 Datahekr 、 MTAtlas 、 MTDDL 、 Zebra 、 Cobar 、 Cobar

Khan, almost every big factory has its own database middleware (and found a few who like to use open source components and company prefixes as products), but they just don't give it to us.

Process solution

No matter which level is cut into the sub-database and sub-table, it is faced with the following work process.

Information collection

Business and projects affected by statistics

The larger the scope of the project, the more difficult it is to divide the library. Sometimes, a complex SQL can involve four or five business parties, and this kind of SQL needs to be focused on.

Determine whether the size of the sub-database is divided into only a few of these tables, or all of them are involved. The more points, the greater the workload, which is almost linear.

There are also some projects that start the whole body. For example, in the following process, the link affected is more than just a sub-library.

Identify participants

In addition to the technical support staff of the sub-library and sub-table components, the people who should be most involved are the people who are most familiar with the system and the existing code. Only they can determine which SQL should be scrapped, the impact of SQL, and so on.

Determine the strategy of subdatabase and table

Determine the dimensions and sharding keys of the sub-library and sub-table. Once the syncopation key (that is, the column of routed data) is determined, it is not allowed to be modified, so in the early architecture design, it should be established first before the follow-up work can be carried out; data dimensions mostly mean that there are different sharding keys to achieve the effect of query under different conditions. This involves data redundancy (multi-write, data synchronization), which will be more complex.

Preparation in advance

Data normalization

The structure of the database table does not meet the demand and needs to be organized in advance. For example, the field names or types of syncopation keys are different. When implementing the strategy of sub-database and sub-table, these personalities will cause the strategy to be too large and difficult to maintain.

Scan all SQL

Scan all the SQL in the project and determine whether it can work according to the syncopation key one by one.

In the judgment process, there must be a large number of non-compliant SQL, then need to give a transformation plan, which is one of the main workload.

Verification tool support

It is feasible to make changes and verification directly on the original project, but it will encounter many problems, mainly because the efficiency is too low. I tend to design some verification tools first, enter the SQL or list to be verified, and then print routing information and results to determine.

Technical preparation

It is recommended that each of the points mentioned below, find an example to experience, and then estimate the difficulty according to your own team.

The following:

All unsupported SQL types of middleware

Sort out the precautions that can easily lead to collapse.

Unsupported SQL gives the processing method

Consider a general primary key generator

Consider what to do with SQL without syncopation keys

Consider how to traverse the whole library such as scheduled tasks.

Consider how to transform cross-database and cross-table queries

Prepare some toolsets

Implementation stage

Data migration

Sub-database and sub-table will re-affect the distribution of data, whether full or incremental, will involve data migration, so Databus is necessary.

An ideal state is that all additions, deletions and corrections are messages and can be double-written by subscribing to MQ.

In general, however, you still need to simulate this state, such as using Canal components.

How to ensure the safe switching of data, we discuss it in other chapters.

Adequate testing

The sub-library and sub-table must be adequately tested, and every SQL sentence must be strictly verified. If there are unit testing or automated testing tools, complete coverage is necessary. Once the data is misrouted, especially the addition, deletion and modification, it will create a lot of trouble.

In the test phase, the verification process is output to a separate log file, and after sufficient testing, the review log file has the wrong data flow.

SQL retest

It is strongly recommended to conduct a unified SQL retest. It is mainly based on the functional description to determine the correctness of SQL, that is, commonly known as review.

Exercise

The program has been rehearsed many times in the non-online environment to ensure that it is foolproof.

Develop a new SQL specification

After the sub-database and sub-table, the SQL in the project has been shackled and cannot be written at will. Many operations that are normally supported may not work in a split environment. So before launching, the SQL involved should have a validation process, even if it has been adequately tested.

Thank you for your reading. the above is the content of "what should be paid attention to in the sub-database and sub-table of the database". After the study of this article, I believe you have a deeper understanding of what you need to pay attention to in the sub-database and sub-table of the database. The specific use of the situation also needs to be verified by practice. Here is, the editor will push for you more related knowledge points of the article, welcome to follow!

Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.

Views: 0

*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.

Share To

Database

Wechat

© 2024 shulou.com SLNews company. All rights reserved.

12
Report