Network Security Internet Technology Development Database Servers Mobile Phone Android Software Apple Software Computer Software News IT Information

In addition to Weibo, there is also WeChat

Please pay attention

WeChat public account

Shulou

How to split data in MySQL Table and Database

2025-02-25 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Development >

Share

Shulou(Shulou.com)06/03 Report--

This article mainly explains "MySQL sub-table sub-database how to carry out data segmentation", the article explains the content is simple and clear, easy to learn and understand, the following please follow the editor's ideas slowly in-depth, together to study and learn "MySQL sub-table sub-database how to carry out data segmentation" bar!

The distributed core content of database is nothing more than data segmentation (Sharding) and the location and integration of data after segmentation. Data segmentation is to store data in multiple databases to reduce the amount of data in a single database, and to alleviate the performance problems of a single database by expanding the number of hosts, so as to achieve the purpose of improving the operational performance of the database.

According to its segmentation type, data segmentation can be divided into two ways: vertical (vertical) segmentation and horizontal (horizontal) segmentation.

1. Vertical (longitudinal) syncopation

There are two kinds of vertical segmentation: vertical sub-database and vertical sub-table.

1.1 Vertical sub-database is to store different tables with low correlation in different databases according to business coupling. The practice is similar to the split of a large system into a number of small systems, which are divided independently according to the business classification. Similar to the practice of "microservice governance", each microservice uses a separate database. As shown in the figure:

The data tables of different modules are stored in sub-libraries. Non-interrelated query between modules

1.2 Vertical subtable is based on the "columns" in the database. If there are more fields in a table, you can create a new extended table and split the fields that are not frequently used or with larger field lengths into the extended table. In the case of a large number of fields (for example, a large table has more than 100 fields), it is easier to develop and maintain by "splitting the small table", and the spread problem can be avoided. The bottom layer of MySQL is stored through data pages, and a record takes up too much space to lead to spread pages, resulting in additional performance overhead. In addition, the database loads the data into memory in behavior units, so that the length of the fields in the table is shorter and the access frequency is higher, the memory can load more data, the hit rate is higher, and the disk IO is reduced, thus improving the performance of the database.

Solve the coupling at the business system level, the business is clear

Similar to the governance of micro-services, it can also manage, maintain, monitor and expand the data of different businesses.

In high concurrency scenarios, vertical sharding can improve the bottleneck of IO, database connections and stand-alone hardware resources to a certain extent.

Disadvantages:

Some tables cannot be join, so they can only be solved by interface aggregation, which increases the complexity of development.

Distributed transaction processing is complex

There is still the problem of excessive amount of data in a single table (horizontal segmentation is required)

two。 Horizontal (horizontal) segmentation

When it is difficult for an application to split vertically with fine granularity, or when there are a large number of rows of data after sharding, and there is a bottleneck in reading, writing and storage performance of a single database, horizontal sharding is needed.

Horizontal segmentation is divided into intra-database sub-table and sub-database sub-table, according to the inherent logical relationship of the data in the table, the same table is distributed to multiple databases or tables according to different conditions, and each table contains only part of the data. as a result, the amount of data in a single table becomes smaller and achieves the effect of distribution. As shown in the figure:

Relative vertical segmentation is the practice of classifying tables, which divides the data into different databases (or different tables) according to a certain rule for each field in the table, that is, to split the data according to several rows.

Advantages of horizontal slicing:

There is no performance bottleneck of excessive data volume and high concurrency in a single database, which can improve the stability and load capacity of the system.

The transformation of the application side is small, and there is no need to split the business module.

Disadvantages:

Transaction consistency across fragments is difficult to guarantee

Poor performance of join associative queries across libraries

It is difficult to expand the data many times and has a great amount of maintenance.

After horizontal sharding, the same table appears in multiple databases / tables, each with different contents. Several typical data slicing rules are:

2.1 based on the range of values

Divide according to time interval or ID interval. For example, data from different months or even days are distributed to different databases by date; records with userId of 1: 9999 are assigned to the first database, records of 10, 000 to 20, 000 are assigned to the second database, and so on. In a sense, the "hot and cold data separation" used in some systems migrates some less used historical data to other libraries, and the business function only provides queries for hot data, which is a similar practice.

The advantages are as follows:

The size of single table can be controlled.

Naturally, it is easy to scale horizontally. If you want to expand the capacity of the entire shard cluster later, you only need to add nodes, and there is no need to migrate other shard data.

When using fragment fields for range search, continuous fragments can quickly locate fragments for fast query, which can effectively avoid the problem of cross-fragment query.

Disadvantages:

Hot data has become a performance bottleneck. Continuous sharding may have data hotspots, such as slicing according to time field, some fragments store data in the most recent time period, which may be read and written frequently, while some historical data stored in fragments are rarely queried.

2.2 Modulus based on numerical value

Hash is generally used to take the module mod segmentation method, for example: the Customer table is divided into 4 libraries according to the cusno field, the remainder of 0 is put into the first library, the remainder of 1 is put into the second library, and so on. In this way, the data of the same user will be distributed to the same database. If the query condition has a cusno field, it can be explicitly located to the corresponding database to query.

Advantages:

Data slicing is relatively uniform and is not prone to hot spots and concurrent access bottlenecks.

Disadvantages:

When the sharding cluster is expanded in the later stage, the old data needs to be migrated (this problem can be avoided by using consistent hash algorithm)

It is easy to face the complex problem of cross-fragment query. For example, in the above example, if cusno is not included in the frequently used query condition, the database cannot be located, so it is necessary to launch a query to four libraries at the same time, merge the data in memory, and return the minimum set to the application.

Thank you for your reading, the above is the content of "MySQL sub-table sub-library how to carry out data segmentation", after the study of this article, I believe you have a deeper understanding of the problem of MySQL sub-table sub-database how to carry out data segmentation, and the specific use needs to be verified in practice. Here is, the editor will push for you more related knowledge points of the article, welcome to follow!

Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.

Views: 0

*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.

Share To

Development

Wechat

© 2024 shulou.com SLNews company. All rights reserved.

12
Report