The usage of MySQL sub-library and sub-table

2025-04-17 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)06/01 Report--


Splitting databases and tables in MySQL

1 What is database and table splitting?

Data that was originally stored in one database is spread across multiple databases, and data that was originally stored in one table is spread across multiple tables.

2 Why is it necessary to split databases and tables?

The amount of data in a database is not necessarily controllable. Without splitting, as time passes and the business grows, the database accumulates more and more tables and each table holds more and more data, so inserts, deletes, updates, and queries become increasingly expensive. In addition, because a single database cannot be deployed in a distributed way and the resources of one server (CPU, disk, memory, IO, etc.) are limited, the data volume and processing capacity the database can carry will eventually hit a bottleneck.

3 Implementation strategies for splitting databases and tables

There are two ways to split databases and tables: vertical segmentation and horizontal segmentation.

3.1 Vertical segmentation

Vertical segmentation divides tables by functional module and by how closely they are related, and deploys each group to a different database. For example, we set up a definition database workDB, a commodity database payDB, a user database userDB, a log database logDB, and so on, which store the project data definition tables, commodity definition tables, user data tables, log data tables, and so on.

3.2 Horizontal segmentation

When a single table holds too much data, we can divide its rows according to some rule, such as a hash of userID, and store them in multiple tables with the same structure, spread over different databases. For example, if the user data tables in userDB are all very large, we can split userDB into multiple databases with the same structure (part0DB, part1DB, etc.), split the user data table userTable into many tables (userTable0, userTable1, etc.), and then distribute these tables over the databases according to the rule.
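A routing rule of this kind can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the counts (4 databases, 8 tables per database), the plain modulo rule, and the function name route_user are hypothetical, since the article does not give the actual rule used.

```python
from typing import Tuple

DB_COUNT = 4         # assumed number of databases: part0DB .. part3DB
TABLES_PER_DB = 8    # assumed number of userTable shards per database


def route_user(user_id: int) -> Tuple[str, str]:
    """Map a userID to the (database, table) that stores its rows."""
    # A plain modulo spreads integer IDs evenly; for string keys a stable
    # hash such as zlib.crc32 would be applied first.
    slot = user_id % (DB_COUNT * TABLES_PER_DB)
    db_index = slot // TABLES_PER_DB
    return f"part{db_index}DB", f"userTable{slot}"


print(route_user(42))  # ('part1DB', 'userTable10')
```

Every read and write for a user goes through the same function, so the application always knows which database and table to address.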

3.3 Choosing a method

Which method to use depends on where the data-volume bottleneck of the database lies and on the business type of the project.

If the database holds a lot of data because it has too many tables, and the project's business logic is clearly divided and loosely coupled, then vertical segmentation, which is simple and easy to implement, is the first choice.

However, if there are not many tables but a single table holds a very large amount of data, or the data is very hot, horizontal segmentation should be chosen. Horizontal segmentation is more complicated than vertical segmentation: it physically divides data that logically belongs together, so besides evaluating the granularity of the split you also have to balance data and load across the pieces, and later on it places an additional data management burden on project staff and applications.

In real projects both situations often occur together, which calls for tradeoffs and sometimes for using vertical and horizontal segmentation together. Our game project combines the two: we first split the database vertically, and then split some tables, usually the user data tables, horizontally.

4 Problems with splitting databases and tables

4.1 Transaction issues

After splitting, database transactions become hard to manage because the data lives in different databases. Relying on the database's own distributed transaction management costs a lot of performance; having the application coordinate the work as a logical transaction adds a programming burden.

4.2 Cross-database and cross-table joins

After splitting, it is hard to avoid placing logically closely related data in different tables and different databases. Table joins are then restricted: we cannot join tables that live in different databases, nor tables split at different granularities. As a result, work that used to take one query may need several.
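The "multiple queries" pattern can be sketched as follows. Two in-memory SQLite databases stand in for two separate MySQL databases, and the table and column names (users, orders, user_id, amount) are hypothetical, chosen only for illustration.

```python
import sqlite3

# Two in-memory SQLite databases stand in for two separate MySQL databases.
user_db = sqlite3.connect(":memory:")
order_db = sqlite3.connect(":memory:")

user_db.execute("CREATE TABLE users (user_id INTEGER PRIMARY KEY, name TEXT)")
user_db.executemany("INSERT INTO users VALUES (?, ?)", [(1, "alice"), (2, "bob")])

order_db.execute(
    "CREATE TABLE orders (order_id INTEGER PRIMARY KEY, user_id INTEGER, amount INTEGER)"
)
order_db.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)", [(10, 1, 30), (11, 2, 50), (12, 1, 20)]
)


def orders_with_user_names():
    """Emulate `orders JOIN users` with two queries and an in-memory lookup."""
    # Query 1: fetch the orders from the order database.
    orders = order_db.execute(
        "SELECT order_id, user_id, amount FROM orders ORDER BY order_id"
    ).fetchall()
    if not orders:
        return []
    # Query 2: fetch only the users referenced by those orders.
    user_ids = sorted({user_id for _, user_id, _ in orders})
    placeholders = ",".join("?" * len(user_ids))
    names = dict(
        user_db.execute(
            f"SELECT user_id, name FROM users WHERE user_id IN ({placeholders})",
            user_ids,
        ).fetchall()
    )
    # The join itself happens in the application.
    return [(oid, names.get(uid), amount) for oid, uid, amount in orders]


print(orders_with_user_names())
# [(10, 'alice', 30), (11, 'bob', 50), (12, 'alice', 20)]
```

The second query is restricted to the IDs actually needed, which keeps the extra round trip small.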

4.3 Additional data management burden and computation pressure

The most obvious extra data management burdens are locating the data and repeating inserts, deletes, updates, and queries across the split tables. Applications can handle these, but only at the cost of extra logic. For example, for a user data table userTable that records users' scores, suppose the business needs the top 100 users. Before splitting, a single ORDER BY statement is enough. After splitting, you need n ORDER BY statements to get the top 100 rows of each sub-table, and then have to merge the results.
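The per-table query followed by a merge can be sketched like this. In-memory SQLite tables stand in for the sub-tables, and the column names user_id and score are assumptions made for the example.

```python
import heapq
import sqlite3


def make_shard(rows):
    """Create an in-memory SQLite table standing in for one userTable sub-table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE userTable (user_id INTEGER PRIMARY KEY, score INTEGER)")
    conn.executemany("INSERT INTO userTable VALUES (?, ?)", rows)
    return conn


def top_n(shards, n):
    """Run one ORDER BY ... LIMIT n per sub-table, then merge to the global top n."""
    candidates = []
    for conn in shards:
        candidates.extend(
            conn.execute(
                "SELECT user_id, score FROM userTable ORDER BY score DESC LIMIT ?",
                (n,),
            ).fetchall()
        )
    # Each sub-table contributes at most n rows, so the merge handles
    # at most n * len(shards) candidates.
    return heapq.nlargest(n, candidates, key=lambda row: row[1])


shards = [
    make_shard([(1, 50), (2, 90)]),
    make_shard([(3, 70), (4, 10)]),
    make_shard([(5, 80)]),
]
print(top_n(shards, 3))  # [(2, 90), (5, 80), (3, 70)]
```

Taking the top n from every sub-table is necessary because, in the worst case, all n global winners sit in the same sub-table.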

The following are further problems caused by splitting databases and tables, and points that need attention.

The problem of the splitting dimension

Suppose a user buys a commodity and the transaction record must be saved. If tables are split by the user dimension, all of a user's transaction records are in the same table, so looking up one user's purchases is easy; but the purchases of one commodity are likely to be spread over many tables, which makes them harder to find. Conversely, if tables are split by the commodity dimension, it is easy to find the purchases of a commodity but harder to find a buyer's transaction records.

So the common solutions are:

a. Scan all the tables. This is basically unworkable because the efficiency is too low.

b. Record two copies of the data, one split by the user dimension and the other by the commodity dimension.

c. Use a search engine. If the real-time requirement is very high, however, this in turn requires real-time search.

The problem of federated queries

A federated (join) query is almost impossible, because the associated tables may not be in the same database.

Avoid cross-library transactions

Avoid modifying tables in db0 and tables in db1 within the same transaction: the operation is more complex, and efficiency suffers. Try to keep each group of related data on the same DB server.

For example, put the merchandise and transaction information of seller A into db0. Then when db1 goes down, everything related to seller A still works normally. In other words, avoid making data in one database depend on data in another database.
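One way to keep a seller's data together is to route every seller-owned table by the same key. The sketch below assumes two databases (db0 and db1), integer seller IDs, and hypothetical table names goods and trades; none of these details come from the article.

```python
DB_COUNT = 2  # assumed: db0 and db1


def db_for_seller(seller_id: int) -> str:
    """All tables that belong to one seller are routed by the same key."""
    return f"db{seller_id % DB_COUNT}"


def locate(table: str, seller_id: int) -> str:
    """Return the database-qualified name of a seller's table."""
    return f"{db_for_seller(seller_id)}.{table}"


# Seller 4's goods and trades land in the same database, so a transaction
# touching both stays local to db0 and does not depend on db1.
print(locate("goods", 4), locate("trades", 4))  # db0.goods db0.trades
```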

One master, multiple slaves

In practice, reads usually far outnumber writes. MySQL replication supports read-write separation: all write operations must go to the Master, while read operations can be served by the Master or by Slave machines. A Slave has exactly the same structure as the Master. One Master can have multiple Slaves, and a Slave can even have Slaves of its own. In this way, the QPS of the DB cluster can be effectively improved.

All writes are applied on the Master first and then replicated to the Slaves, so there is some delay before a Slave catches up with the Master. When the system is very busy the delay gets worse, and adding more Slave machines also aggravates the problem.
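On the application side, read-write separation comes down to choosing a server per statement. The sketch below is a hypothetical router, not a MySQL API: the class name, the host strings, and the force_master flag are assumptions. The flag is one way to handle reads that cannot tolerate replication delay.

```python
import random


class ReadWriteRouter:
    """Send writes to the Master and spread plain reads over the Slaves."""

    def __init__(self, master, slaves):
        self.master = master
        self.slaves = list(slaves)

    def pick(self, sql, force_master=False):
        """Return the server that should execute the given SQL statement."""
        words = sql.split()
        verb = words[0].upper() if words else ""
        # Only plain SELECTs may go to a Slave; writes, DDL, and reads that
        # must see the latest data go to the Master.
        if verb == "SELECT" and self.slaves and not force_master:
            return random.choice(self.slaves)
        return self.master


router = ReadWriteRouter("master:3306", ["slave1:3306", "slave2:3306"])
print(router.pick("UPDATE userTable SET score = 1 WHERE user_id = 7"))  # master:3306
```

A read issued right after a write by the same user is a typical case for force_master=True, because the Slave may not have received the change yet.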

In addition, the Master is the bottleneck of the cluster. Too many writes seriously affect the stability of the Master, and if the Master goes down, the whole cluster stops working properly.

So: 1. When read pressure is high, consider adding Slave machines; once the number of Slave machines reaches a certain point, you have to consider splitting the database. 2. When write pressure is high, you must split the database.

Why is it necessary to split databases and tables when using MySQL?

It is fair to say that anyone who uses MySQL with a large amount of data soon runs into one question: splitting databases and tables.

This raises a question: why split at all? Can't MySQL handle large tables?

In fact, it can handle large tables. In projects I have worked on, a single table's physical file was more than 80 GB and held more than 500 million records, and it was a very important table: the friend relationship table.

But this is not the best way to do it. File systems such as Ext3 have many problems handling very large files. That layer can be addressed by switching to the xfs file system. However, one problem cannot be solved when a single MySQL table is too large: operations that adjust the table structure become basically impossible. Therefore, large projects end up having to split databases and tables.

As far as InnoDB itself is concerned, there are only two locks on the B-tree of the data file, the leaf node lock and the child node lock. As you can imagine, when a page splits or a new leaf is added, data cannot be written to the table.

So splitting databases and tables is the better choice.

So how large should each split table be?

In our tests, read and write performance was good with up to 10 million records in a single table. Leaving some buffer, keep a table that has only numeric columns below 8 million records and a table that has character columns below 5 million.

If you plan for 100 databases with 100 tables each, for example for the user business:

5 million × 100 × 100 = 50,000,000,000 = 50 billion records.
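The estimate is a plain product, which can be checked and reused for planning. In the sketch below the constants restate the figures above; the helper tables_needed is a hypothetical addition for working backwards from an expected row count.

```python
ROWS_PER_TABLE = 5_000_000   # per-table limit for tables with character columns
TABLES_PER_DB = 100
DB_COUNT = 100

total_rows = ROWS_PER_TABLE * TABLES_PER_DB * DB_COUNT
print(f"{total_rows:,}")  # 50,000,000,000


def tables_needed(expected_rows: int, rows_per_table: int = ROWS_PER_TABLE) -> int:
    """Smallest number of tables that keeps every table under the limit."""
    return -(-expected_rows // rows_per_table)  # ceiling division


print(tables_needed(12_000_000))  # 3
```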

With a number in mind, it is relatively easy to plan according to the business.
