What are the advantages and disadvantages of splitting databases and tables

Updated 2025-01-17. From: SLTechnology News&Howtos

Shulou(Shulou.com)06/01 Report--

This article looks at the advantages and disadvantages of splitting a database into multiple databases and tables: why it is done, the terms involved, how a system typically evolves, and the new problems the split introduces.

Why do we need to split databases and tables?

When a table holds tens of millions of rows, a single query takes a long time. If a join is involved as well, the query may hang and can even bring down the system.

The purpose of splitting databases and tables is to reduce the load on each database, improve its efficiency, and shorten query time. In addition, because the split can be carried out in a controlled way and the underlying layer is still an RDBMS, the existing operations and maintenance tooling and related infrastructure can be reused.

At present, our system holds nearly 2 billion rows. The largest table has close to 6 million rows (600w), each row is about 3 KB, and each table holds nearly 1.5 GB of data. Queries time out frequently, and a single count(*) query has taken as long as 260 ms (0.26 s). By our standard, anything over 0.1 s counts as slow SQL.

To see why splitting helps, let's look at how a SQL statement is executed.

MySQL processes a SQL statement as follows:

1. Receive the SQL

2. Put the SQL in the execution queue

3. Execute the SQL

4. Return the result

The most time-consuming parts of this process are:

1. Time spent waiting in the queue

2. SQL execution time

If two SQL statements try to modify the same row of the same table at the same time, MySQL handles it in one of two ways: table locking (MyISAM storage engine) or row locking (InnoDB storage engine).

With a table lock, no other operation can touch the table until the current operation on it finishes. A row lock works the same way at row level: other statements must wait until the current operation on that row finishes before they can touch it.

If there is too much data, each statement takes longer to execute and the others wait longer, which is why we split tables.

Terms used in splitting databases and tables:

Read/write separation: several databases hold synchronized copies of the same data, and each is responsible only for reads or only for writes;

Partitioning: a partitioning column expression splits records into different partitions (which must be on the same server but can be on different disks); to the application it is still a single table, so nothing changes;

Sub-database: the tables of one system are stored across multiple database instances;

Sub-table: a two-dimensional table with rows (records) and columns (fields) can be split in two ways:

Vertical sub-table: split by column; different sub-tables store different fields, so fields that are rarely used, large, or belong to a different business area can be moved out;

Horizontal sub-table (the most complex): split by row; a sharding algorithm decides which sub-table stores each record.
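The difference between the two kinds of sub-table can be sketched in a few lines of Python. This is only an illustration on in-memory rows: the field names, the choice of "hot" fields, and the modulo rule are all assumptions, not something the article prescribes.

```python
# Hypothetical order rows; the field names are assumptions for illustration.
orders = [
    {"id": 1, "user_id": 10, "amount": 25.0, "remark": "gift wrap"},
    {"id": 2, "user_id": 11, "amount": 40.0, "remark": ""},
    {"id": 3, "user_id": 10, "amount": 12.5, "remark": "leave at door"},
    {"id": 4, "user_id": 12, "amount": 99.0, "remark": ""},
]


def vertical_split(rows, hot_fields, key="id"):
    """Split by column: frequently used fields in one table, the rest in another.

    Both resulting tables keep the key so the halves can be matched up again.
    """
    hot = [{f: r[f] for f in (key, *hot_fields)} for r in rows]
    cold = [
        {f: v for f, v in r.items() if f == key or f not in hot_fields}
        for r in rows
    ]
    return hot, cold


def horizontal_split(rows, table_count, key="id"):
    """Split by row: every sub-table has all columns but only some records."""
    tables = [[] for _ in range(table_count)]
    for r in rows:
        tables[r[key] % table_count].append(r)
    return tables


order_main, order_ext = vertical_split(orders, hot_fields=("user_id", "amount"))
order_shards = horizontal_split(orders, table_count=2)
```

After the vertical split, `order_main` and `order_ext` have the same number of rows but different columns. After the horizontal split, each entry of `order_shards` has the full set of columns but only part of the rows.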

In production, the usual evolution is: single database, single table -> single database, multiple tables -> multiple databases, multiple tables. In terms of techniques: partitioning -> sub-tables -> sub-databases (vertical sub-databases, horizontal sub-databases, read/write separation).

Single database, single table

A single database with a single table is the most common design. For example, an order table (order) lives in one database, and every order can be found in that table.

Single database, multiple tables

As the number of orders grows, so does the order table. Once it reaches a certain size, queries against it slow down and drag down the performance of the whole database.

In addition, as requirements change, adding a column can make MySQL lock the table (depending on the MySQL version and the kind of change), and all reads and writes can only wait until it finishes.

At that point you can split the order table horizontally into multiple tables with exactly the same structure, for example order_01, order_02, ..., order_n. Together, order_01 + order_02 + ... + order_n hold the complete order data.

Simple ways to do this horizontal split include:

By quantity: rows 1 ~ 1000 go in the first table, rows 1001 ~ 2000 in the second;

By time: for example, the first table holds January 2019 and the second holds February 2019. You can also split by the hash value of the id, and so on.
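The three splitting rules above can each be written as a small routing function that maps a row to a table name. The sketch below is illustrative only: the function names, the `order_YYYYMM` naming for monthly tables, and the use of CRC32 as the hash are assumptions made for the example.

```python
import zlib
from datetime import date


def table_by_range(order_id, rows_per_table=1000):
    """By quantity: ids 1..1000 -> order_01, 1001..2000 -> order_02, and so on."""
    return f"order_{(order_id - 1) // rows_per_table + 1:02d}"


def table_by_month(created):
    """By time: one table per calendar month, e.g. order_201901."""
    return f"order_{created:%Y%m}"


def table_by_hash(order_id, table_count=8):
    """By hash: a stable hash of the id, modulo the number of tables.

    CRC32 is used because it gives the same value on every run and machine,
    unlike Python's built-in hash() for strings.
    """
    h = zlib.crc32(str(order_id).encode("utf-8"))
    return f"order_{h % table_count + 1:02d}"
```

Range and time rules keep neighbouring rows together, which suits range scans and archiving but tends to concentrate new writes in the latest table. A hash rule spreads writes evenly but makes it harder to add tables later.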

Multiple databases, multiple tables

As data volume grows, the storage of a single database server runs out, and as query volume grows, a single server can no longer carry the load. At that point the database itself has to be split horizontally.

For example, split by region, with each province in its own physical database.
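Combined with table-level splitting, routing then becomes a two-step lookup: pick the database, then pick the table inside it. A minimal sketch, assuming a hypothetical province-to-database mapping and a simple modulo rule for the table (the database names and `tables_per_db` are made up for the example):

```python
# Hypothetical province-to-database mapping; the names are illustrative only.
PROVINCE_TO_DB = {
    "Guangdong": "order_db_guangdong",
    "Zhejiang": "order_db_zhejiang",
}
DEFAULT_DB = "order_db_other"


def route(province, order_id, tables_per_db=4):
    """Return (database, table) for an order: database by region, table by id."""
    db = PROVINCE_TO_DB.get(province, DEFAULT_DB)
    table = f"order_{order_id % tables_per_db + 1:02d}"
    return db, table
```

A fallback database for unmapped provinces is one possible design choice. Rejecting unknown regions outright would be another.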

New problems introduced by splitting databases and tables

1. Distributed transactions

After splitting databases vertically or horizontally, SQL inevitably has to run across databases, which leads to a long-standing problem in the Internet industry: distributed transactions. So how do we solve it?

Use distributed transaction middleware;

Use MySQL's built-in cross-database transaction scheme (XA), although performance is about 10 times slower than with a single database;

Avoid cross-database operations where possible (for example, by putting users and goods in the same database).
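XA transactions follow the two-phase commit pattern: every participating database first prepares, and only if all of them succeed does the coordinator tell them to commit. The toy simulation below shows that control flow only. The `Participant` class and `two_phase_commit` function are hypothetical in-memory stand-ins, not MySQL's XA statements or any middleware API.

```python
class Participant:
    """In-memory stand-in for one database taking part in a transaction."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "idle"

    def prepare(self):
        # Phase 1: promise to commit, or vote no.
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self):
        # Phase 2, success path.
        self.state = "committed"

    def rollback(self):
        # Phase 2, failure path.
        self.state = "rolled_back"


def two_phase_commit(participants):
    """Commit everywhere only if every participant votes yes in phase 1."""
    votes = [p.prepare() for p in participants]
    if all(votes):
        for p in participants:
            p.commit()
        return True
    for p in participants:
        p.rollback()
    return False
```

Each transaction now needs at least two round trips to every database, and locks are held until the second phase completes, which is one plausible reason for the slowdown mentioned above.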

2. Cross-database joins

After the split, joins between tables are restricted: tables in different databases, or tables split at different granularity, cannot be joined. A task that used to take one query may now take several.

So how do we solve it?

Simple solutions:

Global tables: basic reference data is copied to every database.

Field redundancy: copy the fields that would otherwise be joined into each table, so some queries no longer need a join.

Application-layer assembly: the application queries each database separately for the rows that match the conditions and then assembles the results itself, similar to a MapReduce process (more complex).
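Application-layer assembly can be sketched with two in-memory SQLite databases standing in for two separate MySQL databases. The table layouts, sample rows, and the function name `orders_with_user_names` are assumptions made for this example.

```python
import sqlite3

# Two in-memory SQLite databases stand in for two separate databases.
user_db = sqlite3.connect(":memory:")
order_db = sqlite3.connect(":memory:")

user_db.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
user_db.executemany("INSERT INTO users VALUES (?, ?)", [(1, "alice"), (2, "bob")])

order_db.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER, amount REAL)"
)
order_db.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(100, 1, 25.0), (101, 2, 40.0), (102, 1, 12.5)],
)


def orders_with_user_names(min_amount):
    """Query each database separately, then join the rows in the application."""
    orders = order_db.execute(
        "SELECT id, user_id, amount FROM orders WHERE amount >= ? ORDER BY id",
        (min_amount,),
    ).fetchall()
    user_ids = sorted({user_id for _, user_id, _ in orders})
    if not user_ids:
        return []
    # Second query: fetch only the users referenced by the matching orders.
    placeholders = ",".join("?" for _ in user_ids)
    names = dict(
        user_db.execute(
            f"SELECT id, name FROM users WHERE id IN ({placeholders})", user_ids
        ).fetchall()
    )
    return [(oid, names.get(uid), amount) for oid, uid, amount in orders]
```

What a single JOIN did before now takes two queries plus a dictionary lookup in the application, which is the extra cost the text refers to.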

3. Horizontal expansion

When tables are split by hash modulo, growing data may force us to add tables dynamically, and then data migration has to be considered.

Suppose the hash is originally taken modulo 8, so the data is spread evenly over 8 tables (or databases).

If 8 tables are no longer enough, we expand to 16 and take the hash modulo 16. New data is fine, but much of the old data can no longer be found where the new rule looks for it.

For example, a row whose hash is 9 was stored in table 1 (9 mod 8 = 1). Under modulo 16 we now look in table 9 (9 mod 16 = 9), where the row does not exist, so the lookup fails.
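The example above, and the share of rows affected by such an expansion, can be checked with a short calculation. The helper names are hypothetical, and hash values are modelled as plain integers spread evenly over a range.

```python
def table_for(hash_value, table_count):
    """Table index chosen by the hash-modulo rule."""
    return hash_value % table_count


def moved_fraction(hash_values, old_count, new_count):
    """Fraction of rows whose table changes when the table count changes."""
    moved = sum(1 for h in hash_values if h % old_count != h % new_count)
    return moved / len(hash_values)


old_table = table_for(9, 8)    # table 1 under modulo 8
new_table = table_for(9, 16)   # table 9 under modulo 16
fraction = moved_fraction(range(1600), 8, 16)  # 0.5 for evenly spread hashes
```

Going from 8 to 16 tables moves half of the rows, and those rows must be migrated before the new rule goes live. Doubling is the gentler case; for most other new table counts a larger share of the rows changes table.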

4. Merging and sorting results

Because the data is stored in different databases and tables, a list query has to pull rows from several of them, and the result sets must be merged and sorted.

If every query needs this merging and sorting, performance suffers considerably.
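If each database or table returns its own rows already sorted, the application only needs a k-way merge rather than a full re-sort. A minimal sketch using Python's standard `heapq.merge`; the function name `merged_page` and the sample rows are made up for the example.

```python
import heapq
from itertools import islice


def merged_page(shard_results, key, offset, limit):
    """Merge per-shard results that are each already sorted by `key`,
    then cut out one page of the combined ordering."""
    merged = heapq.merge(*shard_results, key=key)
    return list(islice(merged, offset, offset + limit))


# Hypothetical rows from two sub-tables, each sorted by creation date.
shard_a = [{"id": 1, "created": "2019-01-03"}, {"id": 4, "created": "2019-02-10"}]
shard_b = [{"id": 2, "created": "2019-01-20"}, {"id": 3, "created": "2019-03-01"}]

page = merged_page(
    [shard_a, shard_b], key=lambda r: r["created"], offset=1, limit=2
)
```

For the page to be correct, every sub-table has to return at least its first offset + limit rows, so deep pages get more expensive as the number of sub-tables grows.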

That concludes this look at the advantages and disadvantages of splitting databases and tables. Combining the theory with practice is the best way to learn it, so try it out.
