What is the design principle of data distribution in OLTP scenario 07/06 Update SLTechnology News&Howtos

What is the design principle of data distribution in OLTP scenario

2025-07-06 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Database >

Shulou(Shulou.com)05/31 Report--

What this article shares with you is about the principles of data distributed design in OLTP scenarios. The editor thinks it is very practical, so I share it with you. I hope you can get something after reading this article. Let's take a look at it with the editor.

Preface

In recent years, I have done a lot of work on distributed data architecture in OLTP (online Trading system) scenario. During the epidemic, I have just sorted out some designs and practices in this area. In order to avoid being too long, this paper is divided into design part and technical part. The design part mainly focuses on the theory and method of data separation, as well as some principles and experience. The technical part will mainly introduce the design and use of sub-database and table middleware, as well as how to build a complete distributed data service platform.

Generally speaking, in a distributed architecture, the application layer is easy to be distributed, because it is often stateless (or stateless by transferring data to DB, cache, MQ, etc.). You only need to add a load balancer at the traffic entrance, that is, in front of the application (for example, Nginx, HAProxy, F5), which is also available in large monomer architectures. So generally speaking, when we talk about distributed architecture, an important part is to do the distribution of data.

Traditional monolithic centralized architecture

The distribution of data is not as simple as the application, because the data of each node may be different, and it is necessary to route, solve the consistency of multiple copies, and even write conflicts. Although the implementation scheme is complex, the distribution of data is essentially two simple ideas: replication and fragmentation. Replication technology is also very common in traditional relational databases, which is mainly used for active / standby, double work, such as MySQL Replication, Oracle DataGuard and so on. Shards also have corresponding products in the database. For example, MySQL Fabric, Oracle Sharding, but compared with replication, the sharding scheme corresponding to these database manufacturers has not been widely accepted by the public.

There are often built-in sharding mechanisms in NewSQL databases, and they are all based on paxos and raft algorithms to ensure replication consistency. For the comparison and selection of sub-database and sub-table and NewSQL schemes, please see my previous article "sub-database and sub-table vs NewSQL database".

In the OLTP scenario, the idea of replication and fragmentation is applied to the traditional relational database, there are two more well-known names, sub-database sub-table and read-write separation.

Sub-database and sub-table, which is to split the original single database table, is a main way to realize the transformation of distributed architecture based on traditional relational database, so the first question is:

Why split it? When does it need to be split?

Capacity, performance, scale-out, microservices

There is an upper limit bottleneck in the storage, CPU, memory and other resources of the stand-alone database. When the amount of data and access reaches a certain order of magnitude, the performance will decline sharply, that is to say, the vertical expansion through scale up is an upper limit, and the cost is high.

If you want to achieve scale out scale-out, you need to split the data from the original table into multiple physical library tables (split horizontally).

In addition, if it is a micro-service architecture, the split service belongs to different systems, corresponding to different databases, in fact, it has been vertically split.

What are the ways of splitting?

1. Vertical split

Vertical splitting is generally closer to the business split, which is most commonly used when doing micro-services. It will be split according to DDD (domain driven design) technology or business capabilities. Generally, the bounded context is determined, and the split rules are relatively clear.

This method is less intrusive to applications, and often only needs to configure their own independent databases (which may be physical machines, or just different real columns). At most, they can do a data access layer selected by multiple data sources.

Another vertical split scenario is that due to hot and cold data, the access frequency of different columns of the same row of data varies greatly, or some large fields such as Text and Blob affect the efficiency of reading and writing, and these columns will also be split into different tables. This approach is generally uncommon and is often considered when making performance optimizations.

Vertical split

Advantages of vertical split:

After the split, the business is clear and the split rules are clear. Often according to the system or transaction.

It is easy to integrate or expand between systems

Simple data maintenance and low architecture complexity

Disadvantages of vertical split:

Some business tables cannot be join, so they can only be solved through interfaces at the application layer.

Due to the different limitations of each business, there is a performance bottleneck of a single database.

Distributed transaction scenarios are often generated.

Because vertical splitting divides tables into different databases according to business classification, some business tables are too large and there are bottlenecks in reading, writing and storage of a single database, so horizontal splitting is needed to solve the problem.

2. Horizontal split

Horizontal split is more technical, the data of one table is distributed to multiple databases and tables, the specific ways can be divided into: only sub-database, only sub-table, sub-database and sub-table. For example, order table, only divided into libraries (ds1.order, ds2.order... Dsk.order), only sub-table (ds.order_0, ds.order_1 … Ds.order_n), sub-library and sub-table (ds1.order_0, ds2.order_1 … Dsk.order_n).

Horizontal split

Advantages of horizontal split:

If the operation data is distributed in the same database, you can support complex SQL such as join, subquery and so on.

Solves the performance bottleneck of a single library and supports scale-out

Because the application is not split, if there is a distributed data access layer, the application transformation is less.

Disadvantages of horizontal split:

The split rules and the number of sub-databases and tables need to be carefully designed.

If multiple libraries are involved, there will be a distributed transaction scenario

The workload of data migration is heavy when data capacity is expanded.

Cross-library join often needs application implementation, and its performance is poor.

Data merging, aggregation, paging and so on cannot be directly supported by the database.

There is a partition table in the database. Do you want to divide the database into tables?

The partitioned tables of traditional relational databases essentially share cpu and memory, so they are still faced with the problem of scale up, and the partitioning keys supported by partitioned tables are often not flexible. However, some new NewSQL distributed databases, such as OceanBase's partition tables, are scattered on different storage nodes, thus avoiding the bottleneck of stand-alone performance.

Split specific steps

1. Determine the split mode

Choose the appropriate split method according to the business characteristics, which is generally used in combination.

1) Vertical split

Scenario: field length, different access frequency, field table, micro-service

Note: tables that need to operate in the same transaction should not be split as much as possible

2) horizontal split

Scenario: large amount of data, exceeding the performance of single table and single database

Note: whether there are cross-database transactions and whether there are scenarios in which tables are manipulated by non-sharding keys, which will involve database table scanning transactions

2. Determine the split field

1) split tables and fields vertically

Split according to the functional module can be directly by the table, if it is to split some columns, you need to add associated columns or even redundant columns.

2) split fields horizontally

Make sure that all split tables have sharding keys, mostly primary keys or unique indexes, and these columns need to contain shard information. If the sharding information is not included in the request, a global routing table is required.

3. Determine the split rules

1) range Range

Suitable for orderly increasing business fields according to certain rules, such as date, pipelined ID, etc., such as 0-9999-> Library 110000 1999-> Library 2. ; 20150101-20161231-> Library 1: 20170101-20171231-> Library 2.

This method naturally supports horizontal expansion, which is convenient for hot and cold separation and archiving, and for capacity expansion as needed, but the load is easy to be unbalanced. If the pressure on a single database is high, data migration is also needed.

2) Hash Hash

The data distribution is relatively balanced, and routes are generally calculated through the number of mod libraries / tables, which is essentially a kind of pre-allocation, so data migration is needed during capacity expansion, usually with consistent hashing and multiple expansion methods.

3) apply customization

By applying custom routing rules, the database table serial number corresponding to the shard ID is configured, which can be used through routing tables, configuration files or other custom algorithms. This method has the highest flexibility and is easy to change dynamically.

In our project, methods 1, 2 and 3 are all used.

4. Determine the number of splits

1) assume that the target amount of data is T (estimated according to business development needs)

2) the amount of data in a single table is recommended P (for example, MySQL is 500w), and the number of sub-tables = Tmax P

3) under the current configuration of typical business scenarios, the corresponding upper limit of data capacity L under the premise of stable performance of a single database.

The performance of a single library can be evaluated by determining its bottleneck state based on system indicators such as cpu, disk IO (disk utilization 100% iowait appears and gradually increasing), transaction tps stability (significant tps fluctuations), and so on.

4) the number of sub-libraries = Tbank L

The number of database tables is related to future expansion and operation and maintenance requirements, and should not be too much or too little. The above is mainly calculated from the point of view of capacity. In the actual scenario, it is also necessary to consider factors such as hardware cost budget and data cleaning and archiving strategy.

How to expand capacity after split?

1. Vertical expansion

After vertical split, if the database of an application is under too much pressure, it can be expanded vertically by increasing its resource configuration (CPU, memory, PCIE).

2. Horizontal expansion

Under horizontal split, the capacity can be expanded by adding database servers. This method requires data migration. If the hash is consistent, the data of the nearest node is migrated. If the capacity is multiplied, half of the data of all nodes need to be migrated.

Although the amount of data migrated by the consistent hash mode is small, it is easy to cause uneven cold and hot data, so the doubling expansion method adopted in our project is to divide the tables into 128 tables in advance, for example, these tables are evenly distributed among 4 database servers at the beginning of the project, and the capacity is expanded to 8 databases as the business increases. You only need to move half the number of tables from each of the four databases to the new four servers, and then modify the SQL route.

Double capacity expansion: in response to the overall data volume growth, the physical machine after the expansion is twice as large as the original.

If there is hot data pressure on a single database, only part of the data in the database can be migrated out of the newly expanded database.

Capacity expansion of a single database: to cope with the rapid growth of a slice data to an independent physical machine

Problems after split

The problem of introducing distributed transactions

The problem of cross-library Join

Multi-database merge sorting paging problem

SQL routing, rewriting issu

Multi-data source management problem

Operational problems such as data summary and query caused by multi-dimensional split

Solution:

Avoid distributed transactions, cross-node join, and sorting scenarios as much as possible

Avoid using database distributed transactions and provide flexible transaction support (idempotents, corrections, reliability messages, TCC)

Solve join problem by application layer

Provide distributed data access layer

Summary database, secondary index database, small table broadcast

The distributed data access layer is introduced in detail in the technical section.

Separation of reading and writing

In the actual business scenario, the frequency of reading and writing to the database is different. Some are writing more and reading less, such as transaction pipelining tables; some are balanced reading and writing, such as order tables; others are reading more and writing less, such as customer, information, configuration and other information tables.

Data sharding solves the bottleneck of single-point performance and the ability of scale-out, which is suitable for scenarios with high write pressure. In such scenarios with more reading and less writing, if the capacity of a single database can be satisfied, the problem of high reading pressure can be solved through the separation of reading and writing. Specifically, the write operation can be routed to the master library, and the read operation can be scattered in the master library and each slave library according to the weight, computer room and so on.

Separation of reading and writing

There are a few points to pay attention to in read-write separation mode:

1) Master-slave delay. There is a certain delay in reading data from the slave database than that of the master database (usually at the millisecond level, and may be at the second level when writing pressure is high). Therefore, when choosing this method, the business should allow a certain data delay. For example, this method is generally used in external query transactions.

2) in the same transaction, the data cannot be read from the library, because it may be due to the delay of reading dirty data, which violates the consistency of the transaction, so it must be read in the main library. In the actual development, the data access layer can automatically judge whether it must be read in the main database according to whether to turn off transaction auto-commit.

3) for query transactions with low data delay tolerance, you can separately encapsulate an interface for querying from the master database during development, or add a "strong consistency" flag to the input parameters. When the transaction is implemented, you can choose whether to read from the master database or from the slave database according to this flag.

In actual projects, both sub-database sub-table and read-write separation are used in scenarios, but in general, avoid using complex solutions such as sub-library sub-table + read-write separation, because the pressure of reading and writing after sub-database sub-table will not be too great.

Principle and experience

Data distribution is a systematic project, which needs to be comprehensively considered from many aspects, such as domain modeling, scene division, data access, data migration and expansion, etc., and the overall design should be done before implementation. Here are some of our design principles and experiences:

1) solve the problem with a simple solution. Do not split as much as possible, do not split for the sake of distribution. Read-write separation can solve the problem, so it doesn't divide the database into tables.

2) sharding must select the appropriate sharding rules (to ensure that 90% of the transactions will not cross sharding), sort out all scenarios, and plan in advance before implementation.

3) the data access layer should be powerful in design, but it must be clear about the use scenario and avoid mindless abuse. For example, although the data access middleware in our project supports distributed transaction XA, it is generally not recommended; DDL is supported, but online transaction is prohibited; multi-database chained transaction commit is supported, but only strict single-database transaction is supported by default.

4) formulate application development norms, clarify the restrictions and requirements for the use of SQL, and SQL should be as simple as possible. For example, our project uses MySQL and is deployed on PC Server, and its stand-alone performance is much worse than that of DB2 and Oracle on minicomputers, so it is forbidden to use triggers, foreign keys, join,SQL operations must carry indexes and splits (data access layer will also check), primary keys must be self-increasing, and so on.

5) try to use flexible transactions to solve cross-database and cross-system transaction problems. If you can use MQ for ultimate consistency, don't use Saga or TCC.

These are the principles of distributed data design in OLTP scenarios. The editor believes that there are some knowledge points that we may see or use in our daily work. I hope you can learn more from this article. For more details, please follow the industry information channel.

Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.

*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.