Depth | the storage cost of real-time history database is amazing, how to break it? 04/26 Update SLTechnology News&Howtos

Depth | the storage cost of real-time history database is amazing, how to break it?

2025-04-26 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Database >

Shulou(Shulou.com)06/01 Report--

Author: Hu Dao Ali Yun operation and maintenance expert

Zhouji Aliyun solution architect

Background in today's digital age, with the rapid development of business, the amount of data generated every day will be an astonishing amount, and the cost of database storage will become higher and higher. The usual practice is to archive historical data and migrate long-term unused data to cheap storage devices stored in file form, such as Aliyun OSS or Aliyun database DBS service. However, in the application scenarios of some core businesses, there are still real-time, low-frequency queries and even updates for the "old" data a few months or even a few years ago, such as the historical order query of Taobao / Tmall. Enterprise-level office software nails chat information query of a few years ago, historical logistics order details of Cainiao massive logistics, and so on.

If the query is restored from the history backup at this time, the query time will be in days and the acceptability will be 0

If the historical data of these low-frequency but real-time query requirements are stored in the same set of distributed database cluster as the recent active storage costs will be huge, resulting in costs far greater than benefits. For example, the amount of data of nail chat messages is close to 50PB after highly compressed. It is difficult to imagine how much capital expenditure and performance challenges these data will bring without compression. With the increasing amount of data, even if distributed storage is done for the data, the performance of a single instance will decline sharply when the capacity of a single instance exceeds about 5T, which will affect the query performance of the recent active data, and it is very difficult to drag down the operation and maintenance of the whole cluster. for example, it is difficult to send a table data structure change operation for massive data, it is difficult to imagine how long it will take to complete 1 real-time history database scenario requirement analysis

Through the above analysis, whether cold backup or online historical data mixed storage on the same physical table is not desirable, the general real-time query historical database scenarios, generally need to have the following key features

The cost is controllable, and the storage cost of historical data cannot accept the same linear growth as the online database.

Real-time query, historical data query RT should be almost the same as the online active database.

The query frequency is lower, generally speaking, the older the data is, the lower the query frequency is.

Unified query entry, whether it is active data or historical data, the query entry is consistent

The cost of transformation needs to be as low as possible, and it is best not to make any changes to the application code. It can be considered that the history library is completely transparent to program developers.

There may be a need for historical data updates

The data scale is relatively large, generally above 100TB.

Introduction to 2X-Engine engine

About 01X-Engine X-Engine is an online transaction processing OLTP (On-Line Transaction Processing) database storage engine developed by Ali Cloud Database Product Division. As one of the storage engines of the self-developed database PolarDB, it has been widely used in many business systems within Ali Group, including transaction history database, nail history database and other core applications, greatly reducing business costs, and as a key database technology promoted by Singles' Day, it has survived the impact of hundreds of times the usual traffic.

Unlike traditional InnoDB engines, X-Engine uses a tiered storage architecture (LSM-Tree). Tiered storage has two significant advantages:

The hotspot dataset that needs to be indexed is smaller and has higher write performance.

The underlying persistent data pages are read-only, the data pages are in a compact storage format, and are compressed by default, resulting in lower storage costs.

Compared with the InnoDB engine, according to the data characteristics, the storage space using X-Engine can be reduced to 10% / 50%. We tested the storage space efficiency of X-Engine on the two datasets of the famous Link-Bench and Alibaba insider trading business. In the test, compared with the open compression InnoDB engine, the X-Engine has 2 times the space advantage, while compared with the non-open compression InnoDB,X-Engine has a 3-5 times advantage. 02

Real-time history library scheme, why choose X-Engine

1. Usually we default that MySQL is the most popular open source database today, and it is likely to be the first choice for online core database clusters. Compared with other highly compressed storage engines, the introduction of X-Engine does not require any SQL code modification, and supports transactions, the access cost is the lowest, and the learning cost is almost zero.

two。 Write performance is stronger, and X-Engine has a performance improvement of more than 10 times compared to Rocksdb, which is the same LSM-tree architecture.

3. Data reuse technology is introduced into the storage layer to optimize the performance of Compaction, reduce the impact of Compaction actions on system resources in traditional LSM-tree architecture, and keep the system performance stable.

4. Multi-level Cache is introduced, combined with Cach backfill and prefetch mechanism, fine access mechanism and cache technology are used to make up for the reading performance deficiency of the traditional LSM-tree engine. The point query ability of X-Engine is almost the same as that of Innodb.

The following figure shows the comparison between X-Engine and mainstream historical data storage schemes 3

Design and implementation of Real-time History Database Architecture

Overall architectural thinking

Based on the introduction of real-time history database and X-Engine above, Aliyun database team introduced X-Engine engine as the core of historical data storage, ecological tool DTS as online / historical data transfer channel, and DMS as a complete "real-time online-history database" scheme for risk-free deletion of historical data. The specific implementation may be different according to different business scenarios and customer needs. We provide the concrete implementation of a variety of real-time history library schemes. The principal architecture diagram is as follows, and the core ideas are as follows:

As the core engine of OLTP online library, the time-tested Innodb engine mainly handles high-frequency query / update requests and meets the business requirements of high concurrency, high performance and strong range query for online active data.

As the core engine of history library, X-Engine, developed by Alibaba database team, mainly responds to historical data storage / query / update requests, and meets the business requirements of different cold and hot data frequency, low storage cost and high performance (scope query may have limited performance).

Unified DB access layer forwards requests to different storage engines according to the set business time attributes. For cross-engine access in special scenarios, aggregate presentation is made in the application layer

Online-Historical data migrates and deletes expired data through ecosystem tools provided by Aliyun to ensure that the link is more stable and reliable

On-line library / history library split scheme

Generally speaking, the amount of data in scenarios that need to use a real-time history database is too large to be stored by a single host. The online database may be a number of RDS split horizontally or vertically according to the business, or it may be a large DRDS cluster. In order to ensure the performance of the online database as much as possible, it is recommended to decouple and decouple the online database / history database cluster storage of all data through the DTS link to open the online library and history database. Real-time synchronous DTS link filtering Delete operation can be used directly to use the new version of DMS configuration history data to delete the source to DRDS cluster an on a regular basis. The data synchronization link uses RDS multiple DTS links to get through the underlying RDS nodes, and has strong synchronization performance.

A large number of RDS can support batch creation and configuration of API

Better link stability

It is necessary to ensure that the number of source and destination database tables and data routing rules are consistent.

b. Data synchronization link to DRDS only needs to configure one DTS link to save money.

Poor data synchronization performance

Source-side DRDS expansion will affect DTS synchronization link

The number of instances and data routing rules on the source side and destination side can be configured freely.

The source end is multiple RDSa. The target end is multiple RDS business codes without any modification.

There is a risk in the disk capacity of the history library node in the later stage of operation.

b. The target end is the DRDS cluster

It may involve the lightweight modification of the business code, and the performance of the subtable key can not be guaranteed if the target end does not leave the library.

Multiple online library businesses can be merged into a set of history database cluster, and the architecture is more concise.

DTS links are also divided into RDS and DRDS.

Data synchronization link goes to RDS

The data synchronization link uses DRDS03 and instance mixed storage engine solution. The online database / history database split scheme is relatively complex. RDS supports the same instance mixed storage engine. For scenarios where the total amount of data is not very large, you can consider the mixed use of Innodb&X-Engine engine under the same instance using DMS-- > data factory-> data orchestration feature to easily achieve data flow and expired data deletion within the same instance. The architecture diagram is shown below.

The implementation is simple and flexible

Mixed storage engine, it is difficult to take into account the performance of both in database kernel parameter optimization.

Historical data compact phase may cause performance jitter to the entire instance

The database or table under the same instance cannot be renamed, which involves the transformation of lightweight business.

04DTS enables online / Historical data transfer DTS not only supports full-volume-incremental synchronization and data synchronization between different database products, but also supports data synchronization between different database products. In online / history library solutions, DTS's powerful "conditional filtering" function is a very important link. It is very convenient to filter Delete operations by configuring DTS tasks, and custom data synchronization strategies can be achieved with two mouse clicks.

DMS enables on-line database expired data deletion

The deletion of expired data in the online database should not only ensure the efficiency of deletion, but also ensure that the online database will not cause performance jitter during the deletion process. The new version of DMS supports the data change task of creating "historical data cleaning". Through this task, you can easily complete the following tasks

Delete historical data periodically, specify scheduling time and scheduling duration at one time

Split large transactions to reduce excessive table locking time during transaction execution and avoid master / standby delay

Cleanup encountered an abnormal interruption and can be retried

Support to view the running status of the task and the cause analysis of failure

Simplicity in configuration

Thoughts on cleaning up expired data

If you do not use the DMS ecological tool, you can also delete expired data by yourself, but the implementation is more complex. The general design idea is to split the primary key of the table according to size to ensure that the "appropriate amount" of data is deleted at once, which not only ensures the efficiency of deletion but also does not affect the online service.

Policy for deleting historical data in the online database (assuming that the primary key is id, the data is saved for 180 days, and the time attribute is listed as date_col)

Initialize the numeric Y=select min (id) from $table_name

After the business trough, DELETE FROM $table_name WHERE date_col

< SUBDATE(CURDATE(),INTERVAL 180 DAY) and id >

= Y and id

Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.

*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.