What is the most suitable architecture design for a cloud database?

Updated 2025-07-06. From: SLTechnology News&Howtos

Shulou (Shulou.com), 06/01 report

Distributed database technology has been under development for many years, but the demands of applications and services keep pushing distributed database architecture to evolve.

SequoiaDB, an open source financial-grade distributed database, has been in research and development for six years, and its team has insisted on building the core database engine from scratch. Along the way, we chose the architecture and engine design that best suit the cloud database scenario. This article describes SequoiaDB's current architecture and design concepts in detail.

SequoiaDB also recently completed a round C financing. The round was led by Castrol Investment, alongside early investors QiMing Venture Partners and DCM. SequoiaDB (Giant Sequoia Database) has always been technology-driven and focused on building a financial-grade distributed database, and it became the first Chinese database vendor to be included in a Gartner database report. At present, it has more than 1,000 paying enterprise customers and community users in total, and it runs in the core production business of more than 50 large financial institutions, including top-500 banks, insurers and securities firms.

Multi-model database engine

In the era of cloud computing and distributed systems, traditional relational databases, which once served only structured data, have kept evolving. Since IBM DB2 added XML support in 2007, more and more relational databases have begun to support semi-structured data such as XML and JSON. Gartner therefore believes that databases are heading toward a multi-model era, in which a mature database product needs to use distributed technology to support a variety of access methods beyond relational ones.

SequoiaDB is a typical multi-model database. It covers structured, semi-structured and unstructured data, and serves transaction processing, image storage and statistical analysis workloads.

Through its compute-storage separation architecture, SequoiaDB reuses the MySQL, SparkSQL and PostgreSQL (PGSQL) parsers and executors for NewSQL structured data. It supports HTAP (hybrid transactional/analytical processing) workloads that mix online transactions with offline analysis, while remaining 100% compatible with industry standards. SequoiaDB also supports semi-structured JSON data through its API, and stores and serves unstructured data through a POSIX-compatible file system interface and an S3 interface.

SequoiaDB's storage layer uses a dual-engine architecture that stores large file objects and data records each in the structure best suited to them. On top of it, unified transaction management, cluster control, synchronous replication, session management and other mechanisms support the logical and physical separation of data and sessions, in order to meet the needs of distributed management and mixed business workloads in the cloud era.

SequoiaDB released version 3.0 at the end of 2017. Its evolution path shows that each major version has substantially extended and enhanced the previous one. Version 1.0, officially released in 2013, was a simple JSON database supporting semi-structured data. Version 2.0, released in 2015, added full support for object storage. Version 3.0, released at the end of 2017, integrates with MySQL, PGSQL and SparkSQL with 100% compatibility and fully supports NewSQL distributed transaction processing.

Figure: The development of SequoiaDB products

Computing-storage separation architecture

Two distributed architectures are common in the industry today: sharding (splitting databases and tables) and compute-storage separation. The sharding architecture is typified by sharding logic on the application side or by middleware products such as MyCat. Whereas the sharding architecture is a simple upper-layer wrapper around traditional databases, a true compute-storage separation architecture means that both SQL parsing and the underlying data storage can scale flexibly.

The most mainstream cloud database implementations in the industry today (such as AWS's Aurora and Aliyun's PolarDB) build the MySQL server directly on top of underlying distributed high-performance storage. Through a customized standard SQL engine and a data communication interface to the storage layer, the distributed storage is loosely coupled with the upper-layer SQL parser and executor, and both can be scaled dynamically and independently.

Figure: Schematic of the computing (SQL)-storage separation architecture

The design idea of a compute-storage separation system is to deploy the computing and storage layers separately in a loosely coupled way, to replace modules and components seamlessly through standard interfaces or plug-ins, and to scale both the computing layer and the storage layer elastically. The architecture of MySQL and MariaDB is representative of loosely coupled computing and storage in relational databases. In MySQL 5.7 and earlier versions, the SQL parsing engine communicates with the back-end storage kernel through hundreds of C++ functions. A DBA can therefore choose InnoDB, MyISAM, NDB or Memory, or even implement a custom storage engine that plugs into the front-end SQL parser and executor.
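The plug-in idea can be illustrated with a minimal Python sketch. It is not MySQL's actual handler API: the `StorageEngine` interface, its three methods, and the `InMemoryEngine`, `PartitionedEngine` and `SqlLayer` classes are all hypothetical names chosen for illustration. The point is only that the upper layer talks to a fixed interface, so a single-node engine and a partitioned one can be swapped without changing the query code.

```python
import zlib
from abc import ABC, abstractmethod
from typing import Callable, Dict, Iterator, List, Optional, Tuple

Row = Dict[str, object]


class StorageEngine(ABC):
    """Hypothetical plug-in contract between the SQL layer and a storage engine."""

    @abstractmethod
    def write_row(self, key: str, row: Row) -> None:
        """Store or overwrite the row identified by key."""

    @abstractmethod
    def read_row(self, key: str) -> Optional[Row]:
        """Return the row for key, or None if it does not exist."""

    @abstractmethod
    def scan(self) -> Iterator[Tuple[str, Row]]:
        """Yield all (key, row) pairs in key order."""


class InMemoryEngine(StorageEngine):
    """Single-node engine that keeps every row in one local dict."""

    def __init__(self) -> None:
        self._rows: Dict[str, Row] = {}

    def write_row(self, key: str, row: Row) -> None:
        self._rows[key] = dict(row)

    def read_row(self, key: str) -> Optional[Row]:
        return self._rows.get(key)

    def scan(self) -> Iterator[Tuple[str, Row]]:
        return iter(sorted(self._rows.items(), key=lambda kv: kv[0]))


class PartitionedEngine(StorageEngine):
    """Toy 'distributed' engine: rows are spread over several partitions."""

    def __init__(self, partitions: int = 3) -> None:
        self._parts: List[InMemoryEngine] = [InMemoryEngine() for _ in range(partitions)]

    def _part(self, key: str) -> InMemoryEngine:
        # crc32 is stable across runs, unlike the built-in hash() for strings.
        return self._parts[zlib.crc32(key.encode("utf-8")) % len(self._parts)]

    def write_row(self, key: str, row: Row) -> None:
        self._part(key).write_row(key, row)

    def read_row(self, key: str) -> Optional[Row]:
        return self._part(key).read_row(key)

    def scan(self) -> Iterator[Tuple[str, Row]]:
        merged = [item for part in self._parts for item in part.scan()]
        return iter(sorted(merged, key=lambda kv: kv[0]))


class SqlLayer:
    """Stand-in for the parser/executor: it only knows the StorageEngine interface."""

    def __init__(self, engine: StorageEngine) -> None:
        self._engine = engine

    def insert(self, key: str, row: Row) -> None:
        self._engine.write_row(key, row)

    def select_where(self, predicate: Callable[[Row], bool]) -> List[Row]:
        return [row for _, row in self._engine.scan() if predicate(row)]


def run_demo(engine: StorageEngine) -> List[Row]:
    """Run the same 'query' against whichever engine is plugged in."""
    sql = SqlLayer(engine)
    sql.insert("1", {"id": 1, "balance": 50})
    sql.insert("2", {"id": 2, "balance": 150})
    sql.insert("3", {"id": 3, "balance": 300})
    return sql.select_where(lambda row: row["balance"] >= 100)
```

Calling `run_demo(InMemoryEngine())` and `run_demo(PartitionedEngine())` returns the same rows, which is the property that lets the upper layer stay unchanged when the engine underneath is replaced.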

Figure: Detailed view of the "computing-storage separation" architecture of a distributed database

One advantage of the compute-storage separation architecture is that users are free to choose transaction-oriented SQL parsers (such as MySQL or PGSQL) or analysis-oriented execution engines (such as SparkSQL) according to their business characteristics. Different approaches to SQL optimization and execution can make database access performance differ by a factor of thousands. The core idea of compute-storage separation is to unify storage at the data layer while letting the computing layer exploit the strengths of each execution engine, selecting and optimizing for each business scenario.

Figure: SequoiaDB architecture schematic

In addition, because the data storage layer is completely separated from the computing layer, users can isolate workloads logically and physically in the storage layer, placing high-frequency front-end transactional business and high-throughput statistical analysis on different hardware. Different types of data access then do not interfere with each other, which makes multi-tenancy and HTAP genuinely usable in a production environment.

Thanks to the separated architecture of SequoiaDB 3.0, the database can plug in different execution engines freely and expose the same data through different interfaces. SequoiaDB can also be configured so that two of the three replicas serve online business access while the third is dedicated to SparkSQL for statistical analysis. Both workloads access the same data, yet online applications and statistical analysis are completely isolated at the physical hardware level.
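The replica split described above can be sketched as a small routing table. This is only an illustration of the idea, not SequoiaDB's configuration mechanism: the `Replica` and `ReplicaRouter` classes, the role names `online` and `analytics`, and the host names are all assumptions made for the example.

```python
from dataclasses import dataclass
from itertools import cycle
from typing import Dict, Iterable, Iterator, List


@dataclass(frozen=True)
class Replica:
    host: str
    role: str  # hypothetical role label: "online" or "analytics"


class ReplicaRouter:
    """Send each workload only to the replicas reserved for it."""

    def __init__(self, replicas: Iterable[Replica]) -> None:
        pools: Dict[str, List[Replica]] = {}
        for replica in replicas:
            pools.setdefault(replica.role, []).append(replica)
        # One round-robin cursor per role, so load spreads inside a pool only.
        self._cursors: Dict[str, Iterator[Replica]] = {
            role: cycle(pool) for role, pool in pools.items()
        }

    def pick(self, workload: str) -> Replica:
        if workload not in self._cursors:
            raise ValueError(f"no replica reserved for workload {workload!r}")
        return next(self._cursors[workload])


# Three copies of the same data: two for online traffic, one for analysis.
router = ReplicaRouter([
    Replica("replica-a", "online"),
    Replica("replica-b", "online"),
    Replica("replica-c", "analytics"),
])
```

With this table, repeated calls to `router.pick("online")` alternate between `replica-a` and `replica-b`, while `router.pick("analytics")` always lands on `replica-c`, so the two workloads never share hardware.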

Figure: Flexible service isolation and partitioning under the computing-storage separation architecture

For online transactional business, distributed transactions, locks, indexing and other mechanisms are all implemented directly in the underlying distributed engine, so the upper layer gets full ACID guarantees whichever SQL parser it uses.

Elastic expansion

In the era of cloud computing, applications and middleware already scale out and in dynamically through microservice architectures. For example, before the Singles' Day peak an enterprise can rent AWS or Aliyun servers on a large scale, expanding its applications' computing and processing power dozens of times over.

Unlike the application tier, however, the data tier is often the biggest constraint on application scalability. An application can scale from 3 Tomcat servers to 30 within a day without downtime, but with a database-and-table sharding scheme underneath, it is almost impossible to add or remove database service nodes easily.
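A small experiment shows why resizing a sharded database is painful. The sketch below assumes the simplest possible scheme, `hash(key) % shard_count`, which real sharding middleware may refine. The function names and the `order-N` keys are made up for the example.

```python
import hashlib
from typing import Sequence


def shard_for(key: str, shard_count: int) -> int:
    """Map a key to a shard with plain modulo hashing (assumed scheme)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


def moved_fraction(keys: Sequence[str], old_count: int, new_count: int) -> float:
    """Fraction of keys whose shard changes when the shard count changes."""
    moved = sum(
        1 for key in keys
        if shard_for(key, old_count) != shard_for(key, new_count)
    )
    return moved / len(keys)


if __name__ == "__main__":
    sample_keys = [f"order-{i}" for i in range(10_000)]
    # Going from 3 to 4 shards relocates most keys, not just a quarter of them.
    print(f"moved: {moved_fraction(sample_keys, 3, 4):.0%}")
```

With modulo placement, a key stays put only when its hash gives the same remainder under both shard counts, so growing from 3 to 4 shards relocates roughly three quarters of the data. That volume of migration is what makes adding a node to a sharded deployment an offline-style operation in practice.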

Figure: Native distributed architecture of the SequoiaDB storage engine

Through mechanisms such as consistent hashing, SequoiaDB makes scaling the underlying database out and in fully online and transparent to the application. For journal-type businesses that need to store a large amount of data, SequoiaDB can even provide a "zero data migration" strategy, ensuring that adding nodes does not trigger any I/O-heavy background rebalancing.

SequoiaDB can scale the storage capacity and computing power of the whole cluster horizontally and elastically by adding data partitions and data nodes.
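Consistent hashing, mentioned above as one of the mechanisms involved, can be sketched with a generic hash ring. This is a textbook version, not SequoiaDB's implementation: the `HashRing` class, the choice of 100 virtual nodes per node and the node names are assumptions for illustration.

```python
import bisect
import hashlib
from typing import Dict, Iterable, List


def _hash(value: str) -> int:
    """Stable 64-bit position on the ring."""
    digest = hashlib.sha256(value.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class HashRing:
    """Generic consistent-hash ring with virtual nodes."""

    def __init__(self, nodes: Iterable[str], vnodes: int = 100) -> None:
        self._vnodes = vnodes
        self._points: List[int] = []       # sorted ring positions
        self._owners: Dict[int, str] = {}  # ring position -> node name
        for node in nodes:
            self.add_node(node)

    def add_node(self, node: str) -> None:
        # Each node owns several positions so data spreads evenly.
        for i in range(self._vnodes):
            point = _hash(f"{node}#{i}")
            bisect.insort(self._points, point)
            self._owners[point] = node

    def node_for(self, key: str) -> str:
        # The first position clockwise from the key's hash owns the key.
        idx = bisect.bisect_right(self._points, _hash(key)) % len(self._points)
        return self._owners[self._points[idx]]


def keys_moved_by_adding(ring: HashRing, keys: List[str], new_node: str) -> List[str]:
    """Add new_node and return the keys whose owner changed."""
    before = {key: ring.node_for(key) for key in keys}
    ring.add_node(new_node)
    return [key for key in keys if ring.node_for(key) != before[key]]
```

When a fourth node joins a three-node ring, the only keys that change owner are those taken over by the new node, which is about a quarter of the data on average. Existing nodes never exchange data among themselves, which is what allows expansion to run online.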

Fully compatible with MySQL

SequoiaDB provides full MySQL compatibility at the application level through the "compute-storage separation" architecture. It uses the MySQL Server downloaded from the official MySQL website as is and, through MySQL's storage engine plug-in mechanism, provides a SequoiaDB distributed storage engine plug-in that sits alongside InnoDB.

SequoiaDB makes full use of the MySQL database services that users have relied on for many years. Application developers and DBAs do not need to learn any new concepts or syntax to migrate their applications seamlessly from a traditional single-node architecture to a distributed database. When switching from the InnoDB storage engine to the SequoiaDB distributed engine, all data partitioning mechanisms are completely transparent to upper-layer applications. SequoiaDB also provides a range of migration tools, including offline, online and real-time ones, for users to choose from in different scenarios.

Today, MySQL is used by a large number of Internet and enterprise users. Compared with database-and-table sharding approaches that require rebuilding the SQL parser and executor, SequoiaDB's compute-storage separation architecture maximizes the reuse of developers' and DBAs' existing skills. It also keeps SequoiaDB in close interaction with the MySQL community and lets it contribute its distributed storage capability to the MySQL ecosystem.

Figure: SequoiaDB is fully compatible with MySQL

Summary

Built on a multi-model data storage engine, SequoiaDB implements its distributed engine and a SQL layer fully compatible with MySQL, PostgreSQL and SparkSQL through the compute-storage separation architecture that is mainstream in the industry. This overall design is believed to be the mainstream architecture for the future development of cloud databases.

By applying this architecture, SequoiaDB achieves elastic scaling, multi-tenancy, HTAP support, full MySQL compatibility and other capabilities. This also enables the open source SequoiaDB to take a closer part in community building, contributing to the development of China's basic database software and to the growth of the MySQL community. With this financing, Giant Sequoia Database will continue to invest in core R&D and technological innovation, expand from the financial industry into other vertical markets and more enterprise-level application scenarios, and accelerate its internationalization, with the aim of making Giant Sequoia Database a world-class distributed database product.
