What is UCloud TiDB Service?

Updated 2025-01-16. From: SLTechnology News&Howtos

Shulou(Shulou.com)06/03 Report--

TiDB is an open source distributed relational database designed and developed by PingCAP. It is a hybrid distributed database that supports both online transaction processing and online analytical processing (Hybrid Transactional and Analytical Processing, HTAP). Its key features include horizontal scale-out and scale-in, financial-grade high availability, real-time HTAP, a cloud-native distributed architecture, and compatibility with the MySQL 5.7 protocol and the MySQL ecosystem. The goal is to give users a one-stop solution for OLTP (Online Transactional Processing), OLAP (Online Analytical Processing), and HTAP. TiDB suits scenarios that require high availability, strong consistency, and large data volumes.

UCloud brought TiDB to its public cloud and launched UCloud TiDB Service in August this year; the current TiDB version is 3.0.5. Compared with bare metal deployment, UCloud TiDB Service has no loss of performance and provides high availability across availability zones. Monitoring and Binlog have been modified and enhanced, so that users get a TiDB service that can be created with one click, is paid for on demand, and scales flexibly.

UCloud TiDB Service

Why is it called UCloud TiDB Service? The word Service is emphasized because, from the perspective of public cloud users, TiDB runs on the public cloud platform and is presented as a service rather than as physical resources. UCloud TiDB Service is a serverless distributed database service that supports the native MySQL protocol and offers high performance, high availability across availability zones, and high scalability.

Compatible with native MySQL protocols

In most cases, you can migrate from MySQL to TiDB without modifying application code, and sharded MySQL clusters (split across multiple databases and tables) can also be migrated in real time with the TiDB tools.

High availability across availability zones

Although TiDB itself is highly available to a degree, most users lack the infrastructure to deploy it across availability zones. All components of UCloud TiDB Service are deployed across availability zones. By combining the multi-instance deployment capability of every TiDB module with UCloud's cross-availability-zone deployment capability, UCloud TiDB Service can withstand availability-zone-level failures.
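To make the availability-zone claim concrete, here is a minimal sketch of the underlying arithmetic. TiKV and PD replicate data with Raft, which needs a strict majority of replicas to stay available. The function name and the zone labels are hypothetical and only illustrate the check; they are not part of UCloud's platform.

```python
from collections import Counter


def survives_az_failure(replica_zones):
    """Return True if losing any single zone still leaves a Raft majority.

    replica_zones lists the zone of each replica, e.g. ["az-a", "az-b", "az-c"].
    """
    if not replica_zones:
        return False
    total = len(replica_zones)
    majority = total // 2 + 1
    # The worst case is losing the zone that holds the most replicas.
    worst_loss = max(Counter(replica_zones).values())
    return total - worst_loss >= majority


print(survives_az_failure(["az-a", "az-b", "az-c"]))  # True: one replica per zone
print(survives_az_failure(["az-a", "az-a", "az-b"]))  # False: two replicas share a zone
```

With three replicas, the check passes only when each replica sits in a different zone, which is why cross-zone placement of every component matters.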

Dynamic expansion

TiDB scales horizontally at both the compute layer and the storage layer. By simply adding new nodes, you can expand throughput or storage as needed and cope with high-concurrency and massive-data scenarios.

Serverless

The serverless product form makes TiDB easier and faster to use: users do not need to care about the underlying physical resources or the details of the distributed deployment.

Pay on demand with low access cost

There is no need to specify CPU, memory, disk, or other resources. Users pay only for the disk and storage they actually use, which avoids up-front hardware costs.

Performance comparison

We ran a test with the same physical configuration (Intel Xeon E5-2620 v4, DDR4_16GB_2400MHz x12, U.2_NVMe_3.2TB x2) and the same software deployment (TiDB x3, TiKV x3, PD x3), using sysbench with 512 threads, 32 tables, and 1000 rows. The performance comparison between TiDB deployed on bare metal and UCloud TiDB Service is shown in the following table:

The results show that all indicators are essentially the same. Compared with bare metal deployment, UCloud TiDB Service brings no performance loss, and some indicators are slightly better. What does the UCloud public cloud backend do to achieve this?

Build a distributed database PaaS platform

UCloud has built a PaaS platform for distributed databases (as shown above). The management functions on the left fall into three parts. The first is resource management of physical machines, which covers allocating resources each time an instance is created and reclaiming them after the instance is deleted. The second is cluster deployment. A creation process first selects a suitable physical machine and checks whether it has enough resources, then allocates the specific resources, and then performs the creation work. During this process the TiDB cluster is created together with its monitoring and LB layer. Because deployments on the public cloud run inside the user's VPC, the VPC network also has to be initialized. The third is cluster maintenance: for example, if a physical machine fails, all of its services must be migrated to other nodes. This mainly involves migration, scale-out, and scale-in.
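The select, check, allocate, and reclaim steps described above can be sketched as a small in-memory scheduler. This is only an illustration under stated assumptions: the Host and Request types, the "most free disk" tie-break, and all field names are hypothetical and do not describe UCloud's actual scheduler.

```python
from dataclasses import dataclass


@dataclass
class Host:
    name: str
    zone: str
    free_cpu: int
    free_mem_gb: int
    free_disk_gb: int


@dataclass
class Request:
    cpu: int
    mem_gb: int
    disk_gb: int


def fits(host, req):
    """Check whether the host has enough free resources for the request."""
    return (host.free_cpu >= req.cpu
            and host.free_mem_gb >= req.mem_gb
            and host.free_disk_gb >= req.disk_gb)


def allocate(hosts, req):
    """Reserve resources on a fitting host and return it, or None if none fits."""
    candidates = [h for h in hosts if fits(h, req)]
    if not candidates:
        return None
    # Assumed policy: prefer the host with the most free disk.
    chosen = max(candidates, key=lambda h: h.free_disk_gb)
    chosen.free_cpu -= req.cpu
    chosen.free_mem_gb -= req.mem_gb
    chosen.free_disk_gb -= req.disk_gb
    return chosen


def release(host, req):
    """Return resources to the host after the instance is deleted."""
    host.free_cpu += req.cpu
    host.free_mem_gb += req.mem_gb
    host.free_disk_gb += req.disk_gb
```

A real platform would also have to handle concurrent requests and spread the nodes of one cluster across zones; the sketch leaves both out.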

On the right are monitoring and alarms, which handle timely notification and alarm management for abnormal situations, and operation analysis, which covers the management of UCloud database operations. Backup management is responsible for database backup and recovery. Users can set detailed backup strategies, such as when to back up and how to back up.

The native protocol path carries the original MySQL data flow. We add a load balancer layer in front of it for two main purposes. The first is to expose a single IP address, so that users do not need to manage switching between addresses. The second is to apply the controls needed for public cloud service traffic, mainly account and system control.
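The idea of one address in front of several TiDB nodes can be sketched as a tiny round-robin selector that skips nodes marked unhealthy. The source does not say which balancing policy UCloud uses, so round-robin is an assumption here, and the class and backend names are hypothetical.

```python
class RoundRobinBalancer:
    """Hand out backends in rotation, skipping those marked down."""

    def __init__(self, backends):
        self.backends = list(backends)
        self.healthy = set(self.backends)
        self._next = 0

    def mark_down(self, backend):
        self.healthy.discard(backend)

    def mark_up(self, backend):
        if backend in self.backends:
            self.healthy.add(backend)

    def pick(self):
        # Try each backend at most once per call.
        for _ in range(len(self.backends)):
            backend = self.backends[self._next]
            self._next = (self._next + 1) % len(self.backends)
            if backend in self.healthy:
                return backend
        raise RuntimeError("no healthy backend available")


lb = RoundRobinBalancer(["tidb-1:4000", "tidb-2:4000", "tidb-3:4000"])
lb.mark_down("tidb-2:4000")
print([lb.pick() for _ in range(4)])
```

The client only ever sees the balancer, so a node can be removed or replaced without the application changing its connection address.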

Deploy across availability zones to achieve high availability

TiDB consists of a distributed SQL layer (TiDB), a distributed KV storage engine (TiKV), and the PD module that manages the whole cluster. As shown in the figure, we deploy all TiDB components across availability zones and provide a single highly available access address. One advantage of a single address is that users do not need to track multiple addresses or switch between them. Another is that the whole disaster recovery process, such as adding or removing a TiDB node or migrating one to another machine, is transparent to the business. With a unified virtual IP address, the business does not have to consider addresses at all.

Transformation of monitoring and control

TiDB uses Prometheus to collect monitoring and performance metrics, Grafana as the visualization component, and Alertmanager for the alarm mechanism, but all three are single-point deployments without disaster recovery capability. We have retrofitted all three modules for high availability. Out of the box, Grafana cannot use TiDB to store its metadata. We modified the Grafana source code, rewrote a large number of multi-schema statements, and removed the operations that reduce field sizes, so that Grafana can use TiDB as its metadata store.
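One plausible reading of "rewrote multi-schema statements" is splitting an ALTER TABLE that makes several changes at once into one statement per change. The sketch below illustrates that reading only; it is not the actual Grafana patch, the function name is hypothetical, and the parser is deliberately naive (it does not handle escaped quotes or comments).

```python
import re


def split_alter(statement):
    """Split a multi-change ALTER TABLE into single-change statements."""
    match = re.match(r"\s*ALTER\s+TABLE\s+(\S+)\s+(.*?);?\s*$",
                     statement, re.IGNORECASE | re.DOTALL)
    if not match:
        raise ValueError("not an ALTER TABLE statement")
    table, specs = match.group(1), match.group(2)

    parts, current = [], []
    depth, quote = 0, None
    for ch in specs:
        if quote:
            # Inside a quoted string or identifier: only look for its end.
            if ch == quote:
                quote = None
        elif ch in "'\"`":
            quote = ch
        elif ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        elif ch == "," and depth == 0:
            # A top-level comma separates two schema changes.
            parts.append("".join(current).strip())
            current = []
            continue
        current.append(ch)
    parts.append("".join(current).strip())
    return [f"ALTER TABLE {table} {part};" for part in parts if part]


for stmt in split_alter(
        "ALTER TABLE t ADD COLUMN a INT, ADD COLUMN b DECIMAL(10,2);"):
    print(stmt)
```

Commas inside parentheses, such as the one in DECIMAL(10,2), are kept with their clause because only top-level commas split.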

As shown in the figure, the user-facing monitoring system is on the left, with one LB in front of two Grafana nodes. Prometheus is also reached through an LB, which provides high availability across sites.

The transformation of Binlog

Consider this user scenario: TiDB data is imported into an existing big data cluster for analysis, and the logs written to Kafka need to be in JSON so that Flink can parse them when consuming. However, Binlog itself is in PB (protobuf) format, and the driver currently provided only supports txt and mysql output.

After our modification of the Binlog driver, Binlog supports JSON output and can write JSON-format logs to Kafka.
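As an illustration of what a JSON change record might look like, the sketch below serializes one row change into bytes suitable for a Kafka message value. The source does not publish the actual schema, so every field name here is an assumption, and the step that sends the bytes to Kafka is omitted.

```python
import json


def row_change_to_json(database, table, op, row, commit_ts):
    """Serialize one row change as compact UTF-8 JSON bytes.

    All field names are hypothetical; the real output format is not
    described in the article.
    """
    event = {
        "database": database,
        "table": table,
        "type": op,              # e.g. "insert", "update", "delete"
        "commit_ts": commit_ts,  # commit timestamp of the transaction
        "data": row,             # column name -> value
    }
    return json.dumps(event, sort_keys=True, separators=(",", ":")).encode("utf-8")


payload = row_change_to_json("shop", "orders", "insert",
                             {"id": 1, "amount": "9.90"}, 42)
print(payload.decode("utf-8"))
```

A consumer such as a Flink job can then decode each message with any standard JSON parser instead of needing the protobuf definitions.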

Quality improvement and Bug repair

While building the TiDB service, we also found and fixed some minor problems in native TiDB and improved product quality in the details. Many of them have since been improved or resolved in official releases, such as:

Drainer outputs statements in db.table format (fixed in 3.0)

Time zone change after TiDB upgrade to 2.1

Syncer does not handle SIGTERM during the retry phase (fixed)

Syncer can't decode the set datatype (fixed)

Drainer writes to only one partition, which leads to data skew. As a workaround, multiple Drainer instances can be started, each writing one DB.

Raft store single-threaded bottleneck (fixed in 3.0)

DDL was slow within 10 minutes of Binlog being turned on or off (fixed in 2.1.14).

There are also some differences in statement behavior compared with native MySQL, such as auto-increment IDs being allocated in segments, connections being interrupted because of the GC time, limits on transaction size (a single KV entry no larger than 6 MB, no more than 300,000 KV entries in total, and a total size of no more than 100 MB), and automatic retry on failure. After a long period of polishing and accumulated experience within UCloud, the handling of these issues has become relatively mature and stable.
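The transaction limits quoted above (6 MB per KV entry, 300,000 entries, 100 MB in total) mean that large writes have to be split into several transactions. The following sketch shows one way to batch entries by size. The function name is hypothetical, treating 1 MB as 1024 * 1024 bytes is an assumption, and a real row can produce more than one KV entry (for example for indexes), so the sizes passed in are only approximations.

```python
MAX_ENTRY_BYTES = 6 * 1024 * 1024      # single KV entry limit
MAX_ENTRIES = 300_000                  # KV entries per transaction
MAX_TOTAL_BYTES = 100 * 1024 * 1024    # total transaction size


def split_into_transactions(entry_sizes,
                            max_entries=MAX_ENTRIES,
                            max_total=MAX_TOTAL_BYTES,
                            max_entry=MAX_ENTRY_BYTES):
    """Group entry sizes into batches that each respect all three limits."""
    batches, current, current_bytes = [], [], 0
    for size in entry_sizes:
        if size > max_entry:
            raise ValueError(f"entry of {size} bytes exceeds the per-entry limit")
        over_count = len(current) + 1 > max_entries
        over_bytes = current_bytes + size > max_total
        if current and (over_count or over_bytes):
            # Close the current batch and start a new transaction.
            batches.append(current)
            current, current_bytes = [], 0
        current.append(size)
        current_bytes += size
    if current:
        batches.append(current)
    return batches


# Small limits make the behavior easy to see.
print(split_into_transactions([4, 4, 4], max_entries=2, max_total=100, max_entry=6))
```

Each returned batch would be committed as its own transaction, which keeps a bulk import within the limits at the cost of losing atomicity across batches.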

TiDB management module

The product console exposes the TiDB management module to users. It is divided into four parts: backup management, recovery tasks, user management, and Binlog synchronization. The details are as follows:

Backup management: you can choose whether to enable an automatic backup policy when creating a TiDB instance. The backup strategy includes the backup time, the number of automatic backups to retain, and the automatic backup cycle. In addition to automatic backup, manual backup is also available.

Recovery task: TiDB currently supports restoring from backup files to a new TiDB instance. Users need to prepare the new instance in advance, and the recovery work will overwrite the new instance data.

User management: TiDB provides users with corresponding rights management, including adding users and initializing permissions, adjusting user rights, deleting non-root users, and so on.

Binlog synchronization: the incremental data of TiDB can be synchronized to other storage in real time. MySQL and TiDB are currently supported as target storage.

Summary

TiDB can fairly be called a database built for the cloud. While preserving TiDB's performance, UCloud TiDB Service delivers TiDB to users as a service, which lowers the barrier to entry, simplifies management, and improves disaster recovery. In the future, UCloud will continue to work closely with PingCAP to create more possibilities for cloud databases.
