
How to analyze the TiDB of NewSQL Database

2025-04-23 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)05/31 Report--

This article analyzes TiDB, a NewSQL database, covering its background, overall architecture, and main characteristics.

TiDB is a rising star in the database field. Because it is updated and released quickly, details in this article may differ from the latest version.

Background of TiDB's birth

Today the representative RDBMSs are Oracle, MySQL and PostgreSQL. Traditional relational databases have a long history, are the senior generation of the database field, and are widely used across industries. They do have problems, however. The first is capacity: an RDBMS mostly uses local or shared storage, so as business volume keeps growing, capacity gradually becomes a bottleneck. DBAs then relieve the pressure by sharding data across multiple databases and tables. A large number of sharded databases and tables not only consumes a lot of manpower but also complicates the routing logic the business needs to reach the right database. In addition, an RDBMS scales relatively poorly: expanding or shrinking a cluster is usually costly, and distributed transactions are not supported.

The representative NoSQL databases are HBase, Redis, MongoDB, Cassandra and so on. This kind of database solves the poor scalability of an RDBMS, and expanding a cluster is much more convenient. However, because storage is mostly key-value (KV), SQL compatibility is greatly reduced, and NoSQL databases can satisfy only some of the properties of distributed transactions.

The representatives of the NewSQL field are Google's Spanner and F1, which claim to achieve disaster recovery across global data centers and to fully satisfy ACID for distributed transactions, but they can only be used on Google's cloud.

TiDB was born against this background, and it also filled the domestic gap in the NewSQL field. More than three years have passed since the first line of TiDB code was written in May 2015. It has been released dozens of times, and versions iterate very quickly. At the time of writing, the latest version is 2.0.6, and the project has more than 14,000 stars on GitHub.

The overall architecture of TiDB

TiDB can be divided into three types of nodes: PD Server, TiDB Server, and TiKV Server.

PD Server stores the cluster's metadata, assigns a global transaction ID to each transaction, and schedules and load-balances the data in the TiKV cluster.

TiDB Server receives user requests, parses them into execution plans, locates the data through PD Server, and then interacts with the TiKV Server nodes to run the query.

TiKV Server is responsible for storing cluster data.

When a client submits a request, it is forwarded through the load balancer (LB) layer to the TiDB Server cluster. PD Server assigns a global transaction ID to the transaction, and TiDB Server parses the request into a concrete execution plan, obtains the storage location of the data from the PD cluster, and runs the query by interacting with the TiKV Server nodes.
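The request flow described above can be illustrated with a minimal sketch. This is a toy model only, not TiDB code: the class names `ToyPD` and `ToyTiKV`, the function `handle_point_query`, the node names, and the key layout are all hypothetical, and a real query involves planning, transactions, and Raft replication that are left out here.

```python
import itertools
from dataclasses import dataclass, field


@dataclass
class ToyPD:
    """Hypothetical stand-in for PD Server: issues IDs and knows where keys live."""
    placement: dict  # key -> name of the TiKV node that holds it (made-up layout)
    _ids: itertools.count = field(default_factory=lambda: itertools.count(1))

    def next_txn_id(self) -> int:
        # Strictly increasing for every caller, like a global transaction ID.
        return next(self._ids)

    def locate(self, key: str) -> str:
        return self.placement[key]


@dataclass
class ToyTiKV:
    """Hypothetical stand-in for one TiKV Server: a plain in-memory KV store."""
    data: dict

    def get(self, key: str):
        return self.data.get(key)


def handle_point_query(key, pd, stores):
    """Roughly what a stateless TiDB Server does for a single-key read."""
    txn_id = pd.next_txn_id()        # 1. PD assigns a global transaction ID
    node = pd.locate(key)            # 2. ask PD where the data is stored
    value = stores[node].get(key)    # 3. read from that TiKV node
    return txn_id, node, value


pd = ToyPD(placement={"user:1": "tikv-a", "user:2": "tikv-b"})
stores = {
    "tikv-a": ToyTiKV({"user:1": "alice"}),
    "tikv-b": ToyTiKV({"user:2": "bob"}),
}
print(handle_point_query("user:2", pd, stores))  # (1, 'tikv-b', 'bob')
```

Because `handle_point_query` keeps no state of its own, any number of such handlers could sit behind the LB layer, which is the property the article attributes to TiDB Server.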

TiDB capability characteristics

Computing power: TiDB Server itself is stateless, so when computing power becomes a bottleneck, machines can simply be added, and the change is transparent to users. In theory, there is no upper limit on the number of TiDB Servers.

Storage capacity: a cluster usually has three or more TiKV Servers, and TiDB keeps 3 copies of the data by default. This resembles HDFS, except that data on TiKV Server is replicated through the Raft protocol. Data on TiKV Server is managed in units called Regions, which the PD Server cluster schedules centrally, similar to Region scheduling in HBase.
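To make the idea of Regions and 3 copies concrete, here is a small sketch, assuming that each Region covers a contiguous key range and that its copies are placed on distinct nodes. The split keys, node names, and the round-robin placement rule are invented for illustration; PD's real scheduling is far more involved.

```python
import bisect

# Hypothetical split points. Region 0 covers keys below "g", Region 1 covers
# ["g", "n"), Region 2 covers ["n", "t"), and Region 3 covers "t" and above.
SPLIT_KEYS = ["g", "n", "t"]
NODES = ["tikv-1", "tikv-2", "tikv-3", "tikv-4"]


def region_for(key: str) -> int:
    """Return the index of the Region whose key range contains `key`."""
    return bisect.bisect_right(SPLIT_KEYS, key)


def place_replicas(region_id: int, nodes, copies: int = 3):
    """Pick `copies` distinct nodes for a Region (simple round-robin rule)."""
    if copies > len(nodes):
        raise ValueError("need at least as many nodes as copies")
    return [nodes[(region_id + i) % len(nodes)] for i in range(copies)]


for key in ["apple", "grape", "zebra"]:
    rid = region_for(key)
    print(key, "-> Region", rid, "on", place_replicas(rid, NODES))
```

With four nodes and three copies, each Region skips exactly one node, so losing any single node still leaves at least two copies of every Region.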

High availability of TiDB

Every role in TiDB is highly available, and the failure of a single node does not affect the cluster as a whole. There are multiple TiDB Servers, and because they are stateless, an application whose node goes down unexpectedly can retry and connect to another node. PD Servers are usually deployed in an odd number (2n+1), and a leader is elected through the Raft protocol: when the Leader goes down, a Follower becomes the new Leader through election and carries on the work. Each TiKV node stores data as key-value pairs, which are divided into Regions by key range; each Region has two additional copies, distributed to different nodes.
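The reason for deploying 2n+1 members follows from majority voting, which Raft relies on for elections. The arithmetic can be sketched as below; the function names are made up for this example.

```python
def majority(members: int) -> int:
    """Smallest number of votes that is more than half of the group."""
    return members // 2 + 1


def tolerated_failures(members: int) -> int:
    """How many members can be down while a majority is still reachable."""
    return members - majority(members)


def can_elect_leader(members: int, alive: int) -> bool:
    return alive >= majority(members)


for size in (3, 4, 5):
    print(size, "members: majority", majority(size),
          "tolerates", tolerated_failures(size), "failure(s)")
```

A group of 3 tolerates 1 failure and a group of 5 tolerates 2, while a group of 4 still tolerates only 1. Adding an even-numbered member raises the majority without raising fault tolerance, which is why odd sizes are the usual choice.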

Compatible with MySQL

TiDB is largely compatible with MySQL, so users can switch from MySQL to TiDB transparently. The difference is that the storage behind this "new MySQL" is effectively "unlimited" and no longer bound by local disk capacity. In operations, TiDB can also be attached to a MySQL master-slave architecture as a slave.

Efficient storage scheme

As mentioned above, data in the TiKV cluster is stored as key-value pairs. In TiDB, this data is not written directly to HDD/SSD; instead, RocksDB is used to implement local storage at the TB level. The architecture of RocksDB is not covered here; interested readers can consult its documentation. The point to emphasize is that RocksDB, like HBase, uses an LSM tree as its storage structure, which avoids the large number of random reads and writes caused by leaf nodes expanding in a B+ tree and thereby improves overall throughput.

TiDB monitoring
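The LSM idea mentioned above can be sketched in a few lines: writes go to an in-memory table, which is flushed as an immutable sorted run once it fills up, so data reaches disk as sequential batches instead of in-place page updates. The class `ToyLSM` below is a hypothetical teaching model, not RocksDB's API, and it omits the write-ahead log, compaction, and deletes.

```python
import bisect


class ToyLSM:
    """Toy LSM store: a mutable memtable plus immutable sorted runs."""

    def __init__(self, memtable_limit: int = 2):
        self.memtable = {}            # recent writes, held in memory
        self.runs = []                # flushed runs, oldest first
        self.memtable_limit = memtable_limit

    def put(self, key, value):
        self.memtable[key] = value    # no existing run is ever modified
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        """Write the memtable out as one sorted, immutable run."""
        if self.memtable:
            self.runs.append(sorted(self.memtable.items()))
            self.memtable = {}

    def get(self, key):
        if key in self.memtable:
            return self.memtable[key]
        for run in reversed(self.runs):           # newest run wins
            i = bisect.bisect_left(run, (key,))   # binary search by key
            if i < len(run) and run[i][0] == key:
                return run[i][1]
        return None


db = ToyLSM(memtable_limit=2)
db.put("a", 1)
db.put("b", 2)      # memtable is full, so it is flushed as run 0
db.put("a", 3)      # newer value for "a" shadows the one in run 0
print(db.get("a"), db.get("b"), len(db.runs))  # 3 2 1
```

The trade-off is visible even in the toy: writes are cheap appends, while a read may have to check several runs, which is why real LSM engines merge runs in the background.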

TiDB uses the open source Prometheus to monitor the entire cluster. On each node, the metrics of each role are collected and pushed to Pushgateway. Pushgateway receives the data pushed by all clients, Prometheus Server pulls data from Pushgateway at regular intervals, and Grafana provides visualization and monitoring queries.

As a new-generation NewSQL database, TiDB has gradually gained a firm foothold in the database field by combining the strengths of technologies such as etcd, MySQL, HDFS, HBase and Spark. As TiDB is adopted at scale, it will gradually blur the boundary between OLTP and OLAP and simplify today's tangled ETL processes, setting off a new wave of technology. In short, TiDB has a promising future.
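The push-then-pull pattern described above can be modeled with a short sketch. `ToyPushgateway` is a hypothetical in-memory stand-in, not the real Pushgateway or any Prometheus client library, and the job names, addresses, and metric names are invented. The output only resembles the Prometheus text format.

```python
class ToyPushgateway:
    """Keeps the latest value pushed for each (job, instance, metric)."""

    def __init__(self):
        self._latest = {}

    def push(self, job: str, instance: str, metrics: dict):
        """Called by each node: overwrite that node's previous values."""
        for name, value in metrics.items():
            self._latest[(job, instance, name)] = value

    def scrape(self) -> str:
        """Called periodically by the monitoring server: return everything."""
        lines = []
        for (job, instance, name), value in sorted(self._latest.items()):
            lines.append(f'{name}{{instance="{instance}",job="{job}"}} {value}')
        return "\n".join(lines)


gateway = ToyPushgateway()
gateway.push("tikv", "10.0.0.1:20160", {"store_size_bytes": 1024})
gateway.push("tidb", "10.0.0.2:4000", {"connections": 12})
gateway.push("tidb", "10.0.0.2:4000", {"connections": 15})  # replaces 12
print(gateway.scrape())
```

The gateway decouples the two sides: nodes push whenever they have data, and the server pulls on its own schedule and always sees the most recent value per series.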

