
Introduction and overall architecture of TiDB

Updated 2025-01-18. Source: SLTechnology News&Howtos (shulou), Database


Note: the description below reads well but still needs to be verified.

Introduction to TiDB

TiDB is an open-source distributed NewSQL database designed by PingCAP and inspired by Google's Spanner and F1 papers.

TiDB has the following core NewSQL features:

SQL support (TiDB is MySQL compatible)

Horizontal elastic scaling (throughput scales linearly)

Distributed transactions

Strong data consistency across data centers

Self-recovery from failures, for high availability

High-concurrency real-time writes and real-time queries over massive data (HTAP mixed workloads)

TiDB's design goal is to cover 100% of OLTP scenarios and 80% of OLAP scenarios; more complex OLAP analysis can be handled by the TiSpark project.

TiDB is not intrusive to the business and can cleanly replace traditional sharding solutions such as database middleware and manual splitting of databases and tables. It also lets developers and operators focus on business development without worrying about the details of scaling the database, which greatly improves development productivity.

The overall architecture of TiDB

To gain an in-depth understanding of the horizontal scalability and high availability features of TiDB, you first need to understand the overall architecture of TiDB.

A TiDB cluster consists of three main components:

TiDB Server

TiDB Server receives SQL requests and handles the SQL-related logic. It asks PD for the address of the TiKV node that stores the data needed for the computation, talks to TiKV to fetch that data, and returns the result. TiDB Server is stateless: it stores no data itself and is responsible only for computing, so it can be scaled out horizontally. A load balancer such as LVS, HAProxy, or F5 can expose a unified access address in front of the instances.

PD Server

Placement Driver (PD) is the management module of the whole cluster. It has three main tasks: first, storing the cluster's metadata (which TiKV node a given Key is stored on); second, scheduling and load balancing the TiKV cluster (data migration, Raft group leader transfer, and so on); third, allocating globally unique, increasing transaction IDs.
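The third task can be illustrated with a toy allocator. This is only a sketch of the "globally unique and increasing" property: the class name `ToyIdAllocator` is hypothetical, it runs in a single process, and it does not reflect how PD actually implements or persists its allocation.

```python
import threading


class ToyIdAllocator:
    """Hypothetical single-process stand-in for PD's transaction ID allocation."""

    def __init__(self, start: int = 1) -> None:
        self._next = start
        self._lock = threading.Lock()  # serialize callers so no ID is handed out twice

    def allocate(self) -> int:
        """Return an ID that is unique and larger than every ID returned before."""
        with self._lock:
            value = self._next
            self._next += 1
            return value


allocator = ToyIdAllocator()
first = allocator.allocate()
second = allocator.allocate()
print(first, second, second > first)
```

Because every caller goes through one allocator, any two transactions can be ordered simply by comparing their IDs.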

PD runs as a cluster and should be deployed with an odd number of nodes. At least 3 nodes are generally recommended in production.
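The text does not spell out why an odd count is preferred. A likely reason, assuming the usual majority rule of Raft-style consensus, is that an even node count adds cost without adding fault tolerance. The helper names below are illustrative only.

```python
def majority(n: int) -> int:
    """Smallest number of nodes that forms a majority of an n-node cluster."""
    return n // 2 + 1


def tolerated_failures(n: int) -> int:
    """How many nodes can fail while a majority is still alive."""
    return n - majority(n)


for n in (3, 4, 5):
    print(f"nodes={n} majority={majority(n)} tolerated_failures={tolerated_failures(n)}")
```

Under this rule a 4-node cluster tolerates the same single failure as a 3-node cluster, while 5 nodes tolerate two.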

TiKV Server

TiKV Server is responsible for storing data. Seen from the outside, TiKV is a distributed, transactional Key-Value storage engine. The basic unit of data storage is the Region. Each Region stores the data of one Key Range (the left-closed, right-open interval from StartKey to EndKey), and each TiKV node hosts multiple Regions. TiKV replicates data with the Raft protocol to keep it consistent and to provide disaster recovery. Replicas are managed per Region: the replicas of one Region, placed on different nodes, form a Raft Group. PD schedules the load balancing of data across TiKV nodes, also on a per-Region basis.
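The left-closed, right-open Key Range can be sketched as a lookup over Regions sorted by StartKey. This is a minimal illustration, not TiKV's or PD's actual routing code: the `Region` class and `find_region` function are hypothetical, and using an empty EndKey to mean "no upper bound" is an assumption of this sketch.

```python
import bisect
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    region_id: int
    start_key: bytes  # inclusive
    end_key: bytes    # exclusive; b"" means "no upper bound" (assumption of this sketch)


def find_region(regions: list, key: bytes) -> Region:
    """Return the Region whose [start_key, end_key) interval contains key.

    `regions` must be sorted by start_key.
    """
    starts = [r.start_key for r in regions]
    # Index of the last Region whose start_key <= key.
    idx = bisect.bisect_right(starts, key) - 1
    if idx < 0:
        raise KeyError(key)
    region = regions[idx]
    if region.end_key and key >= region.end_key:
        raise KeyError(key)  # key falls into a gap between Regions
    return region


regions = [
    Region(1, b"", b"g"),
    Region(2, b"g", b"p"),
    Region(3, b"p", b""),
]
print(find_region(regions, b"apple").region_id)
print(find_region(regions, b"g").region_id)
```

The key `b"g"` lands in Region 2 rather than Region 1, because Region 1's EndKey is exclusive and Region 2's StartKey is inclusive.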

Core features

Horizontal scaling

Horizontal scaling is a major feature of TiDB. It covers two aspects: computing power and storage capacity. TiDB Server handles SQL requests, so as the business grows you can simply add TiDB Server nodes to raise overall processing capacity and throughput. TiKV stores the data, so as the data volume grows you can deploy more TiKV Server nodes to add storage capacity. PD then schedules the TiKV nodes per Region and migrates part of the data to the newly added nodes. In the early days of a business you can therefore deploy only a small number of instances (at least 3 TiKV, 3 PD, and 2 TiDB are recommended) and add TiKV or TiDB instances as the business volume grows.
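The per-Region migration to new nodes can be sketched as a toy balancer. It is a deliberately simplified assumption, not PD's scheduler: it balances only by Region count, ignores replica placement rules, data size, and load, and the store names are made up.

```python
def rebalance(stores: dict) -> list:
    """Move Regions one at a time from the fullest store to the emptiest.

    `stores` maps a store name to the list of Region IDs it holds and is
    modified in place. Returns the moves as (region_id, source, target).
    Stops once Region counts differ by at most one.
    """
    moves = []
    while True:
        fullest = max(stores, key=lambda s: len(stores[s]))
        emptiest = min(stores, key=lambda s: len(stores[s]))
        if len(stores[fullest]) - len(stores[emptiest]) <= 1:
            return moves
        region_id = stores[fullest].pop()
        stores[emptiest].append(region_id)
        moves.append((region_id, fullest, emptiest))


# Three existing TiKV stores plus one newly added, empty store.
stores = {
    "tikv-1": [1, 2, 3],
    "tikv-2": [4, 5, 6],
    "tikv-3": [7, 8, 9],
    "tikv-4": [],
}
for region_id, source, target in rebalance(stores):
    print(f"move region {region_id}: {source} -> {target}")
print({name: len(ids) for name, ids in stores.items()})
```

Only a small fraction of the Regions has to move when a node is added, which is why scaling out does not require reshuffling the whole data set.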

High availability

High availability is another feature of TiDB. All three components, TiDB, TiKV, and PD, can tolerate the failure of some instances without affecting the availability of the cluster as a whole. The availability of each component, the consequences of a single instance failure, and how to recover are described below.

TiDB

TiDB is stateless. It is recommended to deploy at least two instances behind a load balancer that serves client traffic. When a single instance fails, the Sessions currently running on that instance are affected. From the application's point of view, a single request fails; after reconnecting, the application can continue to be served. After a single instance fails, you can restart it or deploy a new instance.
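From the application side, "a single request fails, then reconnect" might look like the sketch below. The function `run_with_reconnect` and its parameters are hypothetical, and `connect` and `query` stand in for whatever MySQL-compatible driver the application uses. Blindly retrying is only safe for requests that are idempotent or known not to have committed, which this sketch does not check.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def run_with_reconnect(
    connect: Callable[[], object],
    query: Callable[[object], T],
    attempts: int = 3,
    delay_seconds: float = 0.0,
) -> T:
    """Open a fresh connection and run the query, retrying on connection errors."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for _ in range(attempts):
        try:
            conn = connect()  # the load balancer may route this to a healthy instance
            return query(conn)
        except ConnectionError as exc:
            last_error = exc
            time.sleep(delay_seconds)
    raise last_error


# Simulated driver: the first request hits a failed instance, the second succeeds.
calls = {"count": 0}


def fake_connect() -> str:
    return "connection"


def fake_query(conn: object) -> int:
    calls["count"] += 1
    if calls["count"] == 1:
        raise ConnectionError("instance went away")
    return 42


print(run_with_reconnect(fake_connect, fake_query))
```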

PD

PD is a cluster that maintains data consistency through the Raft protocol. When a single instance fails and that instance is not the Raft leader, the service is not affected at all. If the instance is the Raft leader, a new leader is elected and the service recovers automatically. PD cannot serve requests during the election, which takes about three seconds. It is recommended to deploy at least three PD instances. After a single instance fails, restart it or add a new instance.

TiKV

TiKV is a cluster that maintains data consistency through the Raft protocol (the number of replicas is configurable; three replicas are kept by default), with load balancing scheduled by PD. When a single node fails, every Region stored on that node is affected. For Regions whose Leader was on the failed node, service is interrupted until a new Leader is elected; for Regions that only had a Follower on the failed node, service is not affected. When a TiKV node fails and does not recover within a certain period (the default is 10 minutes), PD migrates the data on it to other TiKV nodes.
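The difference between losing a Leader and losing a Follower can be made concrete with a small classification sketch. The `RegionPlacement` type, the store names, and the placement data are all hypothetical, and the sketch does not model the election itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RegionPlacement:
    region_id: int
    leader: str        # store holding the Raft leader replica
    followers: tuple   # stores holding the follower replicas


def classify_failure(placements: list, failed_store: str) -> dict:
    """Group Region IDs by the role the failed store played for each Region."""
    leader_lost, follower_lost = [], []
    for placement in placements:
        if placement.leader == failed_store:
            leader_lost.append(placement.region_id)    # must wait for re-election
        elif failed_store in placement.followers:
            follower_lost.append(placement.region_id)  # keeps serving
    return {"leader_lost": leader_lost, "follower_lost": follower_lost}


placements = [
    RegionPlacement(1, "tikv-1", ("tikv-2", "tikv-3")),
    RegionPlacement(2, "tikv-2", ("tikv-1", "tikv-3")),
    RegionPlacement(3, "tikv-3", ("tikv-2", "tikv-4")),
]
print(classify_failure(placements, "tikv-1"))
```

With three replicas per Region, either kind of affected Region still has two of its three replicas alive, which is a majority, so a new Leader can be elected where one is needed.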

-- end [Tony.Tang] 2018.3.8
