Tencent Technical Engineering | Tencent's only time series database: CTSDB demystified

Background: with the rapid growth of the Internet, big data and the Internet of Things (IoT), more and more of the data in our daily life and work is tied to time: real-time step counts in WeChat, the daily closing price of a stock, the device status of shared bikes, and so on. To store this time-related data and embrace the IoT era, major companies have launched their own time series databases. This article briefly introduces the basic concepts and application scenarios of time series databases, and then Tencent's time series database, CTSDB.

What is a time series database

1. Time series data

1.1 What is time series data?

Before introducing time series databases, we should first understand the concept of "time series data": data that records the state changes of systems and devices in time order. It is ubiquitous in IT infrastructure, operations monitoring systems and the Internet of Things.

Time series data connects isolated observations into a line along the time dimension, revealing how the state of a software or hardware system changes. A single isolated observation cannot be called time series data, but once a large number of observations are strung together on a timeline, we can study their trends and patterns. This matters in two ways:

(1) Looking back along the timeline, time series data can be turned into reports that show how the data changes and help capture anomalies. Here are two examples:

The figure below shows the number of shared bikes borrowed and returned per hour in a busy area of San Francisco. By analyzing the historical bike counts for the area, a bike company can tell whether a hot spot needs to be restocked during peak borrowing periods.

The following figure shows the history of inbound and outbound traffic of an Internet service. The inbound traffic (blue line) clearly spikes during a certain period, so the service provider can check the service for anomalies in that period. Traffic monitoring can also be used to trigger alerts, so that operations staff can deal with production problems in time.

(2) Looking forward along the timeline, time series data can be used to build mathematical models, perform statistical analysis, and predict how things will develop.

For example, the following figure shows the population figures and projections released by the United Nations in 2015 after analyzing past population growth trends. It shows that Africa's population will continue to grow, which makes it a market no multinational company should ignore and also signals major challenges for local governments.

1.2 Mathematical model of time series data

The above introduces the basic concept of time series data and explains the significance of analyzing time series data. So how should time series data be stored? The storage of data should consider its mathematical model and characteristics, and time series data is no exception. So here we first introduce the mathematical model and characteristics of time series data.

The following figure shows a sample of time series data that records the inbound and outbound traffic of each port on each machine in a cluster, with one observation every half hour. We use this data to introduce the mathematical model of time series data. (The names of the basic concepts differ between time series databases; the CTSDB terms are used here.)

Metric: a measured dataset, similar to a table in a relational database

Point: a single data point, similar to a row in a relational database

Timestamp: the time at which the data point was collected

Tag: a dimension column. It describes what the data belongs to (which device or module produced it), generally does not change over time, and can be used as a query condition.

Field: an indicator column. It holds the measured value, usually changes smoothly over time, and generally is not used as a query condition.

As shown in the figure above, the metric of this group of data is Network, and each point consists of the following parts:

Timestamp: the timestamp

Two tags, host and port, which identify the machine and the port each point belongs to

Two fields, bytes_in and bytes_out, which hold the measured values of the point: the average inbound and outbound traffic over the half hour

The same host and port produce one point every half hour, and the fields (bytes_in, bytes_out) keep changing as time passes. For example, for host:host4, port:51514, between 02:00 and 02:30 bytes_in rose from 37.937 to 38.089 and bytes_out rose from 2897.26 to 3009.86, indicating that the load on this port increased during this period.
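The model above can be sketched in a few lines of Python. This is only an illustration of the concepts: the `Point` class and `series_key` method are hypothetical names, not part of CTSDB, and the pairing of the second value range with bytes_out is read from context.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Point:
    """One observation in a metric (hypothetical type, for illustration only)."""
    metric: str               # like a table name
    timestamp: str            # when the observation was taken
    tags: Dict[str, str]      # dimension columns: identify the source, rarely change
    fields: Dict[str, float]  # indicator columns: the measured values

    def series_key(self) -> str:
        """Points with the same metric and tags belong to the same time series."""
        tag_part = ",".join(f"{k}={v}" for k, v in sorted(self.tags.items()))
        return f"{self.metric}|{tag_part}"


# Two consecutive points for host4, port 51514, using the values quoted above.
p1 = Point("Network", "02:00", {"host": "host4", "port": "51514"},
           {"bytes_in": 37.937, "bytes_out": 2897.26})
p2 = Point("Network", "02:30", {"host": "host4", "port": "51514"},
           {"bytes_in": 38.089, "bytes_out": 3009.86})

print(p1.series_key())                     # Network|host=host4,port=51514
print(p1.series_key() == p2.series_key())  # True: same series, different time
```

The tags repeat unchanged from point to point while only the timestamp and fields move, which is exactly the redundancy a time series database can exploit.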

1.3 Characteristics of time series data

Data pattern: time series data grows as time passes, the values of the same dimensions repeat, and the indicators change smoothly. All of this can be seen in the Network data above.

Writes: continuous, highly concurrent writes with no updates. A time series database often faces real-time writes from millions or even tens of millions of terminal devices (for example, Mobike had 10 million bikes nationwide in 2017), but most of the data describes device status and is never updated after it is written.

Query: indicators are statistically analyzed along different dimensions, and there is a clear split between hot and cold data: generally only recent data is queried frequently.

2. Time series database

Once you have time series data, where should it be stored? First, let's look at the problems traditional solutions run into when storing time series data.

2.1 Traditional solutions

Time series data is often generated by millions or even tens of millions of terminal devices, and write concurrency is high, so this is a massive-data scenario. There are two main traditional solutions for time series data: relational databases (MySQL) and the Hadoop ecosystem.

MySQL: it has the following problems in massive time series data scenarios

High storage cost: time series data compresses poorly and takes up a lot of machine resources

High maintenance cost: it is a single-machine system, so databases and tables must be sharded manually at the application layer

Low write throughput: single-machine write throughput is low and can hardly keep up with writes from tens of millions of time series sources

Poor query performance: it is designed for transaction processing, and aggregate analysis over massive data is slow.

Hadoop ecosystem (Hadoop, Spark, etc.)

High data latency: these are offline batch processing systems, and it takes hours or even days for data to go from generation to analysis

Poor query performance: queries cannot make good use of indexes and rely on MapReduce jobs, so query time is generally at the minute level.

2.2 Time series database

A time series database is a specialized database for managing time series data. Its write, storage and query paths are all optimized around the characteristics of time series data described above.

1) Storage cost:

Exploit the facts that time increases, dimensions repeat and indicators change smoothly to choose suitable encoding and compression algorithms and improve the compression ratio.

Pre-downsample historical data, aggregating it to a lower precision to save storage space.
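The two storage ideas above can be illustrated with a small sketch. It assumes integer timestamps in seconds and uses plain delta encoding and averaging; the function names are hypothetical, and the source does not say which encodings or rollup functions CTSDB actually uses.

```python
from typing import Dict, List, Tuple


def delta_encode(timestamps: List[int]) -> List[int]:
    """Keep the first timestamp, then only the gap to the previous one."""
    if not timestamps:
        return []
    return [timestamps[0]] + [b - a for a, b in zip(timestamps, timestamps[1:])]


def delta_decode(deltas: List[int]) -> List[int]:
    """Invert delta_encode by running a cumulative sum."""
    out: List[int] = []
    total = 0
    for d in deltas:
        total += d
        out.append(total)
    return out


def downsample_avg(points: List[Tuple[int, float]], bucket_seconds: int) -> Dict[int, float]:
    """Average raw (timestamp, value) points into fixed-width time buckets."""
    buckets: Dict[int, List[float]] = {}
    for ts, value in points:
        bucket_start = ts - ts % bucket_seconds
        buckets.setdefault(bucket_start, []).append(value)
    return {start: sum(vals) / len(vals) for start, vals in sorted(buckets.items())}


ts = [1000, 1010, 1020, 1030]
print(delta_encode(ts))  # [1000, 10, 10, 10]

raw = [(0, 1.0), (10, 3.0), (60, 5.0), (70, 7.0)]
print(downsample_avg(raw, 60))  # {0: 2.0, 60: 6.0}
```

Regularly spaced timestamps collapse into a run of identical small numbers, which a general-purpose compressor then shrinks easily, and the rollup replaces four raw points with two.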

2) Highly concurrent writes:

Write data in batches to reduce network overhead

Write data to memory first, then periodically dump it to immutable files.
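A minimal sketch of this write path, under stated assumptions: the `WriteBuffer` class is hypothetical, a size threshold stands in for the periodic timer, and an in-memory tuple stands in for an immutable file on disk.

```python
from typing import List, Tuple

PointRow = Tuple[int, str, float]  # (timestamp, series key, value); hypothetical layout


class WriteBuffer:
    """Buffers writes in memory and seals them into immutable segments."""

    def __init__(self, flush_threshold: int) -> None:
        self.flush_threshold = flush_threshold
        self.memtable: List[PointRow] = []
        self.segments: List[Tuple[PointRow, ...]] = []  # stand-in for on-disk files

    def write_batch(self, batch: List[PointRow]) -> None:
        """Accept a whole batch per call, which amortizes per-request overhead."""
        self.memtable.extend(batch)
        if len(self.memtable) >= self.flush_threshold:
            self.flush()

    def flush(self) -> None:
        """Sort buffered rows by time and seal them as a read-only tuple."""
        if self.memtable:
            self.segments.append(tuple(sorted(self.memtable)))
            self.memtable = []


buf = WriteBuffer(flush_threshold=3)
buf.write_batch([(20, "host1", 0.5), (10, "host1", 0.4)])
buf.write_batch([(30, "host1", 0.6)])
print(len(buf.segments), len(buf.memtable))  # 1 0
```

Because time series data is not updated after it is written, sealed segments never need to be modified in place, which is what makes this append-and-seal design a natural fit.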

3) Low query latency and high query concurrency:

Optimize for common query patterns and reduce query latency through indexing and other techniques

Improve query concurrency through caching, routing and other techniques.
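One common indexing technique for tag-based queries is an inverted index from tag values to series. The sketch below is a generic illustration with a hypothetical `TagIndex` class; the source does not describe the index structure CTSDB uses.

```python
from collections import defaultdict
from typing import Dict, Set, Tuple


class TagIndex:
    """Maps each (tag name, tag value) pair to the set of series that carry it."""

    def __init__(self) -> None:
        self.postings: Dict[Tuple[str, str], Set[int]] = defaultdict(set)

    def add_series(self, series_id: int, tags: Dict[str, str]) -> None:
        for pair in tags.items():
            self.postings[pair].add(series_id)

    def lookup(self, **conditions: str) -> Set[int]:
        """Return the series matching every tag=value condition."""
        if not conditions:
            return set()
        sets = [self.postings.get(pair, set()) for pair in conditions.items()]
        return set.intersection(*sets)


index = TagIndex()
index.add_series(1, {"host": "host4", "port": "51514"})
index.add_series(2, {"host": "host4", "port": "80"})
index.add_series(3, {"host": "host5", "port": "80"})
print(index.lookup(host="host4", port="80"))  # {2}
```

A query that filters on tags only has to intersect a few small sets instead of scanning every point, which is why tags are the natural query conditions in the model above.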

2.3 Comparison of open source time series databases

The popular open source time series databases in the industry today include InfluxDB, OpenTSDB, Prometheus and Graphite. Their product features are compared below:

As can be seen from the above table, the open source time series databases have the following problems:

No free, easy-to-use distributed version (OpenTSDB supports distributed deployment, but it depends on too many other systems and is costly to maintain)

Aggregation capability is generally weak, even though most time series data needs statistical analysis

No free permission management

No multi-dimensional comparative analysis tools for time series.

CTSDB

Tencent CTSDB (Cloud Time Series Database) is a distributed, high-performance, multi-shard, self-balancing time series database. It is optimized for the highly concurrent writes and clear hot/cold data split of time series workloads and for IoT scenarios, and it also supports log parsing and storage for various industries. Its architecture is shown in the following figure.

1. Main features of CTSDB

1) High performance: (specific performance data is given later)

Supports batch writes and highly concurrent queries

System performance can be scaled linearly at any time by expanding the cluster

Supports sharding and routing to speed up queries.
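To show why routing speeds up queries, here is a generic sketch of hash-based routing. The CRC32 hash and both function names are assumptions made for illustration; the source does not describe how CTSDB maps routing keys to shards.

```python
import zlib
from typing import List, Optional


def shard_for(routing_key: str, num_shards: int) -> int:
    """Map a routing key to a shard with a stable hash (CRC32 is an assumption)."""
    return zlib.crc32(routing_key.encode("utf-8")) % num_shards


def shards_to_query(routing_key: Optional[str], num_shards: int) -> List[int]:
    """With a routing key only one shard is searched; without it, all of them."""
    if routing_key is None:
        return list(range(num_shards))
    return [shard_for(routing_key, num_shards)]


print(len(shards_to_query("host4", 4)))  # 1
print(len(shards_to_query(None, 4)))     # 4
```

If all points of one host are written with the host name as the routing key, a query for that host touches a single shard rather than fanning out to the whole cluster.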

2) High reliability:

Supports multiple replicas

Rack-aware: primary and replica copies are automatically placed on different racks.

3) Easy to use:

Rich data types and a REST interface; both writes and queries use JSON

Natively distributed and elastically scalable, with automatic data balancing

4) Low cost:

Columnar storage with a high compression ratio (about 0.1) reduces storage cost

Pre-downsampling of data reduces storage cost while improving query performance

The number of replicas can be adjusted as needed.

5) Strong aggregation:

Common aggregations such as max, min, avg, percentile, sum, count and group by

Complex script aggregation (for example, aggregating the result of a calculation across multiple fields)

Time interval aggregation, GEO aggregation and nested aggregation.

6) Highlights:

Data monitoring and alerting: monitors data volume, field statistics, baseline comparisons and more on the stored data, and sends alerts through WeChat, SMS and email

Permission system: supports user names, passwords and machine whitelists

Data expiry: supports deleting expired data

Data export.

2. Performance comparison with a competing product

InfluxDB, one of the more popular products in the industry, is used here for a performance comparison with CTSDB.

2.1 Test scenario

CTSDB versus InfluxDB: both are deployed on a single node with 24 CPU cores, 128 GB of memory, a 10-gigabit network card, and SSDs in RAID 0.

CTSDB single-node cluster versus two-node cluster: to verify the linear scalability of CTSDB.

2.2 Write performance test

Sample data:

The imported data is generated by InfluxDB's official testing tool, https://github.com/influxdata/influxdb-comparisons.

The data is time series data for a number of hosts. Each point contains 10 tags (all of string type), 10 fields (all of float type) and a timestamp, with one point per host every 10 seconds.

The example is as follows:

Test results:

(1) Write performance of a CTSDB single-node cluster versus stand-alone InfluxDB

Horizontal axis: concurrency (number of writing threads); vertical axis: QPS (unit: 10,000/s)

Conclusion: the peak write performance of a single CTSDB node is about 19w (190,000) QPS, versus about 15w (150,000) QPS for InfluxDB.

(2) Write performance of a CTSDB single-node cluster versus a CTSDB two-node cluster

Horizontal axis: concurrency (number of writing threads); vertical axis: QPS (unit: 10,000/s)

Conclusion: the peak write performance of the CTSDB single-node cluster is 20w (200,000) QPS, and that of the two-node cluster is 34w (340,000) QPS.
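As a quick check on what these two figures imply for scalability, the arithmetic can be written out. The `scaling_efficiency` helper is a hypothetical name, and the only inputs are the 20w and 34w values quoted above.

```python
def scaling_efficiency(single_node_qps: float, cluster_qps: float, nodes: int) -> float:
    """Fraction of ideal linear speedup achieved by the cluster."""
    return cluster_qps / (single_node_qps * nodes)


single = 20  # single-node peak, in units of 10,000 QPS
dual = 34    # two-node peak, in units of 10,000 QPS

speedup = dual / single
efficiency = scaling_efficiency(single, dual, nodes=2)
print(f"speedup: {speedup:.2f}x, efficiency: {efficiency:.0%}")  # speedup: 1.70x, efficiency: 85%
```

In other words, the second node adds about 1.7 times the single-node write throughput, or roughly 85% of perfectly linear scaling.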

2.3 Query performance test

Sample query:

Take the CTSDB query statement as an example:

Explanation of the query:

Select all the data for one host, filter it down to one hour, group it into buckets at minute granularity (a group by, giving 60 buckets in the final result), and for each bucket output the maximum value of the usage_user field across all the data in that bucket.

Note that the query here uses CTSDB's routing feature to speed up the query.
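The semantics of this query can be restated in plain Python. This is not CTSDB query syntax: the row layout, the function name and the host name host_0 are hypothetical, and timestamps are assumed to be Unix seconds.

```python
from typing import Dict, Iterable, Tuple

Row = Tuple[str, int, float]  # (host, unix timestamp in seconds, usage_user)


def max_usage_per_minute(rows: Iterable[Row], host: str,
                         start: int, end: int) -> Dict[int, float]:
    """Filter one host to [start, end), bucket by minute, return max per bucket."""
    buckets: Dict[int, float] = {}
    for row_host, ts, usage_user in rows:
        if row_host != host or not (start <= ts < end):
            continue
        minute = ts - ts % 60  # start of the one-minute bucket
        if minute not in buckets or usage_user > buckets[minute]:
            buckets[minute] = usage_user
    return dict(sorted(buckets.items()))


rows = [
    ("host_0", 0, 10.0),
    ("host_0", 10, 30.0),
    ("host_0", 60, 20.0),
    ("host_1", 0, 99.0),     # other host: filtered out
    ("host_0", 3600, 50.0),  # outside the one-hour window: filtered out
]
print(max_usage_per_minute(rows, "host_0", 0, 3600))  # {0: 30.0, 60: 20.0}
```

With one point per host every 10 seconds, a full hour of data yields 60 buckets of six points each, matching the 60 buckets described above.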

Sample query results:

Test results:

(1) Query performance of a CTSDB single-node cluster versus stand-alone InfluxDB

Horizontal axis: concurrency (number of query threads); vertical axis: QPS (unit: queries/s)

Conclusion: the overall query performance of CTSDB is much better than that of InfluxDB. At high concurrency (40 threads), the query performance of CTSDB is nearly 4 times higher than that of InfluxDB, at about 2w (20,000) QPS. When the number of concurrent threads reaches 50, InfluxDB reports connection errors and rejects query requests, while CTSDB continues to serve queries normally.

(2) Query performance of a CTSDB single-node cluster versus a two-node cluster

Horizontal axis: concurrency (number of query threads); vertical axis: QPS (unit: queries/s)

Conclusion: under high concurrency, the query performance of the two-node cluster is much better than that of the single-node cluster, showing that query performance scales roughly linearly.

About us

Our current situation

As Tencent's only time series database, CTSDB supports more than 20 core businesses within Tencent (WeChat × ×, Tenpay, Cloud Monitoring, Cloud Database, Cloud Load, etc.). Among them, the cloud monitoring system records the real-time status of all kinds of software and hardware systems at Tencent, and CTSDB stores all of its data. It runs stably under a write load of 10 million data points per second and more than 20 TB of data per day, which shows that CTSDB can reliably support massive-data IoT scenarios.

Our future

CTSDB will be officially launched on Tencent Cloud to provide technical services for the IoT industry. We will keep optimizing CTSDB to reduce storage costs, improve ease of use and enrich its features. Anyone interested in time series databases and distributed storage is welcome to join us!
