OpenTSDB Official Documentation (Translated): Rollups and Pre-Aggregation


Although OpenTSDB is designed to store raw, full-resolution data for as long as space is available, queries over a wide time range or across many tag combinations can be painful. Such a query may take a long time to complete or, in the worst case, end in an out-of-memory exception. Starting with OpenTSDB 2.4, a new set of APIs allows lower-resolution data to be stored and queried so that such queries can be answered much faster. This page gives an overview of what rollups and pre-aggregates are, how they work in TSDB, and how best to use them. See the API section for implementation details.

Note:

OpenTSDB itself does not calculate and store rollup or pre-aggregated data. There are many ways to compute the results, each with advantages and disadvantages depending on scale and accuracy requirements. See the Generating rollups and pre-aggregates section below for a discussion of how to create this data.

Sample data

To help describe lower-resolution data, let's look at some full-resolution sample data (also known as raw data). The first table defines the time series with shorthand identifiers.

Series ID | Metric              | Tag 1      | Tag 2    | Tag 3
ts1       | system.if.bytes.out | host=web01 | colo=lga | interface=eth0
ts2       | system.if.bytes.out | host=web02 | colo=lga | interface=eth0
ts3       | system.if.bytes.out | host=web03 | colo=sjc | interface=eth0
ts4       | system.if.bytes.out | host=web04 | colo=sjc | interface=eth0

Notice that they all share the same metric and the same interface tag, but have different host and colo tags.

Next, some data written at 15-minute intervals:

Series ID | 12:00 | 12:15 | 12:30 | 12:45 | 13:00 | 13:15 | 13:30 | 13:45
ts1       |     1 |     4 |    -3 |     8 |     2 |    -4 |     5 |     2
ts2       |     7 |     2 |     8 |    -9 |     4 |       |     1 |     1
ts3       |     9 |     3 |    -2 |    -1 |     6 |     3 |     8 |     2
ts4       |       |     2 |     5 |     2 |     8 |     5 |    -4 |     7

Note that some data points are missing. With this data set, let's first take a look at rollups.

Rollups

In OpenTSDB, a "rollup" is defined as a single time series aggregated over time. It may also be called a "time-based aggregation." Rollups help solve the problem of querying wide time spans. For example, if you write a data point every 60 seconds and query for a year of data, a single time series returns more than 525,000 individual data points. Plotting that many points makes for a messy graph. Instead, you may want to view the data at a lower resolution, e.g. 1-hour resolution, where only about 8,760 values (roughly 8k) need to be plotted. You can then identify anomalies and drill down to finer-resolution data.

If you have used OpenTSDB to query data, you may be familiar with downsamplers, which aggregate each time series into smaller, lower-resolution values. A rollup is essentially the result of a downsample that is stored in the system and can be called up at any time. Each rollup (or downsample) requires two pieces of information:

- Interval: how much time is "rolled up" into the new value. For example, 1 hour of data or 1 day of data.
- Aggregation function: what arithmetic is performed on the underlying values to arrive at the new value. For example, sum: add all of the values, or max: store the largest value.
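
To make this concrete, here is a minimal sketch in Python (not part of OpenTSDB) of computing a 1-hour SUM and COUNT rollup from raw points; the function and data layout are hypothetical:

    from collections import defaultdict

    def rollup(points, interval=3600):
        """Roll up (timestamp, value) pairs into per-interval SUM and COUNT."""
        buckets = defaultdict(lambda: {"sum": 0.0, "count": 0})
        for ts, value in points:
            start = ts - ts % interval  # snap to the top of the interval
            buckets[start]["sum"] += value
            buckets[start]["count"] += 1
        return dict(buckets)

    # ts1 from the sample data, written at 15-minute steps from 12:00
    base = 1356998400  # an arbitrary top-of-hour Unix timestamp (UTC)
    ts1 = [(base + i * 900, v) for i, v in enumerate([1, 4, -3, 8, 2, -4, 5, 2])]
    print(rollup(ts1))  # {base: sum 10, count 4; base + 3600: sum 5, count 4}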

Warning

When storing rollups, it is best to avoid functions such as average, median, or deviation. These values become meaningless under further downsampling or group-by aggregation. Instead, it is much better to always store the sum and count; at the least, the average can then be computed accurately at query time. For more information, see the section below.

The timestamp of a rollup data point should snap to the top of the rollup interval. For example, if the rollup interval is 1 hour, it contains 1 hour of data and should snap to the top of the hour. Because all timestamps are written in Unix epoch format and defined in the UTC time zone, this is the start of an hour in UTC.
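
Snapping an epoch timestamp to the top of the hour is a single modulo operation, as in this small sketch (the sample timestamp is arbitrary):

    import datetime

    ts = 1357003500  # 2013-01-01 01:25:00 UTC
    snapped = ts - ts % 3600
    print(datetime.datetime.fromtimestamp(snapped, datetime.timezone.utc))
    # 2013-01-01 01:00:00+00:00, the top of the hour in UTC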

Rollup example

For the data given above, let's store the sum and count using an interval of 1 hour:

Series ID | 12:00 | 13:00
ts1 SUM   |    10 |     5
ts1 COUNT |     4 |     4
ts2 SUM   |     8 |     6
ts2 COUNT |     4 |     3
ts3 SUM   |     9 |    19
ts3 COUNT |     4 |     4
ts4 SUM   |     9 |    16
ts4 COUNT |     3 |     4

Note that every timestamp is aligned to the top of the hour, no matter when the first data point in the interval "bucket" appears. Also note that when data points are missing from an interval, the count is lower.

In general, when storing rollups, the goal should be to compute and store the MAX, MIN, SUM, and COUNT of every time series.

Rollup average example

When rollups are enabled and a downsampler using OpenTSDB's avg function is requested, the TSD scans storage for the SUM and COUNT values. Then, while iterating over the data, it computes the average accurately.

The timestamps of the COUNT and SUM values must match. If the expected COUNT value for a SUM is missing, the SUM is kicked out of the result. Building on the example above, say we have lost the 13:00 COUNT data point for ts2:

Series ID | 12:00 | 13:00
ts1 SUM   |    10 |     5
ts1 COUNT |     4 |     4
ts2 SUM   |     8 |     6
ts2 COUNT |     4 |

The result of a 2-hour downsampled avg query would then look like:

Series ID | 12:00
ts1 AVG   | 1.875
ts2 AVG   | 2

ts1: (10 + 5) / 8 = 1.875
ts2: 8 / 4 = 2
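
A minimal sketch of this pairing logic (a hypothetical helper, not OpenTSDB's internal code):

    def rollup_avg(sums, counts):
        """Average from SUM/COUNT rollups; SUMs with no matching COUNT are kicked out."""
        matched = [ts for ts in sums if ts in counts]
        total = sum(sums[ts] for ts in matched)
        n = sum(counts[ts] for ts in matched)
        return total / n if n else None

    print(rollup_avg({0: 10, 3600: 5}, {0: 4, 3600: 4}))  # (10 + 5) / 8 = 1.875
    print(rollup_avg({0: 8, 3600: 6}, {0: 4}))            # 13:00 SUM dropped: 8 / 4 = 2.0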

Pre-aggregation

While rollups help with wide time span queries, you can still run into query performance problems at small time ranges if the metric has high cardinality (i.e., a large number of unique time series for the metric). In the example above we have four web servers, but suppose we had 10,000. Fetching the sum or average of interface traffic could be quite slow. If users frequently fetch large group-by aggregates (or, as some call them, spatial aggregates), it makes sense to store the aggregate and query that instead, fetching much less data.

Unlike a rollup, a pre-aggregate requires only one extra piece of information:

- Aggregation function: what arithmetic is performed on the underlying values to arrive at the new value. For example, sum: add all of the values, or max: store the largest value.

In OpenTSDB, pre-aggregates are distinguished from other time series by a special tag. By default the tag key is _aggregate (configurable via tsd.rollups.agg_tag_key). The aggregation function used to generate the data is stored in the tag value, in uppercase. Let's look at an example:

Pre-aggregation example

Given the example series at the top, suppose we want to look at total network traffic by colo (data center). In that case we can aggregate by SUM and COUNT, similar to a rollup. The result is four new time series with metadata such as:

Series ID | Metric              | Tag 1    | Tag 2
ts1'      | system.if.bytes.out | colo=lga | _aggregate=SUM
ts2'      | system.if.bytes.out | colo=lga | _aggregate=COUNT
ts3'      | system.if.bytes.out | colo=sjc | _aggregate=SUM
ts4'      | system.if.bytes.out | colo=sjc | _aggregate=COUNT

Note that these time series have dropped the host and interface tags. Because multiple different values of host and interface were wrapped up into the new series during aggregation, those tags no longer make sense. Also note the new _aggregate tag injected into the stored data. Queries can now access this data by specifying an _aggregate value.

Note:

"with    enabled for rollup, if you plan to use preaggregation, you may need to help distinguish the raw data in the preaggregation by letting TSDB automatically inject _ aggregate=RAW." Simply set the tsd.rollups.tag_raw property to true.

The resulting data looks like this:

Series ID | 12:00 | 12:15 | 12:30 | 12:45 | 13:00 | 13:15 | 13:30 | 13:45
ts1'      |     8 |     6 |     5 |    -1 |     6 |    -4 |     6 |     3
ts2'      |     2 |     2 |     2 |     2 |     2 |     1 |     2 |     2
ts3'      |     9 |     5 |     3 |     1 |    14 |     8 |     4 |     9
ts4'      |     1 |     2 |     2 |     2 |     2 |     2 |     2 |     2

Because we are performing a group-by aggregation (grouping by colo), we have a value at each timestamp of the original data set. We are neither downsampling nor rolling up in this case.

Aggregation algorithm:

The data is grouped by colo, so ts1 is grouped with ts2, and ts3 is grouped with ts4.

ts1' => SUM:   1 + 7 = 8,  4 + 2 = 6, ...
ts2' => COUNT: 2, 2, ...
ts3' => SUM:   9, 3 + 2 = 5, ...
ts4' => COUNT: 1, 2, ...
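
A minimal sketch of such a group-by computation (hypothetical Python, not OpenTSDB's implementation):

    from collections import defaultdict

    def pre_aggregate(series, group_tag="colo"):
        """Group raw series by one tag; emit per-timestamp SUM and COUNT series."""
        out = defaultdict(lambda: defaultdict(float))
        for tags, points in series:
            group = tags[group_tag]
            for ts, value in points.items():
                out[(group, "SUM")][ts] += value  # stored with _aggregate=SUM
                out[(group, "COUNT")][ts] += 1    # stored with _aggregate=COUNT
        return out

    lga = [({"host": "web01", "colo": "lga"}, {0: 1, 900: 4}),
           ({"host": "web02", "colo": "lga"}, {0: 7, 900: 2})]
    print(dict(pre_aggregate(lga)[("lga", "SUM")]))  # {0: 8.0, 900: 6.0}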

Warning

As with rollups, when writing pre-aggregates it is best to avoid functions such as average, median, or deviation. Just store the sum and count.

Rolled-up pre-aggregates

While pre-aggregates certainly help with high-cardinality metrics, users may still want to query wide time spans and run into slow queries. Thankfully, pre-aggregates can be rolled up just like raw data. Simply generate the pre-aggregate, then roll it up using the information above.

Generating rollups and pre-aggregates

Currently, TSDs do not generate rollup or pre-aggregated data for you. The main reason is that OpenTSDB is meant to ingest enormous amounts of time series data, so individual TSDs focus on getting that data into storage as quickly as possible.

Problems

Because TSDs are (essentially) stateless, they may not have the full set of data needed to perform a pre-aggregation. For example, our sample ts1 data may be written to TSD_A while ts2 is written to TSD_B; neither can perform a proper group-by without reading the data back out of storage. Nor do we know when to pre-aggregate. We could wait 1 minute and then pre-aggregate, but we would miss anything that arrives after that minute. Or we could wait an hour, and then pre-aggregated queries would lack the last hour of data. And what happens when data arrives late?

Additionally, for rollups, depending on how users write data to the TSDs, the 12:15 data point for ts1 may arrive at TSD_A while the 12:30 point arrives at TSD_B, so neither has the data required for the full hour. The same time window constraints apply to rollups as well.

Solutions

Using rollups and pre-aggregates requires some analysis and a choice among trade-offs. Since some OpenTSDB users already have the means to compute this kind of data, we simply provide the APIs for storing and querying it. That said, here are some tips on computing it yourself.

Batch processing

A common approach used by other time series databases is to read the data back out of the database after some delay, compute the rollups and pre-aggregates, and then write the results. This is the easiest way to solve the problem and works well at small scale, but some problems remain:

- As data grows, so do the queries that generate the rollups, to the point where the query load affects write and user query performance. OpenTSDB runs into the same problem when compactions are enabled under HBase.
- Also as data grows, batch processing takes longer and must be sharded across multiple workers, which can be a coordination and troubleshooting challenge.
- Late or historical data may not be rolled up unless some tracking mechanism triggers a new batch over the old data.

Some ways to improve batch processing include the following (a sketch of the basic read-then-write loop follows the list):

- Reading from a replicated system. For example, if HBase replication is set up, users can query the master system while rollups are read from and aggregated against the replica store.
- Reading from alternate stores. One example is to mirror all data to another store, such as HDFS, and run the batch jobs against that data.
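
Here is a sketch of the basic batch loop over OpenTSDB's HTTP API: read a window of raw data with a downsampler, then write the results back through the 2.4 rollup endpoint. The payloads below are simplified and the host name is hypothetical; consult the API section for the exact formats.

    import requests

    TSD = "http://tsd.example.com:4242"  # hypothetical TSD address

    # 1) Read one batch window, downsampled to hourly sums per series.
    query = {
        "start": "2013/01/01-12:00:00",
        "end": "2013/01/01-14:00:00",
        "queries": [{"aggregator": "none", "downsample": "1h-sum",
                     "metric": "system.if.bytes.out", "tags": {"host": "*"}}],
    }
    results = requests.post(TSD + "/api/query", json=query).json()

    # 2) Write each downsampled point back as 1h SUM rollup data.
    rollups = [{"metric": r["metric"], "tags": r["tags"], "timestamp": int(ts),
                "value": v, "interval": "1h", "aggregator": "SUM"}
               for r in results for ts, v in r["dps"].items()]
    requests.post(TSD + "/api/rollup", json=rollups)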

Queuing on TSDs

Another option some databases use is to queue the data in memory inside the process and write the results once a configured time window has passed. But because TSDs are stateless, and users usually place a load balancer in front of them, a single TSD may not see all of the data needed to compute a given rollup or pre-aggregate (as described above). For this approach to work, the upstream collectors would have to route all of the data required for a calculation to a specific TSD. That is not a difficult task, but the problems include:

- Having enough RAM or disk space on each TSD to spool the data.
- If a TSD process dies, you either lose the aggregated data or it must be bootstrapped from storage.
- Whenever aggregate calculations run, the overall write throughput of raw data suffers.
- The late/historical data issues remain.
- Because TSDB is JVM-based, keeping all of that data in RAM and then running GC will hurt, a lot. (Spooling to disk is better, but then you run into I/O issues.)

In general, queuing on the writers is a bad idea. Avoid the pain.

Stream processing

A better way to handle rollups and pre-aggregates is to route the data into a stream processing system, where it can be processed in near real time and written to the TSDs. It is similar to the queuing-on-TSD option, but uses one of the myriad stream processing frameworks (Storm, Flink, Spark, etc.) to handle message routing and in-memory storage. You then simply write some code to compute the aggregates and flush the data once the window has passed.
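
The core windowing idea, independent of any particular framework, looks roughly like this sketch (a real job would use the framework's own windowing primitives):

    from collections import defaultdict

    class TumblingWindow:
        """Accumulate SUM/COUNT per series key; flush windows once they close."""

        def __init__(self, interval=3600):
            self.interval = interval
            self.acc = defaultdict(lambda: {"sum": 0.0, "count": 0})

        def add(self, series_key, ts, value):
            start = ts - ts % self.interval  # snap to the window boundary
            self.acc[(series_key, start)]["sum"] += value
            self.acc[(series_key, start)]["count"] += 1

        def flush(self, watermark):
            """Yield closed windows (ready to be written to a TSD)."""
            for key, start in list(self.acc):
                if start + self.interval <= watermark:
                    yield key, start, self.acc.pop((key, start))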

This is the solution used by many next-generation monitoring systems, such as Yahoo's. Yahoo is working to open-source its streaming system for others who need monitoring at massive scale, and it plugs neatly into TSDB.

Although stream processing is better, you still face some problems:

- Having enough resources for the stream workers to do their jobs.
- A dead stream worker must be bootstrapped from storage.
- Late/historical data must still be handled.

Share

If you have working code for computing aggregates, please share it with the OpenTSDB group. If your solution is open source, we may be able to incorporate it into the OpenTSDB ecosystem.

Configuration

As of OpenTSDB 2.4, the rollup configuration is referenced by the tsd.rollups.config key in opentsdb.conf. The value of this key must be either a quote-escaped JSON string without newlines or, preferably, the path to a JSON file containing the configuration. The file name must end in .json, as in rollup_config.json.

The JSON configuration should look like this:

{"aggregationIds": {"sum": 0, "count": 1, "min": 2, "max": 3}, "intervals": [{"table": "tsdb", "preAggregationTable": "tsdb-preagg", "interval": "1m" "rowSpan": "1h", "defaultInterval": true}, {"table": "tsdb-rollup-1h", "preAggregationTable": "tsdb-rollup-preagg-1h", "interval": "1h", "rowSpan": "1D"}]}

The two top-level fields are:

aggregationIds: a map of OpenTSDB aggregation function names to numeric identifiers, used to shrink storage.

intervals: a list of one or more interval definitions containing table names and interval specifications.

aggregationIds

The aggregation IDs map is used to reduce storage by prefixing each rollup value with a numeric ID instead of spelling out the full aggregation function. For example, writing COUNT: with every value would add 6 bytes to every column (or compacted column); storing an ID instead takes a single byte.

The IDs must be integers from 0 to 127, which means up to 128 different rollups can be stored per interval. Only one ID may be assigned per numeric value in the map, and only one aggregation function per ID. If a function name does not map to an aggregation function supported by OpenTSDB, an exception is thrown at startup. Likewise, at least one aggregation must be configured for the TSD to start.
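
A small sketch of the space savings (the qualifier layout here is illustrative, not OpenTSDB's exact byte format):

    AGGREGATION_IDS = {"sum": 0, "count": 1, "min": 2, "max": 3}

    def qualifier_prefix(function_name, use_ids=True):
        """Per-column prefix identifying the rollup's aggregation function."""
        if use_ids:
            return bytes([AGGREGATION_IDS[function_name]])  # a single byte
        return (function_name.upper() + ":").encode()       # e.g. b"COUNT:", 6 bytes

    print(len(qualifier_prefix("count", use_ids=False)))  # 6
    print(len(qualifier_prefix("count")))                 # 1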

Warning

Once data has been written, the aggregation IDs must never change. If you change the mappings, incorrect data may be returned, or queries and writes may fail. You can always add functions, but you can never change a mapping.

Intervals

Each interval object defines a table route for writing and querying rollup and pre-aggregated data. There are two types of interval:

- Default: the default, raw-data OpenTSDB table, defined by "defaultInterval": true. For existing installations and deployments, this is the tsdb table or whatever is defined in tsd.storage.hbase.data_table. Intervals and spans are ignored; they fall back to OpenTSDB's default 1-hour row width, with data stored at its given resolution and timestamp. Only one default may be configured per TSD at a time.
- Rollup interval: any interval with "defaultInterval": false, or with the default interval unset. These are the rollup tables, where values are snapped to interval boundaries.

Each interval should define the following fields:

- table (String, required): the base or rollup table for data that is NOT pre-aggregated. For the default interval, this should be tsdb or the table existing raw data is written to. For rolled-up data, it must be a different table from the raw data. Example: tsdb-rollup-1h
- preAggregationTable (String, required): the table where pre-aggregated and (optionally) rolled-up data is written. This may be the same table as the table value. Example: tsdb-rollup-preagg-1h
- interval (String, required): the expected interval between data points, in the format <interval><units>. For example, if rollups are computed every hour, the interval should be 1h; if computed every 10 minutes, set it to 10m. For the default table, this value is ignored. Example: 1h
- rowSpan (String, required): the width of each row in storage. This value must be greater than interval, and it defines the number of intervals stored per row. For example, if interval is 1h and rowSpan is 1d, each row holds up to 24 values. Example: 1d
- defaultInterval (Boolean, optional): whether the configured interval is the default for raw, un-rolled-up data. Example: true

In storage, rollups are written much like raw data: each row has a base timestamp, and each data point is an offset from that base time. Each offset is an increment of the interval from the base time, not an actual elapsed offset. For example, if a row stores 1 day of 1-hour data, there are up to 24 offsets: offset 0 maps to midnight for the row, and offset 5 maps to 5 AM. Because rollup offsets are encoded on 14 bits, if more intervals would be stored in a row than fit in 14 bits, an error is raised when the TSD starts.
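
A sketch of that startup check (hypothetical; the real validation lives inside OpenTSDB):

    MAX_OFFSETS = 1 << 14  # rollup offsets are encoded on 14 bits

    def validate_interval(interval_seconds, row_span_seconds):
        """Raise if a row would hold more intervals than 14 bits can address."""
        intervals_per_row = row_span_seconds // interval_seconds
        if intervals_per_row > MAX_OFFSETS:
            raise ValueError(f"{intervals_per_row} intervals per row "
                             f"exceeds the 14-bit maximum of {MAX_OFFSETS}")
        return intervals_per_row

    print(validate_interval(3600, 86400))  # 1h intervals, 1d rows -> 24 offsets
    validate_interval(1, 365 * 86400)      # 1s intervals, 1y rows -> raises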

Warning

Once data has been written, do not change the interval width or row span of a rollup interval. Doing so results in garbage data, and queries may fail.
