Chinese version of OpenTSDB official documents: query performance

2025-07-11 Update From: SLTechnology News&Howtos

Caching

OpenTSDB does not have a built-in cache at this time (except for the built-in GUI, which caches PNG image files for 60 seconds). Queries therefore rely on the cache of the underlying data store. In HBase (the most common OpenTSDB backend), the block cache stores blocks of rows and columns in memory on writes and/or reads. Nick Dimiduk's Block Cache 101 is a good primer. A good way to set up the cache is to use BucketCache and make the L1 cache fairly large so that it acts as a write cache and keeps most of the recent data in memory. The L2 cache can then keep frequently queried data in memory as users run queries.

Keep a close eye on the GC pauses of your region servers. Users usually run the bucket cache in off-heap mode, but serializing between Java and JNI still adds a cost to every off-heap cache hit and write.

Also make sure that compression is enabled on your HBase tables. Blocks are stored in memory using the compression algorithm specified for the table, so more blocks fit in the cache with compression than without.

One Out of Many Queries

If you regularly query for one or two time series of a metric with high cardinality (that is, many different tag values), make sure to use version 2.3 or later and enable explicitTags in the query. The query must list every tag key associated with the data you are looking for, but in return it enables a special filter on HBase that helps reduce the number of rows scanned. See the query filter documentation for details.

Alternatively, putting the high cardinality tag into the metric name greatly reduces the amount of data scanned at query time and improves performance. See Writing Data for more information.
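To make the explicitTags advice concrete, here is a minimal sketch of a request body for the HTTP /api/query endpoint. The metric name, tag keys and tag values are invented for illustration, and the field names (explicitTags, filters, literal_or, groupBy) follow the 2.3 query API as understood here, so check them against the documentation for your version.

```python
import json


def build_explicit_tags_query(metric, tags, start="1h-ago", aggregator="sum"):
    """Build a query body that lists every tag key of the wanted series.

    `tags` maps each tag key to the exact value wanted. With explicitTags
    enabled, series that carry additional tag keys are not matched.
    """
    filters = [
        {"type": "literal_or", "tagk": key, "filter": value, "groupBy": False}
        for key, value in sorted(tags.items())
    ]
    return {
        "start": start,
        "queries": [
            {
                "metric": metric,
                "aggregator": aggregator,
                "explicitTags": True,
                "filters": filters,
            }
        ],
    }


# Hypothetical metric and tags, used only as an example.
body = build_explicit_tags_query("sys.cpu.user", {"host": "web01", "dc": "lga"})
print(json.dumps(body, indent=2))
```

The resulting JSON would be sent as the POST body to a TSD. The sketch only builds the payload and does not contact a server.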

High Cardinality Queries

The best way to improve the performance of queries that aggregate many time series is to run OpenTSDB 2.2 or later with salting enabled and to run multiple region servers in the HBase cluster. The query is then executed in parallel: each region server returns a subset of the data and the TSD merges the results. For example, a query that takes 10 seconds against a single region server may take about 2 seconds when salting spreads the same data across five region servers, because the total time is then set by the slowest region server's response. Merging the partial result sets is usually trivial.
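The scatter-and-merge pattern described here can be sketched in a few lines. This is a simulation under stated assumptions, not OpenTSDB's implementation: `scan_bucket` is a hypothetical stand-in for one region server scanning its salt bucket, and the latencies and rows are made up.

```python
from concurrent.futures import ThreadPoolExecutor
import time


def scan_bucket(bucket):
    """Hypothetical stand-in for one region server scanning one salt bucket."""
    time.sleep(bucket["latency_s"])  # simulated scan time
    return bucket["rows"]


def scatter_gather(buckets):
    """Scan all buckets concurrently, then merge the partial results by timestamp."""
    with ThreadPoolExecutor(max_workers=len(buckets)) as pool:
        partials = list(pool.map(scan_bucket, buckets))
    merged = [row for part in partials for row in part]
    return sorted(merged)


# Five simulated buckets, each holding every fifth timestamp.
buckets = [
    {"latency_s": 0.01 * (i + 1), "rows": [(ts, float(i)) for ts in range(i, 20, 5)]}
    for i in range(5)
]
started = time.perf_counter()
rows = scatter_gather(buckets)
elapsed = time.perf_counter() - started
print(len(rows), "rows merged in", round(elapsed, 3), "seconds")
```

Because the scans overlap, the elapsed time tracks the slowest bucket rather than the sum of all five, which is the effect the paragraph above attributes to salting.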

Wide Time Range Queries

If you observe a bottleneck between the TSD and consumer applications, such as UIs or API clients, then queries that span a wide time range (months or years) can benefit from downsampling. Using a downsampler reduces the amount of data the TSD has to serialize and send to the user.
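To illustrate how much a downsampler shrinks a response, the sketch below averages raw points into fixed time buckets. It is a local illustration of the idea only. In a real query the TSD does this work when the sub-query carries a downsample specifier such as `10m-avg`, and the sample data here is invented.

```python
from collections import defaultdict


def downsample_avg(points, interval_s):
    """Average (timestamp, value) points into buckets of interval_s seconds."""
    buckets = defaultdict(list)
    for ts, value in points:
        # Align each timestamp to the start of its bucket.
        buckets[ts - ts % interval_s].append(value)
    return [(start, sum(vals) / len(vals)) for start, vals in sorted(buckets.items())]


# One point every 10 seconds for one hour: 360 raw points.
raw = [(ts, float(ts % 60)) for ts in range(0, 3600, 10)]
reduced = downsample_avg(raw, 600)  # 10-minute averages
print(len(raw), "->", len(reduced))  # prints "360 -> 6"
```

The client receives 6 points instead of 360, which is where the savings in serialization and transfer come from.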

However, if the bottleneck is between storage (HBase) and the TSD, the best solution is to start writing rollup data using OpenTSDB 2.4 or later. This requires an external system to compute the time-based rollups and write them to storage. Alternatively, the UI or API client can issue multiple queries with small time spans against multiple TSDs and merge the results. In the future, we plan to add these capabilities directly to the TSD.
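The client-side alternative, splitting a wide range into small spans and spreading them over several TSDs, could look roughly like the sketch below. The host names and span size are hypothetical, and the actual HTTP calls are left out, so only the planning and merging steps are shown.

```python
def split_time_range(start, end, span):
    """Split [start, end) into consecutive windows of at most `span` seconds."""
    if span <= 0:
        raise ValueError("span must be positive")
    windows = []
    cursor = start
    while cursor < end:
        windows.append((cursor, min(cursor + span, end)))
        cursor += span
    return windows


def plan_queries(start, end, span, tsd_hosts):
    """Assign each window to a TSD host in round-robin order."""
    windows = split_time_range(start, end, span)
    return [(tsd_hosts[i % len(tsd_hosts)], w) for i, w in enumerate(windows)]


def merge_results(partials):
    """Concatenate per-window (timestamp, value) lists and sort by timestamp."""
    return sorted(point for part in partials for point in part)


DAY = 86400
# Hypothetical TSD addresses; one week split into two-day windows.
plan = plan_queries(0, 7 * DAY, 2 * DAY, ["tsd-a:4242", "tsd-b:4242"])
for host, (window_start, window_end) in plan:
    print(host, window_start, window_end)
```

The windows are half-open here. A real client would need to check how its TSD treats the end timestamp so that points on a window boundary are not returned twice.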

General optimization

Other things to consider:

Multiple read TSDs

Run multiple TSDs dedicated to reading data and place a load balancer in front of them. This is the most common setup seen in OpenTSDB deployments, and it lets you upgrade the TSDs one at a time without taking down the entire system.

Tuning storage

HBase has many tunable parameters, and in general most OpenTSDB bottlenecks come from HBase. Be sure to monitor the servers, especially queues, caching, response times, CPU and GC.

Educate users

No database system can prevent long-running or resource-wasting queries. Ask users to start with a small time range, such as 1 hour, and to widen it gradually. Also help users understand cardinality and why requesting high_cardinality_tag_key=* can be a bad idea.
