2025-02-25 Update From: SLTechnology News&Howtos shulou
Shulou(Shulou.com)05/31 Report--
This article looks at how data is queried in the Prometheus time series database. Many people are not familiar with the internals, so the sections below walk through how a PromQL query is parsed and evaluated.
PromQL
A PromQL expression evaluates to one of the following four types:
Instant vector (Instant Vector): a set of time series sharing the same timestamp, with one sample taken from each series (for example, the CPU idle of different machines at the same time)
Range vector (Range Vector): a set of time series, each covering a period of time
Scalar (Scalar): a single floating-point value
String (String): a simple string
We can also use aggregation operators such as sum/avg in PromQL, but only on instant vectors (Instant Vector). Because the goal is to explain how Prometheus performs aggregation, and for reasons of space, this article analyzes in detail only the execution of an instant vector (Instant Vector) query.
Instant vector (Instant Vector)
As mentioned earlier, an instant vector is a set of time series with the same timestamp. In practice, however, different endpoints cannot be sampled at exactly the same moment. Prometheus therefore takes, for each series, the closest sample at or before the specified timestamp. As shown in the following figure:
Of course, a sample that is an hour older than the query timestamp should not be included in the result.
So Prometheus filters samples through a time window (set by the startup flag --query.lookback-delta, default 5m).
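The selection rule can be sketched in a few lines of Go. This is only an illustration under simplifying assumptions, not the Prometheus implementation: the `sample` type and the `latestWithin` helper are hypothetical names, timestamps are plain milliseconds, and the exact boundary handling of the real engine may differ.

```go
package main

import "fmt"

// sample is one (timestamp, value) pair; timestamps are in milliseconds.
// It is a hypothetical type used only for this sketch.
type sample struct {
	t int64
	v float64
}

// latestWithin returns the newest sample at or before ts that is no older
// than ts-lookback. The samples must be sorted by timestamp.
func latestWithin(samples []sample, ts, lookback int64) (sample, bool) {
	for i := len(samples) - 1; i >= 0; i-- {
		s := samples[i]
		if s.t > ts {
			continue // sampled after the query time, skip it
		}
		if s.t < ts-lookback {
			return sample{}, false // the newest candidate is already stale
		}
		return s, true
	}
	return sample{}, false
}

func main() {
	const lookback = 5 * 60 * 1000 // 5 minutes in milliseconds
	series := []sample{{t: 1000, v: 1}, {t: 16000, v: 2}, {t: 31000, v: 3}}

	s, ok := latestWithin(series, 30000, lookback)
	fmt.Println(s.v, ok) // 2 true

	_, ok = latestWithin(series, 31000+lookback+1, lookback)
	fmt.Println(ok) // false
}
```

The first call skips the sample at 31000 because it lies after the query time; the second call finds nothing because even the newest sample falls outside the window.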
Analyze a simple PromQL
With the concept of an instant vector explained, we can start the analysis. Let's go straight to a PromQL query with an aggregation:
    sum by (group) (http_requests{job="api-server", group="production"})
First, a statement with this kind of grammatical structure has to be parsed into an AST. This is done by calling promql.ParseExpr.
Because PromQL is relatively simple, Prometheus uses LL parsing directly. The AST of the PromQL above is given here.
Prometheus traverses the syntax tree with the visitor pattern. The code is as follows:
    // ast.go: visitor design pattern
    func Walk(v Visitor, node Node, path []Node) error {
        var err error
        if v, err = v.Visit(node, path); v == nil || err != nil {
            return err
        }
        path = append(path, node)

        for _, e := range Children(node) {
            if err := Walk(v, e, path); err != nil {
                return err
            }
        }

        _, err = v.Visit(nil, nil)
        return err
    }

    func (f inspector) Visit(node Node, path []Node) (Visitor, error) {
        if err := f(node, path); err != nil {
            return nil, err
        }
        return f, nil
    }
Because functions are first-class values in Go, the evaluation function can be passed in directly as an inspector to handle the different cases.
    type inspector func(Node, []Node) error
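To make the pattern concrete, here is a self-contained sketch of a function-typed inspector walking a two-node tree shaped like our query. The `Node` interface, the two node structs and the `walk` and `visitOrder` helpers are simplified stand-ins invented for this example; the real Prometheus types carry much more information.

```go
package main

import "fmt"

// Node is a simplified AST node; the real Prometheus Node interface differs.
type Node interface {
	Children() []Node
	Name() string
}

type aggregateExpr struct{ expr Node }
type vectorSelector struct{ metric string }

func (a *aggregateExpr) Children() []Node  { return []Node{a.expr} }
func (a *aggregateExpr) Name() string      { return "AggregateExpr" }
func (v *vectorSelector) Children() []Node { return nil }
func (v *vectorSelector) Name() string     { return "VectorSelector(" + v.metric + ")" }

// inspector is a plain function that is called for every node.
type inspector func(node Node, path []Node) error

// walk visits node first and then its children, passing the ancestors along.
func walk(f inspector, node Node, path []Node) error {
	if err := f(node, path); err != nil {
		return err
	}
	path = append(path, node)
	for _, c := range node.Children() {
		if err := walk(f, c, path); err != nil {
			return err
		}
	}
	return nil
}

// visitOrder collects node names in the order walk reaches them.
func visitOrder(root Node) []string {
	var names []string
	_ = walk(func(n Node, _ []Node) error {
		names = append(names, n.Name())
		return nil
	}, root, nil)
	return names
}

func main() {
	ast := &aggregateExpr{expr: &vectorSelector{metric: "http_requests"}}
	fmt.Println(visitOrder(ast)) // [AggregateExpr VectorSelector(http_requests)]
}
```

The closure passed to `walk` plays the role of the inspector: all per-node logic lives in one function value, so no separate visitor type is needed for each traversal.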
Evaluation process
The core function of the evaluation process is:
    func (ng *Engine) execEvalStmt(ctx context.Context, query *query, s *EvalStmt) (Value, storage.Warnings, error) {
        ...
        querier, warnings, err := ng.populateSeries(ctxPrepare, query.queryable, s) // fetch the corresponding series data here
        ...
        val, err := evaluator.Eval(s.Expr) // the aggregation is computed here
        ...
    }
populateSeries
First, populateSeries works out the series (time series) that correspond to the VectorSelector node. The evaluation function is given directly here.
    func(node Node, path []Node) error {
        ...
        querier, err := q.Querier(ctx, timestamp.FromTime(mint), timestamp.FromTime(s.End))
        ...
        case *VectorSelector:
            ...
            set, wrn, err = querier.Select(params, n.LabelMatchers...)
            ...
            n.unexpandedSeriesSet = set
            ...
        case *MatrixSelector:
            ...
        }
        return nil
    }
You can see that this evaluation function only acts on VectorSelector and MatrixSelector nodes. In our PromQL it takes effect only on the leaf node, the VectorSelector.
Select
The core function that fetches the data is querier.Select. Let's first look at how the querier is obtained.
    querier, err := q.Querier(ctx, timestamp.FromTime(mint), timestamp.FromTime(s.End))
The querier is generated from the timestamp range. The most important step is to work out which blocks fall within this time range and attach them to the querier. See the function for details:
    func (db *DB) Querier(mint, maxt int64) (Querier, error) {
        for _, b := range db.blocks {
            ...
            // iterate through the blocks and pick the matching ones
        }
        // if maxt >= head.mint (the head is the block in memory), add the head to the querier as well
        if maxt >= db.head.MinTime() {
            blocks = append(blocks, &rangeHead{
                head: db.head,
                mint: mint,
                maxt: maxt,
            })
        }
        ...
    }
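The block-picking step elided in the listing can be illustrated with a simple overlap test. The sketch below is an assumption-laden simplification: the `block` struct, `overlaps` and `selectBlocks` are hypothetical names, and block ranges are treated as closed intervals, which may not match the boundary convention of the real TSDB.

```go
package main

import "fmt"

// block is a hypothetical stand-in for a TSDB block covering [mint, maxt].
type block struct {
	name       string
	mint, maxt int64
}

// overlaps reports whether the block shares any time with [mint, maxt].
func (b block) overlaps(mint, maxt int64) bool {
	return b.mint <= maxt && mint <= b.maxt
}

// selectBlocks returns the names of the persisted blocks that overlap the
// query range, plus the head when the range reaches into it.
func selectBlocks(blocks []block, head block, mint, maxt int64) []string {
	var picked []string
	for _, b := range blocks {
		if b.overlaps(mint, maxt) {
			picked = append(picked, b.name)
		}
	}
	if maxt >= head.mint {
		picked = append(picked, head.name)
	}
	return picked
}

func main() {
	blocks := []block{{"b1", 0, 99}, {"b2", 100, 199}, {"b3", 200, 299}}
	head := block{"head", 300, 350}
	fmt.Println(selectBlocks(blocks, head, 150, 320)) // [b2 b3 head]
}
```

A query over 150 to 320 skips b1 entirely, which is why narrowing the time range reduces the amount of data a query has to touch.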
Knowing which block the data is in, we can proceed to calculate the VectorSelector data.
    // labelMatchers: {job: api-server} {__name__: http_requests} {group: production}
    querier.Select(params, n.LabelMatchers...)
With the matchers, we can easily fetch the corresponding series through the inverted index. For the sake of space, let's assume that all the data is in the head block (that is, in memory). The inverted index lookup is then as shown in the following figure:
In this way, our VectorSelector node now holds the address information of the final data, such as memSeries refId=3 and 4 in the figure.
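An inverted index lookup of this kind boils down to intersecting sorted lists of series IDs, one list per label pair. The following sketch shows the idea; the `postings` type, the `intersect` and `lookup` helpers, and the index contents are all made up for illustration (the IDs were chosen so that series 3 and 4 match, as in the figure).

```go
package main

import "fmt"

// postings maps a "name=value" label pair to the sorted IDs of the series
// that carry it. It is a hypothetical, simplified index.
type postings map[string][]uint64

// intersect returns the IDs present in both sorted lists.
func intersect(a, b []uint64) []uint64 {
	var out []uint64
	i, j := 0, 0
	for i < len(a) && j < len(b) {
		switch {
		case a[i] < b[j]:
			i++
		case a[i] > b[j]:
			j++
		default:
			out = append(out, a[i])
			i++
			j++
		}
	}
	return out
}

// lookup intersects the postings lists of all given label pairs.
func lookup(idx postings, pairs ...string) []uint64 {
	if len(pairs) == 0 {
		return nil
	}
	result := idx[pairs[0]]
	for _, p := range pairs[1:] {
		result = intersect(result, idx[p])
	}
	return result
}

func main() {
	// Hypothetical index contents, chosen so that series 3 and 4 match.
	idx := postings{
		"__name__=http_requests": {1, 2, 3, 4},
		"job=api-server":         {2, 3, 4, 5},
		"group=production":       {3, 4, 6},
	}
	fmt.Println(lookup(idx, "__name__=http_requests", "job=api-server", "group=production")) // [3 4]
}
```

Because every list is sorted, each intersection is a single linear merge, so adding more matchers only narrows the candidate set.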
If you want to know about data addressing on disk, you can refer to the author's previous blog for details.
Once populateSeries has located the data, evaluator.Eval produces the final result. The computation is a post-order traversal: an upper node is evaluated only after its lower nodes have returned their data. So naturally, the VectorSelector is evaluated first.
    func (ev *evaluator) eval(expr Expr) Value {
        ...
        case *VectorSelector:
            // get the corresponding series via refId
            checkForSeriesSetExpansion(ev.ctx, e)
            // traverse all series
            for i, s := range e.series {
                // since we are considering an instant query, this loop runs only once
                for ts := ev.startTimestamp; ts ... {
                    ...
                    // seek to the first sample at or after refTime, the timestamp passed in by our instant query
                    ok := it.Seek(refTime)
                    ...
                    if !ok || t > refTime {
                        // because what we need is ...
                        ...
                    }
                    ...
                }
            }
        ...
    }
The aggregation itself is then performed in evaluator.aggregation, with group as the grouping key. The function is shown below:
    func (ev *evaluator) aggregation(op ItemType, grouping []string, without bool, param interface{}, vec Vector, enh *EvalNodeHelper) Vector {
        ...
        // iterate over all samples
        for _, s := range vec {
            metric := s.Metric
            ...
            group, ok := result[groupingKey]
            // if this group does not exist yet, add a new group
            if !ok {
                ...
                result[groupingKey] = &groupedAggregation{
                    labels:     m, // here our m = [group: production]
                    value:      s.V,
                    mean:       s.V,
                    groupCount: 1,
                }
                ...
            }
            switch op {
            // here is the final processing of SUM
            case SUM:
                group.value += s.V
            ...
            }
        }
        ...
        for _, aggr := range result {
            enh.out = append(enh.out, Sample{
                Metric: aggr.labels,
                Point:  Point{V: aggr.value},
            })
        }
        ...
        return enh.out
    }
With the processing above, the result of our aggregation is:
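Stripped of the engine's bookkeeping, the grouping logic amounts to summing values per distinct value of the grouping label. Here is a minimal sketch of that idea; the `sample` type and `sumBy` function are hypothetical, it groups by a single label only, and the sample values are invented for the example.

```go
package main

import "fmt"

// sample is a simplified instant-vector element: a label set and a value.
type sample struct {
	labels map[string]string
	value  float64
}

// sumBy adds up sample values per distinct value of the grouping label.
func sumBy(label string, vec []sample) map[string]float64 {
	result := make(map[string]float64)
	for _, s := range vec {
		result[s.labels[label]] += s.value
	}
	return result
}

func main() {
	// Hypothetical samples for two matching series; the values are made up.
	vec := []sample{
		{labels: map[string]string{"group": "production", "instance": "0"}, value: 100},
		{labels: map[string]string{"group": "production", "instance": "1"}, value: 200},
	}
	fmt.Println(sumBy("group", vec)) // map[production:300]
}
```

Both samples share group=production, so they collapse into a single output element that keeps only the grouping label.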