How to Solve a VictoriaMetrics Agent Performance Optimization Problem

Updated 2025-02-24, from SLTechnology News&Howtos (shulou.com)


This article walks through how a performance problem involving the VictoriaMetrics agent (vmagent) was diagnosed and addressed.

Background

Recently I have been working on a small Prometheus metrics proxy, tentatively named prom-proxy. It parses specific metrics (container, traefik, istio, and so on) and adds an application ID to the original metrics (it performs other metric operations as well, which are left out here). After a quick local verification it was released to the integration testing environment, where it ran normally for a few weeks, so everything seemed fine. But thinking everything is fine does not make it so.

Yesterday, data loss appeared in both the old environment and the new prom-proxy environment.

prom-proxy exposes a self-reported metric, request_total, which was growing very slowly, so the sender was suspected at first (this was a misreading; the reason is explained further down, where the cache is introduced).

Further inspection showed the following error at the upstream sender (the vmagent component of VictoriaMetrics), indicating that prom-proxy could not consume data as fast as vmagent produced it:

2022-03-24T09:55:49.945Z warn VictoriaMetrics/app/vmagent/remotewrite/client.go:277 couldn't send a block with size 370113 bytes to "1:secret-url": Post "xxxx": context deadline exceeded (Client.Timeout exceeded while awaiting headers); re-sending the block in 16.000 seconds

The first fix that comes to mind for this kind of problem is more concurrency. prom-proxy was processing with a concurrency of 8 (that is, the number of background goroutines). Since the production host has 30 cores, the concurrency was raised directly to 30. Testing showed no improvement.
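To make the "number of background goroutines" knob concrete, here is a minimal sketch of a fixed-size worker pool. It is not prom-proxy's code: processAll and the handle callback are hypothetical names, and the items are plain integers standing in for decoded write requests.

```go
package main

import (
	"fmt"
	"sync"
)

// processAll fans items out to `workers` goroutines and returns how many
// items were handled. handle is a hypothetical stand-in for the per-request
// work that prom-proxy does; the article does not show that code.
func processAll(items []int, workers int, handle func(int)) int {
	jobs := make(chan int) // unbuffered: the producer waits for a free worker
	var wg sync.WaitGroup
	var mu sync.Mutex
	done := 0

	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for it := range jobs {
				handle(it)
				mu.Lock()
				done++
				mu.Unlock()
			}
		}()
	}

	for _, it := range items {
		jobs <- it
	}
	close(jobs) // lets the workers' range loops finish
	wg.Wait()
	return done
}

func main() {
	items := make([]int, 100)
	// 30 matches the core count mentioned in the article; 8 was the old value.
	fmt.Println(processAll(items, 30, func(int) {}))
}
```

Raising workers from 8 to 30 in a pool like this only changes how many items are handled at once. It does not change how fast items reach the pool, which fits the observation that the change brought no improvement.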

Another option is buffering, for example with Kafka or with Go's native buffered channel (chan). Buffering has its own problem, though: if downstream consumption cannot keep up, a large backlog builds up in the buffer. Prometheus metrics are time-sensitive, so data that has sat in a backlog for too long is of little value and wastes storage.

The following example uses a buffered chan: s.reqChan is created with a capacity of 5000, and a cacheTotal metric is used to observe how the buffer fills and drains. This makes data reception and processing asynchronous (though not completely).

As mentioned at the start, using request_total to gauge upstream requests was a misreading. The counter was updated synchronously with request processing, so until one request finished processing, the next could not be received and request_total could not increase.

func (s *Server) injectLabels(w http.ResponseWriter, r *http.Request) {
	data, _ := DecodeWriteRequest(r.Body)
	s.reqChan ...
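The listing above breaks off after s.reqChan, so the rest of the author's handler is unknown. As a rough sketch of the idea described in the text, the following shows a handler handing each request body to a buffered channel of capacity 5000 while a counter tracks the queue depth. Everything beyond the names Server, injectLabels, reqChan and cacheTotal is an assumption: the field types, NewServer, worker and run are hypothetical, io.ReadAll stands in for the author's DecodeWriteRequest, and a plain atomic counter stands in for the cacheTotal metric.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"sync/atomic"
)

// Server reuses names from the article; the field types are assumptions.
type Server struct {
	cacheTotal int64       // stand-in for the cacheTotal metric: items queued
	processed  int64       // items drained by the worker
	reqChan    chan []byte // buffered queue between handler and worker
}

// NewServer is a hypothetical constructor; size is the channel capacity.
func NewServer(size int) *Server {
	return &Server{reqChan: make(chan []byte, size)}
}

// injectLabels reads the body and hands it off to the queue. The send only
// blocks when the buffer is full, so the handler normally returns at once.
func (s *Server) injectLabels(w http.ResponseWriter, r *http.Request) {
	data, err := io.ReadAll(r.Body) // the article calls DecodeWriteRequest here
	if err != nil {
		http.Error(w, err.Error(), http.StatusBadRequest)
		return
	}
	// Counted before the send so the value never dips below zero.
	atomic.AddInt64(&s.cacheTotal, 1)
	s.reqChan <- data
	w.WriteHeader(http.StatusNoContent)
}

// worker drains the queue. The real label-injection logic is not shown in
// the article, so this only updates the counters.
func (s *Server) worker(done chan<- struct{}) {
	for range s.reqChan {
		atomic.AddInt64(&s.cacheTotal, -1)
		atomic.AddInt64(&s.processed, 1)
	}
	close(done)
}

// run pushes n requests through the handler and reports how many the worker
// processed once the queue has been closed and drained.
func run(n int) int64 {
	s := NewServer(5000)
	done := make(chan struct{})
	go s.worker(done)
	for i := 0; i < n; i++ {
		req := httptest.NewRequest(http.MethodPost, "/", bytes.NewReader([]byte("x")))
		s.injectLabels(httptest.NewRecorder(), req)
	}
	close(s.reqChan)
	<-done
	return atomic.LoadInt64(&s.processed)
}

func main() {
	fmt.Println(run(10))
}
```

With this shape, a request counter incremented inside the handler would track arrivals rather than completed processing, and the queue-depth counter shows whether the consumer is falling behind. Once the 5000 slots are full the send blocks again, which is one way in which such a handoff is not completely asynchronous.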
