Early Warning Index Collection in Agentless Monitoring Practice

2025-07-01 Update From: SLTechnology News&Howtos

Many companies have built monitoring solutions that address metric collection, display, and early warning (alerting).

The monitoring solution introduced in this article consists of InfluxDB, Grafana, and Ansible. Ansible continuously collects hardware metrics from the servers and stores them in InfluxDB; Grafana reads the metrics from InfluxDB, displays them, and handles thresholds and alert configuration.

I. Development environment

The monitored environment is simulated with three local VMs: one monitoring server (monitor) and two servers that can be connected to the monitoring service (server1 and server2).

Vagrant manages the development environment. Running the vagrant up monitor command boots the monitor server and provisions it according to the project's Vagrantfile. The server1 and server2 VMs only need to be started later, when you want to connect them to the monitoring service.

Ansible provisions the monitoring server: it installs InfluxDB, Grafana, and Ansible itself, and then configures the monitoring service. To keep the code clean and structured, the installation tasks for each tool are kept in separate YML files, and include_tasks pulls these grouped tasks into the overall play dynamically.

II. Monitoring Service Configuration

The configuration steps of the monitoring service are defined in the monitoring-configuration.yml file. First, the monitor database is created through the HTTP API that InfluxDB exposes for database operations; Ansible's uri module is used to interact with web services like this one. All metrics extracted from the monitored servers are stored in the monitor database.
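The playbook performs this step with Ansible's uri module. As a rough illustration of the underlying HTTP call, here is a minimal Python sketch. It assumes an InfluxDB 1.x server whose /query endpoint accepts a q parameter, and it reuses the monitoring server address and port given later in the article. The function name build_create_db_request is hypothetical, and the request is only built, not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Address of the monitoring server and InfluxDB port, as stated in the article.
INFLUXDB_URL = "http://192.168.33.10:8086"


def build_create_db_request(base_url: str, database: str) -> Request:
    """Build (but do not send) the POST request that creates a database.

    Assumption: InfluxDB 1.x, where /query takes the statement in "q".
    """
    body = urlencode({"q": f'CREATE DATABASE "{database}"'}).encode("ascii")
    return Request(f"{base_url}/query", data=body, method="POST")


request = build_create_db_request(INFLUXDB_URL, "monitor")
# Sending it requires a running InfluxDB instance:
#     urllib.request.urlopen(request)
```

In InfluxDB 1.x, CREATE DATABASE succeeds even if the database already exists, so repeating this request on every provisioning run should be harmless.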

Next, a Grafana data source is created that connects to the InfluxDB database and reads all metric data. Grafana provides an API through which most of its configuration can be done with JSON-formatted content. Besides the data source, a Slack notification channel and the first panel are created the same way.
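To give an idea of what such JSON content looks like, the following Python sketch builds a body for Grafana's data source endpoint (POST /api/datasources). The field set matches older Grafana versions and may differ in newer ones. The data source name, the localhost URL (assuming Grafana and InfluxDB run on the same monitor VM), and the helper name are assumptions, not values taken from the original playbook.

```python
import json


def build_datasource_payload(name: str, influx_url: str, database: str) -> dict:
    """Return a JSON-serializable body for POST /api/datasources.

    Assumption: field names as used by older Grafana releases.
    """
    return {
        "name": name,          # hypothetical display name
        "type": "influxdb",    # Grafana's built-in InfluxDB data source type
        "access": "proxy",     # Grafana's backend queries InfluxDB
        "url": influx_url,
        "database": database,
        "isDefault": True,
    }


payload = build_datasource_payload("InfluxDB", "http://localhost:8086", "monitor")
body = json.dumps(payload)  # this string would be sent as the request body
```

The same pattern, a dictionary serialized to JSON and posted to an API endpoint, applies to the notification channel and the panel.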

The Slack notification channel points to a beta Slack workspace. You can create your own workspace and invite the operations staff to join, then create an incoming webhook and use its URL to replace the value of the url field in the JSON.

The initial panel displays the percentage of memory used; you can add further metrics or create new panels. A threshold is set at 95% so that it is visible on the graph, and an alert is configured on it: when the last five values are greater than or equal to 95%, a notification is sent to the Slack channel.
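The alert condition can be illustrated with a few lines of Python. This is only one reading of the rule (every one of the last five samples must reach the threshold); Grafana evaluates alerts itself and may aggregate the window differently, for example by averaging. The function name should_alert and the sample values are hypothetical.

```python
from typing import Sequence


def should_alert(values: Sequence[float], threshold: float = 95.0, window: int = 5) -> bool:
    """Return True when the most recent `window` samples are all >= threshold."""
    if len(values) < window:
        # Not enough samples collected yet to evaluate the rule.
        return False
    return all(v >= threshold for v in values[-window:])


# Hypothetical per-minute readings of used_mem_pct, oldest first.
readings = [90.0, 96.0, 95.0, 97.0, 99.0, 95.0]
alerting = should_alert(readings)
```

Because metrics are collected once per minute, a window of five samples corresponds to roughly five minutes of sustained high memory usage.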

Ansible can execute tasks on multiple servers at the same time, and it learns how target servers are grouped from the inventory file (/etc/ansible/hosts). During monitoring service configuration, a monitored_servers group is created in the inventory file. Every server in this group is monitored automatically.

So that Ansible can collect metrics from a newly added server without stopping on SSH host key verification, the default host key checking in the Ansible configuration file (/etc/ansible/ansible.cfg) must be disabled.
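In ansible.cfg this corresponds to the host_key_checking option in the [defaults] section. The Python sketch below shows the change on an in-memory copy of the file, under the assumption that the file is plain INI that configparser can read. The function name and the sample input are hypothetical, and rewriting a real file this way would drop its comments.

```python
import configparser
import io


def disable_host_key_checking(cfg_text: str) -> str:
    """Return cfg_text with host_key_checking = False under [defaults]."""
    # interpolation=None keeps any '%' characters in values untouched.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(cfg_text)
    if not parser.has_section("defaults"):
        parser.add_section("defaults")
    parser.set("defaults", "host_key_checking", "False")
    out = io.StringIO()
    parser.write(out)  # note: comments from the input are not preserved
    return out.getvalue()


# Hypothetical minimal configuration file content.
updated = disable_host_key_checking("[defaults]\ninventory = /etc/ansible/hosts\n")
```

Disabling host key checking is convenient for a local Vagrant lab; on real infrastructure, distributing known host keys is the safer option.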

An Ansible playbook (playbook-get-metrics.yml) connects to all monitored servers and extracts the relevant metrics. The playbook is located in the /etc/ansible/playbooks directory and is scheduled with cron to run once every minute, so metrics are collected, stored, and displayed every minute, and an alert is sent if a problem is found.

III. Collecting indicator data

The playbook-get-metrics.yml file extracts the important metrics from the servers in monitored_servers and stores the collected data in the monitor database. Initially it only collects the memory usage percentage; you can add tasks to the playbook to collect other metrics.

The InfluxDB write API is used to store metrics in the monitor database. 192.168.33.10 is the IP address of the monitoring server, and 8086 is the InfluxDB port. In the database, the key for used memory is used_mem_pct; you need to choose an appropriate key for each additional metric.
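As a sketch of what is sent to the write API, the Python snippet below formats one data point in InfluxDB line protocol and builds the write URL. It assumes InfluxDB 1.x (a /write endpoint with a db parameter). The tag name host and the field name value are illustrative choices and not necessarily those of the original playbook; only used_mem_pct, the address, and the port come from the article.

```python
def escape_tag(value: str) -> str:
    """Backslash-escape commas, equals signs, and spaces in a tag value."""
    for ch in (",", "=", " "):
        value = value.replace(ch, "\\" + ch)
    return value


def to_line_protocol(measurement: str, host: str, value: float) -> str:
    """Format one point as: measurement,host=<host> value=<value>."""
    return f"{measurement},host={escape_tag(host)} value={value}"


def write_url(base_url: str, database: str) -> str:
    """URL of the InfluxDB 1.x write endpoint for the given database."""
    return f"{base_url}/write?db={database}"


url = write_url("http://192.168.33.10:8086", "monitor")
line = to_line_protocol("used_mem_pct", "server1", 42.5)
# The line would be sent as the body of a POST request to url.
```

No timestamp is included, so InfluxDB would assign the server's current time when the point arrives, which suits a once-per-minute collection.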

By default, Ansible gathers facts about each target host before executing tasks. For example, the hostname (ansible_hostname) identifies which server a metric was collected from.

In addition, the percentage of memory used can be calculated from two facts that Ansible gathers: real memory in use (ansible_memory_mb.real.used) and total real memory (ansible_memory_mb.real.total). To inspect all gathered facts, run the ansible monitor -m setup -u vagrant -k -i hosts command and type vagrant when prompted for the SSH password. The output is in JSON format, and nested values can be accessed using dot notation.
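The calculation itself is simple arithmetic: used divided by total, times 100. The Python sketch below applies it to a facts dictionary shaped like the setup module's JSON output. The numbers in sample_facts are made up for illustration, and the helper name used_mem_pct simply mirrors the metric key.

```python
def used_mem_pct(facts: dict) -> float:
    """Percentage of real memory in use, rounded to two decimals."""
    real = facts["ansible_memory_mb"]["real"]
    return round(real["used"] / real["total"] * 100, 2)


# Hypothetical excerpt of the facts gathered for one host.
sample_facts = {
    "ansible_hostname": "server1",
    "ansible_memory_mb": {"real": {"total": 2000, "used": 500, "free": 1500}},
}

pct = used_mem_pct(sample_facts)  # 500 / 2000 * 100
```

Inside a playbook the same expression would be written with dot notation on the facts, which is why the nested JSON structure matters.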

IV. Connecting servers to the monitoring service

Execute the vagrant up monitor command to start the monitoring server.

Then open http://192.168.33.10:3000 in your browser to access Grafana. The username and password are both admin. Click the used_mem_pct panel link to see the values for the monitoring server in the line chart.

Next, connect other servers and view their values in the same line chart. Start server1 with vagrant up server1, then run ansible-playbook playbook-add-server.yml -u vagrant -k -i hosts. The -u option sets the SSH user, the -k option prompts for the SSH password, and the -i option specifies the inventory file (hosts), which defines the monitoring server.

Once it has the new server's IP address and SSH credentials, Ansible can connect to that server. A single line added to the /etc/ansible/hosts file on the monitoring server connects the server to the monitoring service. The next time cron runs playbook-get-metrics.yml, server1 is one of the monitored servers, so all of its metrics are collected, stored, and displayed.
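The effect of that single inserted line can be sketched in Python on an in-memory copy of the inventory. The function name add_host_to_group and the address 192.168.33.11 for server1 are hypothetical; the article only gives the monitoring server's address. The sketch handles plain host lines without per-host variables.

```python
def add_host_to_group(inventory_text: str, group: str, host: str) -> str:
    """Return the inventory with host listed under [group], added at most once."""
    lines = inventory_text.splitlines()
    header = f"[{group}]"
    if header not in lines:
        # Group does not exist yet: append it together with the host.
        return "\n".join(lines + [header, host]) + "\n"
    start = lines.index(header) + 1
    end = start
    # The group's body runs until the next section header or end of file.
    while end < len(lines) and not lines[end].startswith("["):
        end += 1
    if host in (entry.strip() for entry in lines[start:end]):
        return inventory_text  # already present, nothing to change
    # Skip trailing blank lines so the host lands right after the last entry.
    while end > start and not lines[end - 1].strip():
        end -= 1
    lines.insert(end, host)
    return "\n".join(lines) + "\n"


inventory = "[monitored_servers]\n192.168.33.10\n"
updated = add_host_to_group(inventory, "monitored_servers", "192.168.33.11")
```

Making the update idempotent means the add-server step can be repeated for the same server without duplicating its entry.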

V. Conclusion

The monitoring solution described in this article is inexpensive and easy to implement, with the following benefits:

Ansible is agentless, so no agent has to be installed on the monitored servers; all metrics are stored in InfluxDB, a high-performance time series database; and Grafana displays the data in one place and supports alert configuration.

Written by Gustavo Carmo

How to Get Metrics for Advance Alert to Prevent Trouble
