How to integrate Ganglia and Nagios

2025-01-28 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)06/01 Report--

This article explains how to integrate Ganglia and Nagios so that Nagios can raise alarms based on the metrics that Ganglia collects.

Basic introduction

Ganglia: Ganglia is an open source cluster monitoring project started at UC Berkeley and designed to scale to thousands of nodes. Its core consists of gmond, gmetad, and a Web front end. It is mainly used to monitor system performance, such as CPU, memory, disk utilization, I/O load, and network traffic. The graphs make it easy to see the working status of each node, which helps when adjusting and allocating system resources and improving overall system performance.

Nagios: Nagios is an open source system and network monitoring tool. It can monitor the status of Windows, Linux, and Unix hosts, network devices such as switches and routers, printers, and so on. When a system or service becomes abnormal, it sends an email or SMS alarm to the operations staff as soon as possible, and it sends a recovery notification by email or SMS once the status returns to normal.

Architecture

The strength of Ganglia is real-time monitoring of the machines in a cluster: it collects metrics such as CPU, memory, disk, and temperature, presents them in a variety of graphs, and exposes an interface for retrieving the data. Its alarm function, however, is relatively weak.

The strength of Nagios is its powerful alarm function when a problem occurs. Its real-time monitoring is weak, though: even with NRPE local plug-ins it cannot provide in-depth machine monitoring.

Cluster operations work falls into two scenarios. In the first, an alarm is raised when a problem occurs, so that the operations staff can respond quickly and keep the loss to a minimum. In the second, a potential problem is spotted and fixed before it actually occurs.

Nagios suits the first scenario and Ganglia suits the second, so combining the two covers both. There are of course other monitoring and alarm tools, such as Monitorix, NetXMS, Cacti, and Zabbix.

Here we choose Ganglia and Nagios, the most mature of these.

Environment introduction

1. Ganglia has been installed in the cluster (for the installation process, please refer to my previous blog http://blog.csdn.net/shifenglov/article/details/40587527)

2. Nagios has been installed in the cluster (for the installation process, please refer to this blog http://www.cnblogs.com/mchina/archive/2013/02/20/2883404.html)

Integration approach

Nagios calls the Ganglia interface to obtain the monitoring metrics of the whole cluster, and raises an alarm when a metric crosses the configured warning or critical threshold.
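The threshold comparison at the heart of this approach can be sketched as follows. This is a minimal illustration, not the code of check_ganglia.py: the function name nagios_status is made up, and the handling of "lower is worse" metrics (critical below warning, as with mem_free) is an assumption. The exit codes 0, 1 and 2 follow the usual Nagios plugin convention for OK, WARNING and CRITICAL.

```python
# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL = 0, 1, 2


def nagios_status(value: float, warning: float, critical: float) -> int:
    """Map a metric value to a Nagios exit code.

    Assumption: if critical >= warning, higher values are worse (e.g. load);
    otherwise lower values are worse (e.g. free memory).
    """
    if critical >= warning:
        if value >= critical:
            return CRITICAL
        if value >= warning:
            return WARNING
        return OK
    # Inverted thresholds: alarm when the value drops too low.
    if value <= critical:
        return CRITICAL
    if value <= warning:
        return WARNING
    return OK


if __name__ == "__main__":
    print(nagios_status(4.5, 4, 5))            # load_one between warning and critical
    print(nagios_status(45000, 50000, 40000))  # mem_free between warning and critical
```

A real plugin would also print a one-line status message and exit with the returned code so that Nagios can record both.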

Installation process

1. Copy the check_ganglia.py script to the Nagios plugin directory

If you have the Ganglia source code, check_ganglia.py is at ganglia-3.6.0/contrib/check_ganglia.py

If you do not have the source code, you can download check_ganglia.py, which is easy to find online.

# cp check_ganglia.py /usr/local/nagios/libexec/

#!/usr/bin/env python

import sys
import getopt
import socket
import xml.parsers.expat

class GParser:
    def __init__(self, host, metric):
        self.inhost = 0
        self.inmetric = 0
        self.value = None
        self.host = host
        self.metric = metric

    def parse(self, file):
        p = xml.parsers.expat.ParserCreate()
        # Register this object's own handlers with the expat parser.
        p.StartElementHandler = self.start_element
        p.EndElementHandler = self.end_element
        p.ParseFile(file)
        if self.value is None:
            raise Exception('Host/value not found')
        return float(self.value)

    def start_element(self, name, attrs):
        if name == "HOST":
            if attrs["NAME"] == self.host:
                self.inhost = 1
        elif self.inhost == 1 and name == "METRIC" ...
        # (listing truncated here; the full script is the contrib/check_ganglia.py mentioned above)
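The parsing idea behind the script can be tried in isolation. The sketch below is not the contrib script itself: the function name find_metric, the sample XML and the node names are made up for illustration, and it assumes that Ganglia's XML dump nests METRIC elements (with NAME and VAL attributes) inside HOST elements (with a NAME attribute).

```python
import xml.parsers.expat

# Hypothetical, heavily trimmed sample of a Ganglia XML dump.
SAMPLE_XML = """<GANGLIA_XML VERSION="3.6.0" SOURCE="gmetad">
<CLUSTER NAME="my cluster">
<HOST NAME="node1" IP="10.20.1.131">
<METRIC NAME="load_one" VAL="0.42" TYPE="float"/>
<METRIC NAME="mem_free" VAL="812345" TYPE="float"/>
</HOST>
<HOST NAME="node2" IP="10.20.1.132">
<METRIC NAME="load_one" VAL="3.10" TYPE="float"/>
</HOST>
</CLUSTER>
</GANGLIA_XML>"""


def find_metric(xml_text: str, host: str, metric: str) -> float:
    """Return the value of `metric` for `host`, or raise LookupError."""
    in_host = False
    value = None

    def start_element(name, attrs):
        nonlocal in_host, value
        if name == "HOST":
            # Track whether we are inside the HOST element we care about.
            in_host = attrs.get("NAME") == host
        elif in_host and name == "METRIC" and attrs.get("NAME") == metric:
            value = attrs["VAL"]

    def end_element(name):
        nonlocal in_host
        if name == "HOST":
            in_host = False

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = start_element
    parser.EndElementHandler = end_element
    parser.Parse(xml_text, True)  # True marks this as the final chunk
    if value is None:
        raise LookupError(f"{metric} not found for host {host}")
    return float(value)


if __name__ == "__main__":
    print(find_metric(SAMPLE_XML, "node1", "load_one"))
```

A streaming parser such as expat suits this job because a dump for a large cluster can be big, and only one attribute of one element is needed.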

After copying the script, set the ganglia_host and ganglia_port variables in it to match your environment.

Then run ./check_ganglia.py to confirm that it works.

-h specifies the host. Note that this is the host name as Ganglia knows it, and it must resolve to an IP address.

The available host names are the directory names under /var/lib/ganglia/rrds/my cluster/

-m specifies the metric to check. The available metrics are the file names in the host's rrds directory; leave off the .rrd extension in the command.

-w warning threshold

-c critical threshold

For example:

./check_ganglia.py -h 10.20.1.131 -m load_one -w 4 -c 5
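To see which values are valid for -m, the .rrd files of a host can be listed with their extension stripped. This is a small sketch under the assumption that the rrds directory is laid out as root/cluster/host/metric.rrd, as the path above suggests; the function name available_metrics is made up.

```python
from pathlib import Path


def available_metrics(rrd_root: str, cluster: str, host: str) -> list[str]:
    """List metric names for a host: the .rrd file names without the extension."""
    host_dir = Path(rrd_root) / cluster / host
    return sorted(p.stem for p in host_dir.glob("*.rrd"))
```

For example, available_metrics("/var/lib/ganglia/rrds", "my cluster", "10.20.1.131") would list the metrics recorded for that host, assuming its directory is named after that address.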

2. Add a command that retrieves Ganglia data

# vim /usr/local/nagios/etc/objects/commands.cfg

Append the following:

define command {
    command_name    check_ganglia
    command_line    $USER1$/check_ganglia.py -h $HOSTADDRESS$ -m $ARG1$ -w $ARG2$ -c $ARG3$
}

3. Add the hosts whose data is to be monitored (this is a new file)

# vim /usr/local/nagios/etc/objects/hosts.cfg

The contents of the file are as follows:

define host {
    use         linux-server
    host_name   test
    address     10.20.1.131
}

define hostgroup {
    hostgroup_name  ganglia-servers
    alias           ganglia-servers
    members         test
}

4. Add the metrics to monitor (this is a new file)

# vim /usr/local/nagios/etc/objects/services.cfg

The contents of the file are as follows:

define servicegroup {
    servicegroup_name   ganglia-metrics
    alias               Ganglia Metrics
}

define service {
    use                 ganglia-service
    host_name           test
    hostgroup_name      ganglia-servers
    service_description load_one
    check_command       check_ganglia!load_one!4!5
}

define service {
    use                 ganglia-service
    host_name           test
    hostgroup_name      ganglia-servers
    service_description mem_free
    check_command       check_ganglia!mem_free!50000!40000
}
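In a check_command value, Nagios treats the part before the first "!" as the command name and the remaining "!"-separated parts as $ARG1$, $ARG2$ and so on, which it substitutes into the command_line from step 2. The sketch below imitates that expansion for illustration only. The function name is made up, and the $USER1$ value is an assumption (the plugin directory used in step 1).

```python
def expand_check_command(check_command: str, command_line: str,
                         macros: dict[str, str]) -> str:
    """Substitute $ARGn$ and the given macros into a command_line template."""
    _name, *args = check_command.split("!")
    values = dict(macros)
    for i, arg in enumerate(args, start=1):
        values[f"$ARG{i}$"] = arg
    for macro, value in values.items():
        command_line = command_line.replace(macro, value)
    return command_line


COMMAND_LINE = ("$USER1$/check_ganglia.py -h $HOSTADDRESS$ "
                "-m $ARG1$ -w $ARG2$ -c $ARG3$")
# Assumed macro values: $USER1$ is normally set in resource.cfg.
MACROS = {"$USER1$": "/usr/local/nagios/libexec",
          "$HOSTADDRESS$": "10.20.1.131"}

if __name__ == "__main__":
    print(expand_check_command("check_ganglia!load_one!4!5", COMMAND_LINE, MACROS))
    # prints: /usr/local/nagios/libexec/check_ganglia.py -h 10.20.1.131 -m load_one -w 4 -c 5
```

The expanded line is the same command that was run by hand in step 1, which makes it a convenient thing to test manually when a service shows an unexpected state.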

5. Add the service template

# vim /usr/local/nagios/etc/objects/templates.cfg

Append the following:

define service {
    use             generic-service
    name            ganglia-service
    hostgroup_name  ganglia-servers
    service_groups  ganglia-metrics
    register        0
}

6. Reference the new files in the main configuration

# vim /usr/local/nagios/etc/nagios.cfg

Append the following:

# host definitions
cfg_file=/usr/local/nagios/etc/objects/hosts.cfg
# monitored item (service) definitions
cfg_file=/usr/local/nagios/etc/objects/services.cfg
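Before restarting Nagios it can be useful to confirm that every cfg_file entry points at a file that exists. The sketch below only extracts the paths from the text of nagios.cfg; the function name list_cfg_files is made up, and it assumes the simple key=value, "#"-comment syntax shown above.

```python
def list_cfg_files(nagios_cfg_text: str) -> list[str]:
    """Return the paths named by cfg_file= lines, skipping blanks and comments."""
    paths = []
    for line in nagios_cfg_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep and key.strip() == "cfg_file":
            paths.append(value.strip())
    return paths


SAMPLE_CFG = """
# host definitions
cfg_file=/usr/local/nagios/etc/objects/hosts.cfg
# monitored item (service) definitions
cfg_file=/usr/local/nagios/etc/objects/services.cfg
log_file=/usr/local/nagios/var/nagios.log
"""

if __name__ == "__main__":
    for path in list_cfg_files(SAMPLE_CFG):
        print(path)
```

Each returned path can then be checked with os.path.exists. Nagios also has its own pre-flight check (running the nagios binary with -v and the path to nagios.cfg), which validates the object definitions themselves and is the more thorough test.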

7. Modify the gmetad configuration so that it shares monitoring data

By default, Ganglia's gmetad service serves monitoring data only to localhost and does not share it with other machines on the network. This matters when Nagios and Ganglia run on different hosts: gmetad must then be configured to share its data with the Nagios host.

# vi /etc/ganglia/gmetad.conf

Modify it as follows:

trusted_hosts 10.20.1.158    ## add the trusted host's IP

8. Restart ganglia and nagios services

Ganglia:

# service ganglia-monitor restart

# service gmetad restart

Nagios:

# service nagios restart
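Once the services are back up, one way to confirm from the Nagios host that the trusted_hosts change took effect is to connect to the Ganglia XML port and read the dump. This is a minimal sketch: the function name fetch_ganglia_xml is made up, and the host and port you pass are whatever you set for ganglia_host and ganglia_port in check_ganglia.py (the article does not state them).

```python
import socket


def fetch_ganglia_xml(host: str, port: int, timeout: float = 5.0) -> bytes:
    """Connect to a Ganglia XML port and read everything until the peer closes."""
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        while True:
            data = sock.recv(4096)
            if not data:  # empty read: the server has finished the dump
                break
            chunks.append(data)
    return b"".join(chunks)
```

If the Nagios host is not trusted, the result is typically an empty response or a refused connection rather than an XML document, so checking that the returned bytes contain a GANGLIA_XML element is a reasonable quick test.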

9. Visit the Nagios web interface to check the new services
