Nagios monitoring system

Updated 2025-04-02. From: SLTechnology News&Howtos (Shulou), Servers section

Nagios is a free, open-source network monitoring tool. It can monitor the status of Windows, Linux, and Unix hosts as well as switches, routers, and other network devices. When a host or service enters an abnormal state, it sends an e-mail or SMS alert so that the operations staff are notified as soon as possible. Traffic monitoring is not its strong point; for that, Cacti is recommended because it draws very intuitive graphs.

In summary, Nagios mainly monitors the following:

- Whether a host is down (checked with the ping command; a host that does not answer ping is considered down, but the checks of its other services are not affected)

- Server resources (CPU utilization, remaining disk space, etc.)

- Network services (SMTP, POP3, HTTP, etc.)

- Network devices (routers, switches, etc.)

First, background knowledge

1. How Nagios works

Nagios itself has no built-in ability to monitor hosts or services. All checks are carried out by plug-ins. After installation, the standard plug-ins are placed in the libexec directory under the Nagios home directory. For example, check_disk checks disk space and check_load checks the CPU load. Run ./check_xxx -h to see what a plug-in does and how to use it.

2. The four monitoring states of Nagios

A plug-in returns one of four states that Nagios recognizes: 0 (OK) means normal (shown in green), 1 (WARNING) means a warning (yellow), 2 (CRITICAL) means a serious error (red), and 3 (UNKNOWN) means an unknown error (dark yellow). Nagios determines the status of each monitored object from the value the plug-in returns and displays it on the web page so that the administrator can spot faults immediately.
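The return-value convention can be illustrated with a small plug-in skeleton in shell. This is only a sketch, not one of the stock plug-ins: the function names state_name and check_threshold are hypothetical, and the example values (25 users, thresholds 20 and 50) are made up for illustration.

```shell
#!/bin/sh
# Sketch of the plug-in return-code convention (function names are hypothetical).

# Map a plug-in exit code to the state name Nagios displays.
state_name() {
    case "$1" in
        0) echo "OK" ;;
        1) echo "WARNING" ;;
        2) echo "CRITICAL" ;;
        *) echo "UNKNOWN" ;;
    esac
}

# Compare an integer value against warning and critical thresholds
# and return the matching exit code.
check_threshold() {
    value=$1 warn=$2 crit=$3
    case "$value" in
        ''|*[!0-9]*) return 3 ;;   # not a non-negative integer: UNKNOWN
    esac
    if [ "$value" -ge "$crit" ]; then
        return 2
    elif [ "$value" -ge "$warn" ]; then
        return 1
    fi
    return 0
}

# Example: 25 logged-in users, warning at 20, critical at 50.
rc=0
check_threshold 25 20 50 || rc=$?
echo "USERS $(state_name "$rc") - 25 users logged in"
```

A real plug-in would finish with exit "$rc", because Nagios reads the state from the process exit code and shows the printed line as the status text.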

3. How Nagios monitors services on a remote host through NRPE

1) Nagios runs the check_nrpe plug-in installed on the monitoring server and tells it which service to check.

2) check_nrpe connects to the NRPE daemon on the remote machine over SSL.

3) NRPE runs the appropriate local plug-ins (check_disk, etc.) to check the local services and status.

4) NRPE returns the results to check_nrpe on the monitoring server, and check_nrpe passes them to the Nagios status queue.

5) Nagios reads the queue in order and displays the results.

Second, the experimental environment

1. Experimental Topology

2. Experimental environment on virtual machine

Third, the experimental steps

1. Set up the Nagios monitoring system

1) Turn off the firewall

2) Create the nagios user and group

3) Compile and install Nagios (yum must be configured in advance)

Install the dependency packages:

Configure:

Compile and install:

Note: make install-webconf generates the web configuration file. The directives that need to go at the end of /etc/httpd/conf/httpd.conf can be copied from /etc/httpd/conf.d/nagios.conf instead of being typed by hand.

Explanation of the commands above:

make install // installs the main program, CGIs, and HTML files

make install-init // installs the startup script in /etc/rc.d/init.d

make install-commandmode // configures the permissions on the directory that holds the external command file

make install-config // installs the sample configuration files

make install-webconf // installs the Apache configuration for the Nagios web interface, creating nagios.conf in the /etc/httpd/conf.d directory
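Put together, the build sequence might look like the sketch below. It is an assumption-laden outline rather than the original commands: it assumes the current directory is the unpacked Nagios source tree, that the install prefix is /usr/local/nagios, and that make all is run before the install targets. The run helper and the DRY_RUN switch are hypothetical additions so that the sequence can be printed for review before anything is executed.

```shell
#!/bin/sh
# Sketch of the Nagios build sequence. DRY_RUN=1 (the default here) only
# prints the commands; set DRY_RUN=0 to actually run them as root.
DRY_RUN=${DRY_RUN:-1}

# Print or execute a command, depending on DRY_RUN (hypothetical helper).
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

build_nagios() {
    run ./configure --prefix=/usr/local/nagios
    run make all                   # compile (assumed step, not listed above)
    run make install               # main program, CGIs, HTML files
    run make install-init          # startup script
    run make install-commandmode   # external command directory permissions
    run make install-config        # sample configuration files
    run make install-webconf       # Apache configuration for the web interface
}

build_nagios
```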

After the installation is complete, the /usr/local/nagios directory contains six subdirectories:

bin: the Nagios executables; the nagios file is the main program.

etc: the configuration files; the default configuration files appear here after make install-config.

sbin: the Nagios CGI files, that is, the programs that the web interface executes.

share: the web page files (HTML and related files).

var: log files, the pid file, and other runtime files.

libexec: the default location of the plug-ins.

4) Register Nagios as a system service

5) Install the Nagios plug-ins (all monitoring is done by plug-ins)

Compile and install:

6) Install NRPE (to monitor remote servers)

7) Add the access authorization at the end of the /etc/httpd/conf/httpd.conf file. It can be copied from /etc/httpd/conf.d/nagios.conf rather than typed by hand.

In vim, move to the end of the file and use the :r command to read the other file in.

Just import it; no changes are needed. Save and exit.

8) Run the htpasswd command to add an authorized user for the Nagios web pages

Both the user name and the password are nagiosadmin.

9) Start nagios and httpd

10) Open the Nagios page in a browser

At this point only the web page opens, and many monitoring items are not yet visible. Monitoring a remote server requires considerably more configuration, which is described below.

2. Background for configuring the Nagios monitoring system

1) Nagios configuration files:

nagios.cfg: the main configuration file; defines the names and locations of the other configuration files

cgi.cfg: the configuration file that controls the CGIs

resource.cfg: the resource file; defines variables that other files can reference

objects: the directory that holds the remaining configuration files, mainly:

commands.cfg: defines the commands that other files can reference

contacts.cfg: defines contacts and contact groups, which are referenced when alerts such as e-mail are sent

localhost.cfg: the configuration file for monitoring the local machine

timeperiods.cfg: defines the monitoring time periods that other files can reference

hostgroups.cfg: defines the monitored host groups; this file must be created manually

2) Relationships between the configuration files

Configuring Nagios involves several kinds of definitions: hosts, host groups, services, service groups, contacts, contact groups, time periods, and commands. These definitions are interrelated and refer to one another, so configuring a working Nagios system requires understanding how the configuration files depend on each other. Four points matter most:

- Define which hosts, host groups, services, and service groups to monitor

- Define which commands perform the monitoring

- Define the time periods during which monitoring takes place

- Define the contacts and contact groups to notify when a host or service has a problem

3) Configure Nagios

To keep the explanation clear and the setup easy to maintain, it is recommended to create a separate configuration file for each type of object.

- Create a conf directory to hold the host definitions

- Create a hostgroups.cfg file to define the host group

- Use the default contacts.cfg file to define contacts and contact groups

- Use the default commands.cfg file to define commands

- Use the default timeperiods.cfg file to define the monitoring time periods

- Use the default templates.cfg file as the template reference file

3. Configure nagios

1) Edit the main configuration file /usr/local/nagios/etc/nagios.cfg

2) Edit /usr/local/nagios/etc/objects/commands.cfg

Add the definition of the check_nrpe monitoring command
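The original screenshot of this definition is not available. The sketch below prints the form of the check_nrpe command definition commonly given in the NRPE documentation, which is assumed here to match what the step adds; the function name nrpe_command_definition is hypothetical. $USER1$ is the plug-in directory macro defined in resource.cfg.

```shell
#!/bin/sh
# Print a check_nrpe command definition (assumed standard form).
# The quoted 'EOF' stops the shell from expanding the Nagios $MACRO$ names.
nrpe_command_definition() {
    cat <<'EOF'
define command{
        command_name    check_nrpe
        command_line    $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$
        }
EOF
}

nrpe_command_definition
```

To apply it, the output could be appended to the file, for example with nrpe_command_definition >> /usr/local/nagios/etc/objects/commands.cfg. A service then refers to it as check_nrpe!check_load, where the part after the exclamation mark becomes $ARG1$.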

3) Edit /usr/local/nagios/etc/objects/contacts.cfg (defines the contacts for the monitoring server)

4) Create /usr/local/nagios/etc/objects/hostgroups.cfg (defines the host group)

5) Create a 192.168.1.20.cfg file under /usr/local/nagios/etc/conf (monitors whether host 192.168.1.20 is alive, plus its load and processes); all of the content must be entered manually

Explanation of the directives:

define host {
    use linux-server ; the template to use
    host_name nagios ; name of the monitored host, preferably without spaces
    alias nagios ; alias
    address 127.0.0.1 ; IP address of the monitored host
    check_command check-host-alive ; monitoring command from commands.cfg; checks whether the host is alive
    normal_check_interval 3 ; interval between normal checks
    retry_check_interval 2 ; interval between retries
    max_check_attempts 5 ; number of retries after a failed check
    check_period 24x7 ; time period for checks, defined in timeperiods.cfg
    notification_interval 10 ; interval between repeat notifications, in time units (10 minutes with the default interval_length of 60 seconds)
    notification_period 24x7 ; time period for notifications, also defined in timeperiods.cfg
    contact_groups admins ; contact group; admins is defined in contacts.cfg
    notification_options d,u,r ; which events trigger notifications
}

For services, contacts are notified in four cases: w (warning), u (unknown), c (critical), and r (recovery, that is, a return to normal from an abnormal state).

For hosts, contacts are notified in three cases: d (down), u (unreachable), and r (recovery, that is, a return to normal from an abnormal state).

6) Restart the nagios service

7) An error is reported, saying that no contact group has been added. Solution: add the contact group definition at the end of the /usr/local/nagios/etc/objects/contacts.cfg file.
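The exact text added in the original is not shown. A contact group definition along the following lines would satisfy the contact_groups admins reference used in the host definition. The member name nagiosadmin is an assumption (it must match a contact_name defined in contacts.cfg), and the function name contactgroup_definition is hypothetical.

```shell
#!/bin/sh
# Print a contact group definition for the "admins" group.
# Assumption: a contact named nagiosadmin already exists in contacts.cfg.
contactgroup_definition() {
    cat <<'EOF'
define contactgroup{
        contactgroup_name       admins
        alias                   Nagios Administrators
        members                 nagiosadmin
        }
EOF
}

contactgroup_definition
```

The output can be appended to /usr/local/nagios/etc/objects/contacts.cfg before restarting the service.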

8) Restart the nagios service again; this time it starts successfully

9) Open the web page to view the status

(Note: turn off SELinux or add an exception for it)

Alternatively, if SELinux stays enabled, run the following two commands:

chcon -R -t httpd_sys_content_t /usr/local/nagios/sbin/

chcon -R -t httpd_sys_content_t /usr/local/nagios/share/

Click localhost on the status page to view the status of the local machine.

4. Configure the monitored host 192.168.1.20 (MySQL and web)

1) Install the Nagios plug-ins and NRPE

yum -y install openssl openssl-devel
useradd nagios -s /sbin/nologin
tar zxf nagios-plugins-1.5.tar.gz
cd nagios-plugins-1.5
./configure --prefix=/usr/local/nagios
make && make install
chown -R nagios:nagios /usr/local/nagios
cd ..   # back to the directory with the tarballs (assumes both archives are in the same place)
tar zxf nrpe-2.15.tar.gz
cd nrpe-2.15
./configure --prefix=/usr/local/nagios
make all && make install-plugin && make install-daemon
make install-daemon-config

2) After the installation is complete, open /usr/local/nagios/etc/nrpe.cfg with vim

Add the address of the nagios server
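In nrpe.cfg the permitted pollers are listed in the allowed_hosts directive. As a sketch of the edit, the helper below rewrites that line with sed instead of vim. The function name set_allowed_hosts is hypothetical, and 192.168.1.10 is only a placeholder because the address of the Nagios server is not given in this article. The demonstration runs on a scratch file rather than on the real nrpe.cfg.

```shell
#!/bin/sh
# Sketch: allow a Nagios server to query NRPE by rewriting allowed_hosts.
# Usage: set_allowed_hosts <path to nrpe.cfg> <Nagios server IP>
set_allowed_hosts() {
    cfg=$1 server_ip=$2
    # Keep 127.0.0.1 so local test queries still work; the original file
    # is preserved with a .bak suffix.
    sed -i.bak "s/^allowed_hosts=.*/allowed_hosts=127.0.0.1,${server_ip}/" "$cfg"
}

# Demonstration on a scratch copy (192.168.1.10 is a placeholder address).
demo_cfg=$(mktemp)
echo "allowed_hosts=127.0.0.1" > "$demo_cfg"
set_allowed_hosts "$demo_cfg" 192.168.1.10
grep '^allowed_hosts=' "$demo_cfg"
rm -f "$demo_cfg" "$demo_cfg.bak"
```

On the monitored host the call would be set_allowed_hosts /usr/local/nagios/etc/nrpe.cfg followed by the real address of the Nagios server, and NRPE has to be restarted afterwards.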

3) Start NRPE

4) From the Nagios server, test whether NRPE is running properly; a normal reply from the NRPE daemon indicates that it is.
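One way to run this test is to call check_nrpe directly from the Nagios server, as sketched below. The plug-in path assumes the /usr/local/nagios prefix used throughout this article, and the wrapper function test_nrpe is hypothetical; it only guards against the plug-in being missing.

```shell
#!/bin/sh
# Sketch: query the NRPE daemon on the monitored host from the Nagios server.
CHECK_NRPE=/usr/local/nagios/libexec/check_nrpe   # assumed install path
TARGET=192.168.1.20

test_nrpe() {
    if [ ! -x "$CHECK_NRPE" ]; then
        echo "check_nrpe not found at $CHECK_NRPE"
        return 3
    fi
    # With no -c option, check_nrpe asks the daemon for its version string.
    "$CHECK_NRPE" -H "$TARGET"
}

rc=0
test_nrpe || rc=$?
echo "exit code: $rc"
```

With nrpe-2.15 installed on the monitored host, a healthy daemon answers with its version string (of the form NRPE v2.15) and exit code 0. A connection error usually points to the firewall or to the allowed_hosts setting in nrpe.cfg.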

5) Open the Nagios page in a browser

5. Supplement

You can also move the service parameters from the 192.168.1.20.cfg file into a services.cfg file

# vi /usr/local/nagios/etc/objects/services.cfg

The contents are as follows:

Check the number of users currently logged in to the remote host: report WARNING if there are more than 20 users and CRITICAL if there are more than 50.

Check free disk space: report WARNING if less than 20% is free and CRITICAL if less than 10% is free.

Check the total number of processes on the remote host: report WARNING above 250 processes and CRITICAL above 400. Process states are S (sleeping), R (running), Z (zombie), D (uninterruptible sleep), and T (stopped).

Check the load with check_load -w 5,4,3 -c 10,6,4, which means the following:

The status is WARNING when the average number of waiting processes exceeds 5 over 1 minute, 4 over 5 minutes, or 3 over 15 minutes.

The status is CRITICAL when the average number of waiting processes exceeds 10 over 1 minute, 6 over 5 minutes, or 4 over 15 minutes.
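The threshold rules for the load check can be tried out with a small helper. This is a sketch of the rule as described here, not the check_load source: it assumes that a single load average above its threshold is enough to raise the state, and the function name load_state is hypothetical.

```shell
#!/bin/sh
# Sketch of the load threshold rule (load_state is a hypothetical helper).
# Usage: load_state "<load1,load5,load15>" "<warn1,warn5,warn15>" "<crit1,crit5,crit15>"
load_state() {
    echo "$1 $2 $3" | awk '{
        split($1, la, ","); split($2, warn, ","); split($3, crit, ",")
        state = "OK"
        for (i = 1; i <= 3; i++) {
            # Any average above its critical threshold decides the result.
            if (la[i] + 0 > crit[i] + 0) { state = "CRITICAL"; break }
            if (la[i] + 0 > warn[i] + 0) state = "WARNING"
        }
        print state
    }'
}

# 5-minute load 4.5 is above the warning threshold 4 but below critical 6.
load_state "4.2,4.5,2.0" "5,4,3" "10,6,4"
```

With the thresholds from the command above, this call prints WARNING.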

The service group is optional; it only affects how services are grouped on the Nagios monitoring page.
