2025-02-28 Update. From: SLTechnology News&Howtos
Shulou(Shulou.com)05/31 Report--
This article presents a case analysis of an Orabbix monitoring failure in Zabbix: what stopped working after a server relocation, how the problem was narrowed down, and how monitoring was restored.
Since we started using Orabbix to monitor Oracle, a lot of work can be handled in a configurable, controllable way. Some of the problems it catches are potential problems and some are legacy problems, and either way it has improved efficiency to some degree.
Recently our computer room was relocated, and the Zabbix server was part of the migration plan. Because the deployment is not large, Orabbix and Zabbix Server run on the same host. The problem appeared after the relocation: with the network firewall rules opened, system-level monitoring through Zabbix Agent was normal, but the previously working Orabbix no longer returned any monitoring data.
With database monitoring effectively down, I kept receiving alarm messages like the following, which is a sensitive matter for a core business system.
ZABBIX monitoring system:
Alarm content: Alive xxxx
Alarm level: PROBLEM
Monitoring project: Alive:0
Alarm time: July 21, 2017, 22:25:40
Looking at the Orabbix log, I found that although the database connection succeeds, a null pointer exception is thrown afterwards.
[root@orabbx_monitor logs]# tail -f orabbix.log
2017-07-22 23 1514 43168 [main] INFO Orabbix - maxIdleSize=1
2017-07-22 23 1514 43168 [main] INFO Orabbix - maxIdleTime=1800000ms
2017-07-22 23 1514 43168 [main] INFO Orabbix - poolTimeout=100
2017-07-22 23 1514 43168 [main] INFO Orabbix - timeBetweenEvictionRunsMillis=-1
2017-07-22 23 1514 439 [main] INFO Orabbix - numTestsPerEvictionRun=3
2017-07-22 23 1543774 [main] INFO Orabbix - Connected as ORABBIX
2017-07-22 23 1514 43778 [main] INFO Orabbix - -- on Database -> test
ERROR Orabbix - Error on dbJob for database test QueryList error: java.lang.NullPointerException
INFO Orabbix - Done with dbJob on database test QueryList elapsed time 1089 ms
This makes the problem hard to analyze, because it is not clear where the fault lies: Zabbix Server, Zabbix Agent, or Orabbix itself.
The null pointer exception is very vague. Still, from what we know we can basically conclude that Zabbix Server is fine: if it were not, the system-level monitoring through Zabbix Agent would fail as well. Orabbix plays a role similar to a Zabbix Agent; in essence, it sends SQL over JDBC to collect the monitoring data.
So my attention naturally turned to Orabbix. First, I reduced the list of monitored databases to one or two to make troubleshooting easier.
After some investigation, the main findings were these. First, the Zabbix Agent on the Zabbix Server host had not been started, and Orabbix needs it. Second, the server's IP address changed with the relocation, so the local firewall rules had to be updated; for example, the host must allow access to its own port 10050. This machine is the Zabbix Server, but it is also a server that has to be monitored itself, and Orabbix depends on this Zabbix Agent. It is also important to update the IP information in /etc/hosts.
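The reachability checks described above can be sketched in a few lines of Python. This is a minimal sketch, not part of the original troubleshooting: the function names are hypothetical, and it assumes the conventional ports (10050 for Zabbix Agent, which the article mentions, and 10051 for the Zabbix Server trapper, which appears later in the zabbix_sender command). It only tests whether a TCP connection can be opened, which is enough to catch a stopped agent or a missing firewall rule.

```python
import socket


def port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution
        # failures (for example a stale /etc/hosts entry).
        return False


def check_ports(host, ports=(10050, 10051)):
    """Map each port to its reachability. The defaults are assumptions:
    10050 = Zabbix Agent, 10051 = Zabbix Server trapper."""
    return {port: port_open(host, port) for port in ports}
```

Calling `check_ports("127.0.0.1")` on the monitoring host, and again with the host name that Zabbix uses for it, would show whether the agent is listening and whether the name resolves to the new address.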
After doing this I restarted Orabbix, and the problem remained. I restarted Zabbix Agent and Zabbix Server, and the problem was still the same.
The log output was also very sparse. After turning on debug mode there was more of it, but it was all failure messages, and I found nothing useful in it for the time being.
So I decided to determine the boundary of the problem by comparison.
The structure of Orabbix is simple in nature, as the official architecture diagram shows.
[Figure: official Orabbix architecture diagram]
Debugging in the existing environment was making no progress, and the database-level monitoring items were still not taking effect, so the first priority was to get Orabbix working again. How? Since debugging in place was going nowhere, I simply built a new instance. It takes about 10 minutes: install Java, unpack the Orabbix package, and start it.
In Orabbix, the database connection settings are controlled by the config.properties file and the monitoring item definitions by query.properties; in Zabbix, the Orabbix monitoring items are controlled by templates. Analyzing an Orabbix problem therefore mainly comes down to these three pieces.
After configuring the new Orabbix, I left the monitoring items at their defaults and found that Orabbix worked, which means there is nothing wrong with the config.properties file.
When I replaced the default query.properties with the one from the failing environment, Orabbix threw the original error on startup, which shows that the problem lies in the query.properties file.
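Once the fault is isolated to query.properties, a quick consistency check on the file can help narrow it further. The sketch below is an illustration under stated assumptions, not the method used in this case: it assumes the layout found in stock Orabbix files, where a `QueryList` key holds comma-separated query names and each name has a matching `<name>.Query` entry, and it uses a deliberately simplified properties parser (only `key=value` lines, `#` or `!` comments, and backslash line continuations). A name listed without a query is only one plausible trigger for the NullPointerException; the article does not identify the root cause.

```python
def parse_properties(text):
    """Parse a simplified subset of the Java .properties format."""
    props = {}
    pending = ""
    # The trailing empty line flushes a continuation left open at end of file.
    for raw in text.splitlines() + [""]:
        line = pending + raw.strip()
        pending = ""
        if not line or line.startswith(("#", "!")):
            continue
        if line.endswith("\\"):
            # Backslash at end of line: join with the next physical line.
            pending = line[:-1]
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def find_broken_queries(props, list_key="QueryList"):
    """Return names in the query list that have no non-empty <name>.Query.
    The key names are assumptions based on the stock file layout."""
    names = [n.strip() for n in props.get(list_key, "").split(",") if n.strip()]
    return [n for n in names if not props.get(n + ".Query")]
```

Reading the file with `open("query.properties").read()` and passing the result through both functions returns the suspicious names; an empty list means this particular inconsistency is not present and the search has to continue elsewhere.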
So how to locate the problem? After I restored the default query.properties, monitoring returned to normal. But I had customized a large number of monitoring items that are not in the default template, so could there be something wrong with the monitoring template?
To test this, I took a middle path: I used the default template and then added one custom monitoring item, only to find that the item could not get any data. The error message in Zabbix is as follows:
[Figure: Zabbix error message for the monitoring item]
It appeared to be caused by a data type mismatch. I changed the item's data type to text and found that the value of the monitoring item was an empty string, which explains why the type conversion failed.
Why did it work before? Was something going wrong when Orabbix pushed the data to Zabbix? The monitoring items are pushed as Zabbix trapper items, so we can use zabbix_sender to push a value and see whether it succeeds, for example with the following command.
./zabbix_sender -z 10.129.xx.xx -p 10051 -s "test" -k db_time -o "test"
info from server: "processed: 1; failed: 0; total: 1; seconds spent: 0.000051"
sent: 1; skipped: 0; total: 1
Here 10.129.xx.xx is the IP address of the Zabbix Server and 10051 is its port. The -s value test is the instance name of the database, that is, the host name configured in Zabbix; db_time is the key of the monitoring item; and the final -o "test" means that the value sent is test.
The result shows that the push works normally, so the scope of the investigation narrows further, and the template can basically be ruled out, because the value was pushed and processed without problems.
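For reference, what zabbix_sender puts on the wire is small enough to sketch in Python. This is a minimal sketch of the Zabbix sender protocol as publicly documented (a `ZBXD` signature, a flags byte of 0x01, an 8-byte little-endian body length, then a JSON body); the function names are hypothetical, and the host, key, and value simply mirror the command above. It is not how Orabbix itself is implemented.

```python
import json
import socket
import struct


def build_sender_packet(host, key, value):
    """Build one trapper packet: 'ZBXD', flags 0x01, length, JSON body."""
    body = json.dumps({
        "request": "sender data",
        "data": [{"host": host, "key": key, "value": str(value)}],
    }).encode("utf-8")
    return b"ZBXD\x01" + struct.pack("<Q", len(body)) + body


def _recv_exact(sock, n):
    """Read exactly n bytes or raise if the peer closes early."""
    chunks = []
    remaining = n
    while remaining:
        chunk = sock.recv(remaining)
        if not chunk:
            raise ConnectionError("connection closed before full reply")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def send_value(server, port, host, key, value, timeout=5.0):
    """Send one value to the trapper port and return the decoded JSON reply."""
    packet = build_sender_packet(host, key, value)
    with socket.create_connection((server, port), timeout=timeout) as sock:
        sock.sendall(packet)
        header = _recv_exact(sock, 13)  # 4 signature + 1 flags + 8 length
        (length,) = struct.unpack("<Q", header[5:13])
        return json.loads(_recv_exact(sock, length).decode("utf-8"))
```

A call such as `send_value(server_ip, 10051, "test", "db_time", "test")` should return a reply whose `info` field carries the same processed/failed counters that zabbix_sender prints, which makes it easy to script the same check for several items.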
So after going full circle, the problem is back in the query.properties file. The null pointer exception looks strange, but for now we can compromise: configure a few high-priority monitoring items so that monitoring takes effect, and fine-tune the monitoring list later. One thing is certain: Orabbix had never been stopped since it was first started, so it cannot be ruled out that some check in Orabbix itself fails after a restart.
So I debugged the query.properties file on the new server until it worked, backed up the original, and copied the working file into the original directory. The connection settings stay the same, the template stays the same, most of the monitoring items are retained, and the whole Orabbix monitoring runs again.
For example, the DB time monitoring item is collecting data again.
[Figure: DB time monitoring graph]