
0023 - Analysis of cluster failures caused by a hosts file misconfiguration

2025-01-19 Update From: SLTechnology News&Howtos


1. Problem symptoms

HDFS, YARN, Hive and other services in the Hadoop cluster raise health alarms.

After restarting the cluster, a large number of alarms remain:

Cluster 1

HDFS: Free Space, NameNode Health, HDFS Canary

DataNode (ip-172-31-10-118): NameNode Connectivity

DataNode (ip-172-31-5-190): NameNode Connectivity

DataNode (ip-172-31-9-33): NameNode Connectivity

Hive Metastore Server (ip-172-31-6-148): Hive Metastore Canary

Impala Daemon (ip-172-31-10-118): Process Status

Impala Daemon (ip-172-31-5-190): Process Status

Impala Daemon (ip-172-31-9-33): Process Status

NameNode (ip-172-31-6-148): Safe Mode Status

Server (ip-172-31-5-190): Quorum Membership

The ZooKeeper service raises a "Quorum Membership" alarm.

The role logs for all services on the CM node cannot be viewed through the Cloudera Manager console; an error is displayed instead.

2. Reproducing the problem

Cluster environment:

CDH 5.12.0 with the cluster services HDFS, Hive, YARN, ZooKeeper, Hue, Impala, Kudu and Oozie

1. Recreate the on-site configuration. The hosts file (/etc/hosts) on every server is configured as follows:

127.0.0.1 ip-172-31-10-156.ap-southeast-1.compute.internal
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
172.31.8.141 ip-172-31-8-141.ap-southeast-1.compute.internal
172.31.1.175 ip-172-31-1-175.ap-southeast-1.compute.internal
172.31.9.186 ip-172-31-9-186.ap-southeast-1.compute.internal
172.31.10.156 ip-172-31-10-156.ap-southeast-1.compute.internal

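The first line is the extra, erroneous entry: it maps the node's own hostname to 127.0.0.1, even though the same hostname is also mapped to its real address 172.31.10.156 on the last line.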
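A check for this kind of entry can be automated. The sketch below is only an illustration: the function name `find_loopback_hostnames` is hypothetical, and it assumes that any name starting with `localhost` is a legitimate loopback alias while every other name on a loopback line is suspicious.

```python
import ipaddress


def find_loopback_hostnames(hosts_text):
    """Return (line_number, address, name) for each non-localhost name on a loopback line."""
    findings = []
    for line_number, raw in enumerate(hosts_text.splitlines(), start=1):
        fields = raw.split("#", 1)[0].split()  # drop comments, split on whitespace
        if len(fields) < 2:
            continue  # blank line, comment, or address without names
        try:
            address = ipaddress.ip_address(fields[0])
        except ValueError:
            continue  # first field is not an IP address
        if not address.is_loopback:
            continue
        for name in fields[1:]:
            # Assumption: only localhost-style aliases belong on loopback lines.
            if not name.startswith("localhost"):
                findings.append((line_number, fields[0], name))
    return findings


SAMPLE = """\
127.0.0.1 ip-172-31-10-156.ap-southeast-1.compute.internal
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
172.31.8.141 ip-172-31-8-141.ap-southeast-1.compute.internal
172.31.10.156 ip-172-31-10-156.ap-southeast-1.compute.internal
"""

for finding in find_loopback_hostnames(SAMPLE):
    print(finding)
```

On a real node the input would be the contents of /etc/hosts rather than the inline sample.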

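With this file in place, pinging the host's own hostname from that host resolves the name to 127.0.0.1 instead of the node's real IP address.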
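Why the loopback entry takes effect can be illustrated with a small lookup simulation. This is a sketch under an assumption: it models a resolver that returns the address from the first line containing the name, which is the usual behaviour for simple hosts-file lookups but is not taken from the original article. The helper name `first_match` is hypothetical.

```python
def first_match(hosts_text, hostname):
    """Return the address on the first hosts line that lists hostname, or None."""
    for raw in hosts_text.splitlines():
        fields = raw.split("#", 1)[0].split()  # ignore comments
        if len(fields) >= 2 and hostname in fields[1:]:
            return fields[0]
    return None


NAME = "ip-172-31-10-156.ap-southeast-1.compute.internal"

BROKEN = (
    "127.0.0.1 " + NAME + "\n"
    "127.0.0.1 localhost localhost.localdomain\n"
    "172.31.10.156 " + NAME + "\n"
)

# Same file with the first (erroneous) line commented out.
FIXED = "# " + BROKEN

print(first_match(BROKEN, NAME))
print(first_match(FIXED, NAME))
```

The first call models the misconfigured node and the second models the same node after the extra line is disabled.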

2. Restart the cluster services

CM then reports a large number of alarms:

Cluster 1

HDFS: Free Space, NameNode Health, HDFS Canary

DataNode (ip-172-31-10-118): NameNode Connectivity

DataNode (ip-172-31-5-190): NameNode Connectivity

DataNode (ip-172-31-9-33): NameNode Connectivity

Hive Metastore Server (ip-172-31-6-148): Hive Metastore Canary

HiveServer2 (ip-172-31-6-148): Process Status

Impala Daemon (ip-172-31-10-118): Process Status

Impala Daemon (ip-172-31-5-190): Process Status

Impala Daemon (ip-172-31-9-33): Process Status

NameNode (ip-172-31-6-148): Safe Mode Status

Server (ip-172-31-5-190): Quorum Membership

ip-172-31-10-118: Agent Status

ip-172-31-5-190: Agent Status

ip-172-31-9-33: Agent Status

The ZooKeeper alarm matches the one seen on site.

Viewing the logs for the CM node fails with the exception "Connection refused".

Host List Monitoring Status

3. Cause of the problem

While the cluster was running normally, the hosts file on all nodes was modified so that each node's own hostname mapped to 127.0.0.1, which resulted in the alarms described above.
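A node can be checked for this condition directly. The following is a minimal sketch, not the article's procedure: `resolves_to_loopback` is a hypothetical helper, and the `resolver` parameter exists only so the logic can be exercised without touching the real system configuration. A likely explanation for the alarms, stated here as an assumption, is that a service looking up its own hostname receives a loopback address that other nodes cannot reach.

```python
import ipaddress
import socket


def resolves_to_loopback(hostname=None, resolver=socket.gethostbyname):
    """Resolve hostname (default: this machine's) and report whether it is loopback."""
    hostname = hostname or socket.gethostname()
    address = resolver(hostname)
    return hostname, address, ipaddress.ip_address(address).is_loopback


if __name__ == "__main__":
    try:
        name, addr, bad = resolves_to_loopback()
    except OSError as exc:  # socket.gaierror is a subclass of OSError
        print("could not resolve local hostname:", exc)
    else:
        status = "loopback (check the hosts file)" if bad else "ok"
        print(name, "->", addr, status)
```

Running it on every node before restarting services gives a quick indication of whether the hostname mapping is sane. Note that some distributions map the hostname to a loopback address by default, so a positive result needs to be read in context.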

4. Solution

Edit the hosts file on every node and comment out the 127.0.0.1 line that maps the node's hostname.
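The edit can be sketched as a text transformation. This is an illustrative helper only (`comment_out_loopback_hostnames` is a hypothetical name); it assumes that names starting with `localhost` are the only legitimate loopback aliases, and it returns the new text instead of writing to /etc/hosts so that the result can be reviewed first.

```python
import ipaddress


def comment_out_loopback_hostnames(hosts_text):
    """Prefix '# ' to loopback lines that list any non-localhost name."""
    fixed = []
    for raw in hosts_text.splitlines():
        fields = raw.split("#", 1)[0].split()  # already-commented lines yield no fields
        bad = False
        if len(fields) >= 2:
            try:
                is_loopback = ipaddress.ip_address(fields[0]).is_loopback
            except ValueError:
                is_loopback = False  # first field is not an IP address
            # Caveat: a line mixing localhost and a real hostname is commented out whole.
            bad = is_loopback and any(not n.startswith("localhost") for n in fields[1:])
        fixed.append("# " + raw if bad else raw)
    return "\n".join(fixed) + "\n"


BEFORE = (
    "127.0.0.1 ip-172-31-10-156.ap-southeast-1.compute.internal\n"
    "127.0.0.1 localhost localhost.localdomain\n"
    "172.31.10.156 ip-172-31-10-156.ap-southeast-1.compute.internal\n"
)

print(comment_out_loopback_hostnames(BEFORE), end="")
```

Applying the function a second time leaves the text unchanged, because a line that is already commented out is skipped.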

Restart the cluster services; the cluster returns to normal.


Original article. Reprints are welcome; please credit the WeChat official account Hadoop Practical Operation.
