2025-04-06 Update From: SLTechnology News&Howtos
Shulou (Shulou.com) 06/02 Report --
The HDFS capacity shown in Cloudera Manager (CM) can differ from what the command line reports, and many newcomers are unsure how to analyze the discrepancy. This article summarizes the cause of the problem and how to resolve it; I hope it helps you work through the same issue.
1. Problem description
Cloudera Manager shows the HDFS capacity usage as 103.9 GB.
The HDFS NameNode web UI (port 50070) shows the HDFS usage as 41.63 GB.
The hadoop fs -du -h / command likewise reports the HDFS usage as 41.63 GB.
Question: why is the HDFS space usage shown in Cloudera Manager so much larger than what HDFS actually uses?
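The size of the gap itself is easy to pin down before digging into the cause. A quick check with standard shell tools, using the figures reported above (awk is used only for the floating-point arithmetic):

```shell
# Capacity figure from CM vs. usage reported by `hadoop fs -du -h /` (GB)
cm_capacity=103.9
dfs_used=41.63

# The unaccounted-for space that the rest of this article explains
gap=$(awk -v c=$cm_capacity -v u=$dfs_used 'BEGIN { printf "%.2f", c - u }')
echo "Unaccounted space: ${gap} GB"
```

This prints an unaccounted 62.27 GB, which is what the analysis below attributes to "Non DFS Used".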
2. Problem analysis
Hovering over the HDFS capacity chart in Cloudera Manager displays an explanation of the capacity usage, as shown in the figure below:
CM shows that the configured HDFS capacity consists of two parts: the space used by DFS and the space used by non-DFS.
Next, look at the statistics on the HDFS 50070 web UI; it reports two figures: DFS Used and Non DFS Used.
The sum of DFS Used and Non DFS Used exactly matches the configured capacity of 103.9 GB displayed in Cloudera Manager.
This raises the next question: what is the "Non DFS Used" space, and how is it calculated?
3. Non DFS Used explained
Here Fayson uses one cluster node, cdh03, as an example. Below is the disk mount information for the cdh03 node; the /data/disk1 disk is the data directory configured for HDFS.
1. In the DataNode configuration of HDFS, dfs.datanode.du.reserved reserves a certain amount of space on each HDFS data disk; the default is 10 GB.
So the space available to HDFS on this disk is 100 GB - 9.99 GB ≈ 90 GB.
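The reserved-space deduction above can be reproduced directly (the 9.99 GB figure is the reserved value as displayed, i.e. the 10 GB default after rounding):

```shell
disk_total=100    # total capacity of /data/disk1, in GB
reserved=9.99     # dfs.datanode.du.reserved as displayed (the 10 GB default, rounded)

# Space HDFS is allowed to use on this disk = total minus the reservation
usable=$(awk -v t=$disk_total -v r=$reserved 'BEGIN { printf "%.2f", t - r }')
echo "Space available to HDFS: ${usable} GB"
```

This prints 90.01 GB, i.e. the roughly 90 GB quoted above.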
2. Use the hadoop dfsadmin -report command to view each node's share of the HDFS space usage:
As shown in the screenshot above, the DFS usage report for the cdh03.fayson.com node includes the node's total DFS capacity, used capacity, available capacity, and "Non DFS Used".
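If you only want the "Non DFS Used" line per node, the report's plain-text output is easy to filter. The sample below is a hypothetical, trimmed stand-in for one node's section of hadoop dfsadmin -report output (the byte counts and GB figures are illustrative, chosen to be internally consistent, not copied from the real cluster):

```shell
# Hypothetical excerpt of `hadoop dfsadmin -report` output for one DataNode;
# the real command prints one such section per node.
report='Name: 192.168.0.203:50010 (cdh03.fayson.com)
Configured Capacity: 96636764160 (90.00 GB)
DFS Used: 14903536517 (13.88 GB)
Non DFS Used: 19939385671 (18.57 GB)
DFS Remaining: 61793841971 (57.55 GB)'

# Extract just the "Non DFS Used" figure from the section
non_dfs=$(printf '%s\n' "$report" | awk -F': ' '/^Non DFS Used/ { print $2 }')
echo "cdh03 Non DFS Used: ${non_dfs}"
```

In practice you would pipe the real command into the same awk filter, e.g. hadoop dfsadmin -report | grep 'Non DFS Used'.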
3. Non DFS Used is calculated approximately as:
Non DFS Used = total disk capacity - reserved capacity (dfs.datanode.du.reserved) - DFS Used - DFS Remaining
100 GB - 10 GB - 13.88 GB - 57.55 GB ≈ 18.03 GB
(The figures in the screenshots are rounded, so the arithmetic only approximately matches the reported value.)
Following this conclusion: when we reserve 10 GB on the data disk for the operating system or other non-HDFS files, the space available to DFS is 90 GB. But if non-HDFS files grow beyond the 10 GB reserve, the excess eats into the 90 GB configured for DFS; "Non DFS Used" is exactly that portion of the DFS capacity consumed by non-HDFS data.
4. Summary
The HDFS capacity shown in Cloudera Manager is divided into two parts: space used by DFS and space used by non-DFS.
"Non DFS Used" is the space, beyond the reserved space, occupied on each disk of a DataNode by non-HDFS files (such as Kudu data, Kafka data, users' own files, etc.).
© 2024 shulou.com SLNews company. All rights reserved.