This article explains how to deal with imbalanced data disk sizes on HDFS DataNode (DN) nodes. It walks through a real case many people run into, from the first symptoms to the final fix, along with the reasoning behind each decision.
Problem description
When the cluster was built, the operations team (for historical reasons) set up each DataNode's data disks as two RAID arrays: four disks combined into a 7.2 TB array, sdb1, mounted as data1, and two disks combined into a 3.6 TB array, sdc1, mounted as data2. This went unnoticed at first. After the cluster had been running for a while and the data volume grew, many disks in the cluster crossed the 90% alarm threshold. Lang Tip had set the disk alarm threshold to 90%; exceeding it triggers an SMS or WeChat alert so the disk can be dealt with before it fills up. Yet the disk utilization reported by Hadoop's monitoring metrics stayed around 55%, which should not trigger any alarm at all. The same utilization figure is also visible in the HDFS NameNode web UI.
The natural suspicion at this point is that HDFS has concentrated data on a few DataNodes, causing disk alarms on those nodes. But as is well known, even when DataNodes join the cluster with heterogeneous disks, Hadoop automatically balances data across DataNodes, so this suspicion can be ruled out.
Logging in to an alarming node shows that the data2 disk is more than 90% full, while data1 is below 50%.
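A quick way to confirm this on an alarming node is to check the two mount points directly. A minimal sketch, assuming the /data1 and /data2 mount points described above:

```bash
# Per-array utilization on the alarming DataNode:
# data2 (the small 3.6 TB array) shows >90% used, data1 (7.2 TB) stays under 50%.
df -h /data1 /data2
```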
At this point the problem is obvious: before Hadoop 3.0, HDFS only supported balancing data between DataNode nodes, not between the disks inside a single DataNode.
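For contrast, the balancing HDFS did offer at that time is the inter-node Balancer, which moves blocks between DataNodes but never between the disks of one DataNode. A minimal sketch (the threshold value is just an example):

```bash
# Inter-node HDFS Balancer: evens out utilization ACROSS DataNodes only;
# it does nothing about skew between disks inside one DataNode.
# -threshold is the allowed deviation, in percentage points, from the cluster-average utilization.
hdfs balancer -threshold 10
```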
So what should we do at this time?
At first
Lang Tip's first idea was to split the data1 RAID array into two arrays of two disks each, then bring the DataNodes back online in a rolling fashion and migrate the data, or rebalance it through replica re-creation. However, this approach was quickly rejected, and the reason is simple: hundreds of TB of data would have to be rebalanced across the cluster, so even with rolling restarts the work on that many machines would take a very long time, and while the data was being migrated or rebalanced the whole cluster's network bandwidth and disks would be under heavy load, reducing the cluster's availability.
Then
Checking the official Hadoop documentation shows that Hadoop 3.0 supports not only balancing data between DataNodes, but also balancing data across the multiple disks managed by a single DataNode.
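For completeness, the Hadoop 3.x feature in question is the HDFS Disk Balancer. A rough sketch of its workflow, where the hostname and plan-file path are placeholders and dfs.disk.balancer.enabled must be true on the DataNodes:

```bash
# 1. Generate a plan describing how to move blocks between this DataNode's disks
hdfs diskbalancer -plan dn1.example.com

# 2. Execute the plan file printed by the previous step (path is a placeholder)
hdfs diskbalancer -execute /system/diskbalancer/<timestamp>/dn1.example.com.plan.json

# 3. Check progress of the plan on that node
hdfs diskbalancer -query dn1.example.com
```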
Upgrading the cluster to Hadoop 3.0 was therefore an option, but after weighing it up, the upgrade would cost too much time to be worth it, so this scheme was also dropped.
Finally
After more thought, a very simple solution finally emerged, one that improves the utilization of the large disk with nothing more than a DataNode restart. The key fact is that a DataNode manages its disks through the directories listed in the dfs.data.dir parameter. Given that, the idea is straightforward: list more than one directory on data1, which increases the probability that new blocks are written to it and thus raises its utilization. The configuration is as follows:
dfs.data.dir = /data1/dfs/dn,/data1/dfs/dn1,/data2/dfs/dn
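The property lives in hdfs-site.xml on each DataNode (on newer Hadoop releases its canonical name is dfs.datanode.data.dir, with dfs.data.dir kept as a deprecated alias). The new directory has to exist before the DataNode is restarted to pick it up. A hedged sketch, assuming the stock hadoop-daemon.sh scripts; a cluster manager such as Ambari or Cloudera Manager would handle the edit and restart in its own way:

```bash
# On each DataNode, after editing dfs.data.dir in hdfs-site.xml:

# Create the extra directory on the large data1 array, with the same
# owner and permissions as the existing DataNode data directories.
mkdir -p /data1/dfs/dn1
chown hdfs:hadoop /data1/dfs/dn1   # adjust owner/group to match your existing dirs

# Restart the DataNode so it starts using the new directory list.
hadoop-daemon.sh stop datanode
hadoop-daemon.sh start datanode
```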
With the configuration in place and the DataNodes restarted, checking the directory sizes after some time shows that new data is indeed being written there.
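One way to do that periodic check, a sketch using the directory paths configured above:

```bash
# The two directories on data1 together should now receive roughly twice as
# many new blocks as the single directory on data2.
du -sh /data1/dfs/dn /data1/dfs/dn1 /data2/dfs/dn
```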
This proves that the idea is feasible.
The drawback of this method is that existing data is not rebalanced; adding directories only increases the probability that new data lands on the large disk. That is acceptable here, though: the old data will be deleted over time anyway.
This concludes the discussion of how to deal with imbalanced data disk sizes on HDFS DataNode nodes. Thanks for reading!