
How does HDFS do file management and fault tolerance


This article mainly explains how HDFS handles file management and fault tolerance. The content is simple, clear, and easy to learn; follow along with the editor to study how HDFS manages files and tolerates faults.

HDFS file management

1. Block distribution of HDFS

HDFS splits a data file into blocks for storage and saves multiple copies of these blocks on different DataNodes. The number of block replicas in HDFS is determined by the dfs.replication property in the hdfs-site.xml file, which is configured as follows:

<property>
    <name>dfs.replication</name>
    <value>3</value>
</property>

The default number of replicas in Hadoop is 3, and Hadoop also applies a rack-aware placement strategy. The default layout policy, that is, the default replica placement policy, is as follows (a small client-side sketch for inspecting block placement follows the list):

(1) the first copy is stored on the same node as the HDFS client (if the client is not running on a cluster node, a node is chosen at random).

(2) the second copy is stored on a different rack from the first copy and is a randomly selected node.

(3) the third copy is stored on the same rack as the second copy and is on a different node.
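As a rough illustration of block distribution, the following sketch uses the Hadoop Java client to print the replication factor of a file and the DataNodes holding each of its blocks. The NameNode address and the file path are placeholders, not values from the article.

// Minimal sketch (Hadoop Java client): list where each block of a file is stored.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.BlockLocation;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class BlockLayout {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.set("fs.defaultFS", "hdfs://namenode:8020");   // placeholder NameNode address

        FileSystem fs = FileSystem.get(conf);
        Path file = new Path("/user/test/data.txt");         // placeholder path

        FileStatus status = fs.getFileStatus(file);
        System.out.println("Replication factor: " + status.getReplication());

        // One BlockLocation per block; getHosts() lists the DataNodes holding a copy.
        BlockLocation[] blocks = fs.getFileBlockLocations(status, 0, status.getLen());
        for (BlockLocation block : blocks) {
            System.out.println("Block at offset " + block.getOffset()
                    + " stored on " + String.join(", ", block.getHosts()));
        }
        fs.close();
    }
}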

2. Data reading

The data reading process of HDFS requires the client to access NameNode to obtain metadata information, and then read the data on the specific DataNode, as shown in the following figure:

(1) the client initiates a request to NameNode to read the metadata information. The NameNode stores metadata information for the entire HDFS cluster, including file names, owners, groups, permissions, blocks, and DataNode lists.

During this process, the NameNode also verifies the client's identity, checks whether the requested file exists, and confirms that the client has permission to read it.

(2) NameNode returns the relevant metadata information to the client.

(3) the client reads the corresponding data block on the specified DataNode.

(4) DataNode returns the corresponding block information.

Steps (3) and (4) continue until all blocks of the file have been read or the HDFS client actively closes the file stream.
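A minimal read sketch with the Hadoop Java client, mirroring the steps above, might look like the following; the NameNode address and the file path are placeholders.

// Minimal HDFS read sketch (Hadoop Java client).
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;

public class HdfsRead {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.set("fs.defaultFS", "hdfs://namenode:8020");  // placeholder NameNode address

        // fs.open() asks the NameNode for the file's metadata and block locations
        // (steps 1-2); reading from the stream pulls the blocks from DataNodes (steps 3-4).
        try (FileSystem fs = FileSystem.get(conf);
             FSDataInputStream in = fs.open(new Path("/user/test/data.txt"))) {
            IOUtils.copyBytes(in, System.out, 4096, false);
        }
    }
}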

3. Data writing

The data writing process in HDFS also requires the client to access NameNode, obtain metadata information, and then write data to the specific DataNode, as shown in the figure.

Here are the specific steps:

(1) the client requests metadata information from the NameNode. In this process, the NameNode verifies the client's identity and checks whether the client has write permission.

(2) NameNode returns the corresponding metadata information to the client.

(3) client writes data to the first DataNode.

(4) the first DataNode writes data to the second DataNode.

(5) the second DataNode writes data to the third DataNode.

(6) the third DataNode returns confirmation result information to the second DataNode.

(7) the second DataNode returns confirmation result information to the first DataNode.

(8) the first DataNode returns confirmation result information to the client.

Among them, steps (4) and (5) are executed asynchronously. If some DataNodes in the pipeline fail or an error occurs, the write still succeeds as long as the minimum required number of replicas is written correctly, and the data can be recovered from those replicas.

The minimum number of required data copies is determined by the dfs.namenode.replication.min attribute in the hdfs-site.xml file, which is configured as follows:

<property>
    <name>dfs.namenode.replication.min</name>
    <value>1</value>
</property>

The minimum number of required data copies defaults to 1, which means that as long as one copy of the data is written correctly, the write is considered successful and the data can be recovered from that copy.
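A minimal write sketch with the Hadoop Java client, mirroring steps (1) to (8), could look like this; the NameNode address, path, and data are placeholders, and dfs.namenode.replication.min is a NameNode-side setting, so it is not touched here.

// Minimal HDFS write sketch (Hadoop Java client).
import java.nio.charset.StandardCharsets;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsWrite {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.set("fs.defaultFS", "hdfs://namenode:8020");  // placeholder NameNode address
        conf.set("dfs.replication", "3");                  // client-side replication factor

        try (FileSystem fs = FileSystem.get(conf);
             FSDataOutputStream out = fs.create(new Path("/user/test/output.txt"))) {
            // Data is streamed into the DataNode replication pipeline (steps 3-5);
            // acknowledgements travel back up the pipeline to the client (steps 6-8).
            out.write("hello HDFS".getBytes(StandardCharsets.UTF_8));
        }
    }
}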

4. Data integrity

In general, the following methods can be used to verify whether the data is corrupted.

(1) when data is first ingested, a checksum is calculated.

(2) when the data goes through a series of transmission or replication, the checksum is calculated again.

(3) compare the checksums from steps (1) and (2); if the two checksums are inconsistent, the data has been corrupted.

Note: this technique of using checksums to validate data can only detect whether the data is corrupted, not repair it.

Checksum technology is also used in HDFS to check whether data is damaged: the checksum is verified both when data is written and when it is read. The chunk size over which each checksum is computed is specified by the io.bytes.per.checksum property in the core-site.xml file, and the default is 512 bytes. The configuration is as follows:

<property>
    <name>io.bytes.per.checksum</name>
    <value>512</value>
</property>

When HDFS writes data, the HDFS client sends the data to be written, together with its checksum, to the replication pipeline formed by the DataNodes, and the last DataNode in the pipeline is responsible for verifying the checksum. If it detects a checksum that does not match the one sent by the HDFS client, the client receives a checksum exception, which the program can catch and handle, for example by rewriting the data or taking other action.

Checksums are also verified when HDFS reads data: they are compared with the checksums stored on the DataNode. If they do not match, the data has been corrupted and needs to be re-read from another DataNode. Each DataNode keeps a checksum log, and after a client successfully validates a block, the DataNode updates this log.
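As a hedged sketch of how an application might react to a checksum failure, the snippet below catches org.apache.hadoop.fs.ChecksumException on read; the NameNode address and path are placeholders, and in practice HDFS will usually retry another replica on its own before the client ever sees this exception.

// Sketch: catching a checksum failure while reading from HDFS.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.ChecksumException;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;

public class ChecksumDemo {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.set("fs.defaultFS", "hdfs://namenode:8020");  // placeholder NameNode address

        FileSystem fs = FileSystem.get(conf);
        // Checksum verification is on by default; fs.setVerifyChecksum(false) would skip it.
        try (FSDataInputStream in = fs.open(new Path("/user/test/data.txt"))) {
            IOUtils.copyBytes(in, System.out, 4096, false);
        } catch (ChecksumException e) {
            // The data failed verification; an application could retry or rewrite here.
            System.err.println("Checksum mismatch at offset " + e.getPos() + ": " + e.getMessage());
        } finally {
            fs.close();
        }
    }
}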

In addition, each DataNode runs a scanner (DataBlockScanner) in the background to periodically validate all data blocks stored on that DataNode.

Thanks to the block replica mechanism of HDFS, when a data block is found to be damaged, HDFS automatically copies an intact replica to replace it, producing a new healthy block and restoring the number of copies configured for the system. Data integrity is therefore preserved even when some blocks are damaged.

5. HDFS fault tolerance

Generally speaking, the fault tolerance mechanism of HDFS can be divided into two aspects: the fault tolerance of file system and the fault tolerance of Hadoop itself.

5.1 Fault tolerance of file system

The fault tolerance of file system can be realized by NameNode high availability, SecondaryNameNode mechanism, block copy mechanism and heartbeat mechanism.

Note: when Hadoop is deployed in local mode or pseudo-distributed mode, there is a SecondaryNameNode; when Hadoop is deployed in cluster mode with the NameNode HA mechanism configured, there is no SecondaryNameNode, but there is a standby NameNode instead.

Here we focus on the fault tolerance of HDFS in cluster mode. For more information about the SecondaryNameNode mechanism, please see the previous article on the HDFS architecture.

The fault tolerance mechanism of HDFS is shown in the figure:

The specific process is as follows:

(1) the slave NameNode backs up the metadata information on the master NameNode in real time. If the master NameNode fails and becomes unavailable, the slave NameNode will quickly take over the work of the master NameNode.

(2) the client requests metadata information from the NameNode.

(3) NameNode returns metadata information to the client.

(4) the client reads / writes data to DataNode, which can be divided into two cases: reading data and writing data.

① Reading data: HDFS verifies the integrity of the file block by checking its checksum; if the check fails, it obtains an intact replica from another DataNode.

② Writing data: HDFS checks the integrity of the file blocks and records the checksums of all blocks of the newly created file.

(5) DataNodes regularly send heartbeat messages to the NameNode to report the status of their own nodes; the NameNode places the commands that a DataNode should execute into the heartbeat response and returns them to the DataNode for execution.

When a DataNode fails to send heartbeats normally, the NameNode checks whether the number of replicas of its file blocks has fallen below the configured value; if so, it automatically creates new replicas and distributes them to other DataNodes.

(6) data replicas are copied between the DataNodes in the cluster that hold the related data.

When a DataNode in the cluster fails, or when a new DataNode joins the cluster, data may become unevenly distributed. When the free space on a DataNode exceeds the threshold set by the system, HDFS migrates data to it from other DataNodes; conversely, if a DataNode is overloaded, HDFS finds DataNodes with free resources according to certain rules and migrates data to them.

HDFS also supports fault tolerance indirectly through its trash mechanism: data deleted from HDFS is not removed immediately, but is first moved to a "Recycle Bin" (trash) directory, from which it can be recovered at any time; it is only permanently deleted after a set period of time has elapsed.
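The sketch below illustrates this trash behaviour with the Hadoop Java client, assuming the trash feature is enabled (fs.trash.interval greater than 0, normally configured in core-site.xml on the cluster); the NameNode address, interval value, and path are placeholders.

// Sketch of the trash ("Recycle Bin") mechanism.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.Trash;

public class TrashDemo {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.set("fs.defaultFS", "hdfs://namenode:8020"); // placeholder NameNode address
        conf.set("fs.trash.interval", "1440");            // keep deleted files for 24 hours (minutes)

        FileSystem fs = FileSystem.get(conf);
        Path victim = new Path("/user/test/old-data.txt"); // placeholder path

        // Moves the file into the user's .Trash directory instead of deleting it outright;
        // it stays recoverable until the configured interval expires.
        boolean moved = Trash.moveToAppropriateTrash(fs, victim, conf);
        System.out.println(moved ? "Moved to trash" : "Trash disabled or move failed");
        fs.close();
    }
}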

5.2 Hadoop's own fault tolerance

Hadoop's own fault tolerance is relatively simple to understand: when upgrading a Hadoop system, if the new Hadoop version turns out to be incompatible, you can roll back to the previous version, which provides a degree of fault tolerance for Hadoop itself.

Thank you for reading. The above covers how HDFS handles file management and fault tolerance. After studying this article, you should have a deeper understanding of the topic; the specifics still need to be verified in practice.
