What is the HDFS architecture?

2025-04-28 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)05/31 Report--

This article introduces the HDFS architecture and is intended as a reference for readers who want to understand how its components fit together.

HDFS uses a master/slave architecture. A typical HDFS cluster consists of a single NameNode and multiple DataNodes. The NameNode is the central server responsible for file system namespace operations, such as opening, closing, and renaming files and directories. It also maintains the mapping from file paths to blocks and from blocks to DataNodes, monitors DataNode heartbeats, and maintains the number of replicas of each block. A cluster usually runs one DataNode per node, and each DataNode manages the storage of the node on which it resides. HDFS exposes a file system namespace in which users can store data in the form of files. Internally, a file is divided into one or more blocks, which are stored on a set of DataNodes. DataNodes handle read and write requests from file system clients, and they create, delete, and replicate data blocks as directed by the NameNode.
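The two mappings described above can be pictured with a small in-memory model. This is only an illustrative sketch, not the HDFS API: the class `ToyNameNode`, the function `split_into_blocks`, the round-robin placement, and the 128 MiB block size and replication factor of 3 are all assumptions chosen for the example (both values are configurable in a real cluster).

```python
BLOCK_SIZE = 128 * 1024 * 1024  # assumed block size in bytes (128 MiB)


def split_into_blocks(file_size: int, block_size: int = BLOCK_SIZE) -> list[int]:
    """Return the size in bytes of each block a file of file_size bytes occupies."""
    if file_size < 0 or block_size <= 0:
        raise ValueError("file_size must be >= 0 and block_size must be > 0")
    full, rest = divmod(file_size, block_size)
    # All blocks are full-sized except possibly the last one.
    return [block_size] * full + ([rest] if rest else [])


class ToyNameNode:
    """Hypothetical stand-in for the metadata a NameNode keeps (not real HDFS code)."""

    def __init__(self) -> None:
        self.file_to_blocks: dict[str, list[str]] = {}  # file path -> ordered block IDs
        self.block_to_nodes: dict[str, set[str]] = {}   # block ID -> DataNodes holding a copy

    def add_file(self, path: str, file_size: int, nodes: list[str], replication: int = 3) -> None:
        """Register a file, assigning each block to `replication` distinct DataNodes."""
        block_ids = []
        copies = min(replication, len(nodes))
        for i, _size in enumerate(split_into_blocks(file_size)):
            block_id = f"{path}#blk{i}"
            # Round-robin placement, purely illustrative; real placement is more involved.
            self.block_to_nodes[block_id] = {nodes[(i + r) % len(nodes)] for r in range(copies)}
            block_ids.append(block_id)
        self.file_to_blocks[path] = block_ids

    def locate(self, path: str) -> list[tuple[str, list[str]]]:
        """Return, for each block of the file, the DataNodes that hold it."""
        return [(b, sorted(self.block_to_nodes[b])) for b in self.file_to_blocks[path]]
```

Calling `add_file` for a 300 MiB file under these assumptions registers three blocks (two full blocks and one partial block), and `locate` then returns the DataNodes for each of them, which is the kind of answer a client needs before it can read data.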

All updates to the directory tree, and all changes to the relationship between file names and blocks, must be persisted. Figure 2 shows how files are stored in HDFS:

Figure 2 Storage diagram of files in HDFS
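To illustrate why namespace changes must be persisted, the sketch below appends every change to a journal file before applying it in memory, so the state can be rebuilt after a restart. It is a hypothetical model only: the class `JournaledNamespace`, the JSON-lines journal, and the operation names are assumptions for this example and do not reproduce the on-disk formats HDFS actually uses.

```python
import json
import os
import tempfile


class JournaledNamespace:
    """Hypothetical namespace that writes each change to disk before applying it."""

    def __init__(self, journal_path: str) -> None:
        self.journal_path = journal_path
        self.files: dict[str, list[str]] = {}  # file path -> block IDs
        if os.path.exists(journal_path):
            self._replay()

    def _apply(self, op: dict) -> None:
        """Apply one logged operation to the in-memory namespace."""
        if op["op"] == "create":
            self.files[op["path"]] = list(op["blocks"])
        elif op["op"] == "rename" and op["src"] in self.files:
            self.files[op["dst"]] = self.files.pop(op["src"])
        elif op["op"] == "delete":
            self.files.pop(op["path"], None)

    def _replay(self) -> None:
        """Rebuild the in-memory state by re-applying every journal entry in order."""
        with open(self.journal_path, encoding="utf-8") as f:
            for line in f:
                if line.strip():
                    self._apply(json.loads(line))

    def record(self, op: dict) -> None:
        """Persist the operation first, then update the in-memory state."""
        with open(self.journal_path, "a", encoding="utf-8") as f:
            f.write(json.dumps(op) + "\n")
            f.flush()
            os.fsync(f.fileno())
        self._apply(op)


# Usage: a second instance opened on the same journal sees the same namespace.
journal = os.path.join(tempfile.mkdtemp(), "edits.log")
ns = JournaledNamespace(journal)
ns.record({"op": "create", "path": "/a", "blocks": ["b0", "b1"]})
ns.record({"op": "rename", "src": "/a", "dst": "/b"})
restarted = JournaledNamespace(journal)
```

Because the journal is written before the in-memory update, a crash between the two steps loses nothing that a client was told had succeeded, which is the property the persistence requirement is meant to guarantee.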

HDFS involves interaction among the NameNode, the DataNodes, and clients. In essence, a client communicates with the NameNode to obtain or modify file metadata, and performs the actual file I/O against the DataNodes. As shown in figure 3, there are three important roles in HDFS: NameNode, DataNode, and Client, where Client is an application that needs to access files in the distributed file system.

Here are three operations to illustrate the interaction between them.

(1) File writing. The Client first sends a file write request to the NameNode. Based on the file size and the configured block size, the NameNode returns to the Client information about a subset of the DataNodes it manages. The Client splits the file into blocks and, using the DataNode addresses it received, writes the blocks to the DataNodes in sequence.

(2) File reading. The Client sends a file read request to the NameNode, and the NameNode returns information about the DataNodes on which the file's blocks are stored. The Client then reads the file data from those DataNodes based on the returned information.

(3) Block replication. When the NameNode finds that the blocks of some files have fewer than the minimum number of replicas, or that some DataNodes have failed, it notifies DataNodes to copy the affected blocks to one another. The DataNodes start copying the blocks directly between themselves once they receive the notification.

Figure 3 HDFS structure diagram
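The three operations described above (write, read, and block replication) can be sketched with a toy in-memory cluster. Everything here is an assumption made for illustration: `ToyCluster` and its methods are hypothetical names, the block size of 4 bytes and replication factor of 2 are chosen to keep the example small, and the client stores each replica directly rather than following the real HDFS write protocol.

```python
class ToyCluster:
    """Hypothetical in-memory model of a NameNode plus DataNodes; not the HDFS API."""

    def __init__(self, datanodes: list[str], block_size: int = 4, replication: int = 2) -> None:
        self.block_size = block_size
        self.replication = replication
        self.storage = {dn: {} for dn in datanodes}    # DataNode -> {block ID: bytes}
        self.file_to_blocks: dict[str, list[str]] = {}  # NameNode: path -> block IDs
        self.block_to_nodes: dict[str, set[str]] = {}   # NameNode: block ID -> DataNodes

    def _live(self) -> list[str]:
        return sorted(self.storage)

    def write(self, path: str, data: bytes) -> None:
        """(1) Split the data into blocks and store each on the chosen DataNodes."""
        block_ids = []
        for offset in range(0, len(data), self.block_size):
            index = offset // self.block_size
            block_id = f"{path}#{index}"
            live = self._live()
            # The "NameNode" picks the target DataNodes (simple round-robin here).
            count = min(self.replication, len(live))
            targets = [live[(index + r) % len(live)] for r in range(count)]
            for dn in targets:
                self.storage[dn][block_id] = data[offset:offset + self.block_size]
            self.block_to_nodes[block_id] = set(targets)
            block_ids.append(block_id)
        self.file_to_blocks[path] = block_ids

    def read(self, path: str) -> bytes:
        """(2) Look up block locations, then fetch each block from one holder."""
        out = b""
        for block_id in self.file_to_blocks[path]:
            dn = sorted(self.block_to_nodes[block_id])[0]
            out += self.storage[dn][block_id]
        return out

    def fail(self, dn: str) -> None:
        """Simulate the NameNode noticing that a DataNode has failed."""
        del self.storage[dn]
        for holders in self.block_to_nodes.values():
            holders.discard(dn)

    def re_replicate(self) -> int:
        """(3) Copy under-replicated blocks between DataNodes; return copies made."""
        copied = 0
        for block_id, holders in self.block_to_nodes.items():
            candidates = [dn for dn in self._live() if dn not in holders]
            while holders and len(holders) < self.replication and candidates:
                source = sorted(holders)[0]
                target = candidates.pop(0)
                self.storage[target][block_id] = self.storage[source][block_id]
                holders.add(target)
                copied += 1
        return copied


cluster = ToyCluster(["dn1", "dn2", "dn3"])
cluster.write("/f", b"hello world")
cluster.fail("dn1")
copies_made = cluster.re_replicate()
```

After `dn1` fails, the blocks it held are left with a single replica each, and `re_replicate` restores the target replica count by copying from a surviving holder to a DataNode that lacks the block. The file remains readable throughout because every block still has at least one live copy.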
