This article mainly introduces an example analysis of HDFS reading and writing. It has a certain reference value, and interested friends can refer to it. I hope you will learn a lot after reading this article; now let the editor take you through it.
First, the prerequisites for HDFS reading and writing
1. NameNode (metadata node): stores the metadata (namespace, number of replicas, permissions, block list, cluster configuration information) but not the file data itself. The NameNode keeps the file system metadata in memory.
2. DataNode (data node): where the data is actually stored, organized in blocks. The default block size is 128 MB. Each DataNode periodically reports all of the blocks it stores to the NameNode. The client communicates with the NameNode and then reads data from or writes data to the DataNodes.
3. SecondaryNameNode (secondary metadata node): it is not a backup of the NameNode but works alongside it, with a different role. The SecondaryNameNode periodically merges the NameNode's namespace image file with its edit log, helping the NameNode persist its in-memory metadata to disk.
4. Client: the application or interface that needs to access files in the HDFS system, and that initiates HDFS read, write, and other operations.
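To make the client role concrete, here is a minimal Java sketch using the standard org.apache.hadoop.fs.FileSystem API, in which a client asks the NameNode for metadata by listing a directory. The cluster address hdfs://namenode:9000 and the path /user/demo are hypothetical placeholders, and the Hadoop client libraries are assumed to be on the classpath.

import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ListHdfsDir {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // The client talks to the NameNode for metadata only; file data
        // is streamed directly to/from the DataNodes later on.
        // hdfs://namenode:9000 is a hypothetical cluster address.
        FileSystem fs = FileSystem.get(URI.create("hdfs://namenode:9000"), conf);
        for (FileStatus status : fs.listStatus(new Path("/user/demo"))) {
            System.out.println(status.getPath() + "  " + status.getLen() + " bytes");
        }
        fs.close();
    }
}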
It is worth noting that:
1. In practice, the client only uploads the data to one datanode; the other two replicas are handled by the datanodes themselves, which copy the data on their own. Once copying is complete, the result is returned to the namenode step by step. If copying replica 2 or 3 fails, the namenode assigns a new datanode address. From the client's point of view, it uploads to a single datanode by default, and the remaining replicas are copied by the datanodes.
2. Splitting the file into blocks is done by the client. Writing the second and third replicas happens asynchronously with respect to the first upload.
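The 128 MB block size and the three-replica behavior mentioned above come from the cluster defaults (the dfs.blocksize and dfs.replication settings). Below is a minimal sketch, assuming the same hypothetical hdfs://namenode:9000 cluster as above, that simply asks the FileSystem API for those defaults.

import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ShowDefaults {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(URI.create("hdfs://namenode:9000"), conf);
        Path p = new Path("/");
        // 128 MB by default (dfs.blocksize); the client splits files into
        // blocks of this size before handing them to the pipeline.
        System.out.println("default block size = " + fs.getDefaultBlockSize(p));
        // 3 by default (dfs.replication); replicas 2 and 3 are copied by
        // the datanodes themselves, not uploaded again by the client.
        System.out.println("default replication = " + fs.getDefaultReplication(p));
        fs.close();
    }
}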
Second, the write process in HDFS:
1. The client communicates with the namenode, requesting to upload a file; the namenode checks whether the target file already exists and whether its parent directory exists.
2. The namenode returns whether the file can be uploaded.
3. The client asks which datanode servers the first block should be transferred to.
4. The namenode returns three datanode servers: A, B, and C.
5. The client requests one of the three datanodes, A, to upload data (essentially an RPC call that establishes a pipeline). A receives the request and calls B, and B in turn calls C, completing the establishment of the pipeline; the result is then returned to the client step by step.
6. The client starts uploading the first block to A (first reading the data from disk into a local memory cache), packet by packet. A receives a packet and passes it to B, and B passes it to C; A places each packet it sends into a reply queue to wait for acknowledgement.
7. When one block transfer is complete, the client again asks the namenode for the datanode servers for the second block.
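As a concrete illustration of the write path, here is a minimal Java sketch that creates a file over the pipeline described above. The address hdfs://namenode:9000 and the path /user/demo/hello.txt are hypothetical, and the Hadoop client libraries are assumed to be available.

import java.net.URI;
import java.nio.charset.StandardCharsets;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsWriteExample {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(URI.create("hdfs://namenode:9000"), conf);
        Path target = new Path("/user/demo/hello.txt");
        // create() asks the namenode whether the file can be created (steps 1-2),
        // then returns a stream backed by the datanode pipeline (steps 3-7).
        try (FSDataOutputStream out = fs.create(target, /* overwrite = */ true)) {
            out.write("hello hdfs\n".getBytes(StandardCharsets.UTF_8));
        }
        fs.close();
    }
}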
Third, the read process in HDFS:
1. The client communicates with the namenode to query the metadata and find the datanode servers where the file's blocks are located.
2. The client picks a datanode server (nearest first, otherwise at random) and requests that a socket stream be established.
3. The datanode starts sending data (reading it from disk, putting it into the stream, and verifying it packet by packet).
4. The client receives the data packet by packet, caches it locally, and then writes it to the target file.
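And a matching read-side sketch: it first asks the namenode for the block locations (step 1) and then streams the data from the datanodes (steps 2 through 4). The cluster address and file path are the same hypothetical placeholders used above.

import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.BlockLocation;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;

public class HdfsReadExample {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(URI.create("hdfs://namenode:9000"), conf);
        Path source = new Path("/user/demo/hello.txt");

        // Step 1: ask the namenode which datanodes hold each block.
        FileStatus status = fs.getFileStatus(source);
        for (BlockLocation loc : fs.getFileBlockLocations(status, 0, status.getLen())) {
            System.out.println("block at offset " + loc.getOffset()
                    + " on hosts " + String.join(",", loc.getHosts()));
        }

        // Steps 2-4: open a stream to the datanodes and copy the data out.
        try (FSDataInputStream in = fs.open(source)) {
            IOUtils.copyBytes(in, System.out, 4096, false);
        }
        fs.close();
    }
}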
Thank you for reading this article carefully. I hope this analysis of HDFS reading and writing has been helpful to you.