This article introduces the Hadoop HDFS file reading and writing process. It has some reference value, and interested readers are welcome to follow along; I hope you learn a lot from it.
I. Description of the file reading process
The read operation is transparent to the Client side; to the caller it feels like one continuous data stream.
1. Client communicates with the NameNode via RPC through the FileSystem.open(filePath) method, which returns part or all of the file's block list, i.e. returns an FSDataInputStream object.
2. Client calls the read() method of the FSDataInputStream object:
a. It reads from the nearest DataNode first and checks the data after reading; if the check is ok, it closes communication with that DataNode; if the check fails, it records the failed block + DataNode information (so that DataNode will not be read from next time) and then reads from the next DataNode holding the block.
b. The second block is likewise read from its nearest DataNode, and communication with that DataNode is closed after the check.
c. When the current block list has been read but the file is not finished, FileSystem gets the next batch of the block list from the NameNode.
3. Client closes the input stream by calling the close() method of the FSDataInputStream object.
Summary
Client > FileSystem.open() communicates with the NameNode via RPC and gets the block list
Client > calls the read() method of the FSDataInputStream object
If ok > close communication with the DataNode; when done, call the close() method of the input stream
If fail > record the DataNode and block information, read from the next DataNode, and call close() at the end
Block list read out but file not finished > FileSystem gets the next batch of the block list
(A minimal Java sketch of this read path follows.)
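To make the steps concrete, here is a minimal sketch of the read path using the standard Hadoop FileSystem API. It is only an illustration: the path /user/demo/input.txt is a made-up example, and the Configuration is assumed to find core-site.xml / hdfs-site.xml on the classpath.

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsReadSketch {
    public static void main(String[] args) throws IOException {
        Configuration conf = new Configuration();        // picks up core-site.xml / hdfs-site.xml
        FileSystem fs = FileSystem.get(conf);
        Path file = new Path("/user/demo/input.txt");    // hypothetical example path

        // Step 1: open() talks to the NameNode via RPC and returns an FSDataInputStream
        try (FSDataInputStream in = fs.open(file)) {
            byte[] buffer = new byte[4096];
            int n;
            // Step 2: read() streams the data from the nearest DataNodes, block by block
            while ((n = in.read(buffer)) > 0) {
                System.out.write(buffer, 0, n);
            }
        } // Step 3: close() is called automatically by try-with-resources
        fs.close();
    }
}

The failover between DataNodes described in step 2a happens inside the stream, which is why the read looks like one continuous flow to the caller.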
II. Description of the file writing process
1. Client calls the FileSystem.create(filePath) method and communicates with the NameNode via RPC. The NameNode checks whether a file already exists at this path and whether the client has permission to create it. If ok, a new file entry is created, but no block is associated with it yet, and an FSDataOutputStream object is returned.
2. Client calls the write() method of the FSDataOutputStream object. The first block is written to the first DataNode, which passes it on to the second node, and the second to the third; the third node returns an ack packet to the second node, the second node returns it to the first node, and the first node returns the ack packet to the FSDataOutputStream object. This means the first block has been written with a replication factor of 3; the remaining blocks are written in turn in the same way.
3. After the data is written, Client calls the FSDataOutputStream.close() method to close the output stream and flush the data packets remaining in the cache.
4. Finally, the complete() method is called to tell the NameNode that the write was successful.
Summary: FileSystem.create() method > NameNode checks (permissions and existence)
If ok > returns an FSDataOutputStream object | if fail > returns an error
Client calls the FSDataOutputStream.write() method > writes to the DataNode pipeline, which returns ack packets > FSDataOutputStream object
Client calls the FSDataOutputStream.close() method to close the output stream > flushes the cache
Finally, the complete() method > tells the NameNode the write is ok
(A minimal Java sketch of this write path follows.)
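And a matching sketch of the write path, with the same caveats: the output path and payload are hypothetical, and the replication factor comes from the cluster configuration rather than from this code.

import java.io.IOException;
import java.nio.charset.StandardCharsets;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsWriteSketch {
    public static void main(String[] args) throws IOException {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);
        Path file = new Path("/user/demo/output.txt");   // hypothetical example path

        // Step 1: create() asks the NameNode (via RPC) to check permissions/existence
        // and create the file entry; it returns an FSDataOutputStream
        try (FSDataOutputStream out = fs.create(file, true)) {   // true = overwrite if it exists
            byte[] data = "hello hdfs".getBytes(StandardCharsets.UTF_8);
            // Step 2: write() pushes packets through the DataNode replication pipeline
            out.write(data);
        } // Steps 3-4: close() flushes the buffered packets and completes the file on the NameNode
        fs.close();
    }
}

If data must be visible to readers before the stream is closed, FSDataOutputStream.hflush() can be called explicitly; close() already performs the final flush described in step 3.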