2025-02-05 Update. From: SLTechnology News & Howtos (Shulou), Servers section.
Shulou (Shulou.com), 05/31 Report
This article explains what the NameNode is for in HDFS, walks through the file read and write processes, and demonstrates basic Java API operations on files. It is shared here for reference; I hope you find it rewarding.
Role of the NameNode
The NameNode is the brain of the file system: it manages the file namespace and access to files in the cluster, and stores the metadata. The two most important mappings it maintains are filename to data blocks (persisted on disk) and data block to DataNode list (not persisted; rebuilt from block reports sent by the DataNodes).
By implementing the ClientProtocol, DataNodeProtocol, and NameNodeProtocol interfaces, the NameNode can communicate with clients, DataNodes, and other NameNodes respectively.
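The two mappings above can be sketched in plain Java. This is a minimal toy model with hypothetical names (ToyNameNode is not a real Hadoop class): the filename-to-blocks map is the part the real NameNode persists, while the block-to-DataNodes map lives only in memory and is filled in by DataNode block reports.

```java
import java.util.*;

// Toy model of the NameNode's two key mappings (hypothetical names,
// not Hadoop's real classes).
class ToyNameNode {
    // Persisted on disk in the real NameNode (fsimage + edit log).
    private final Map<String, List<String>> fileToBlocks = new HashMap<>();
    // In-memory only; rebuilt from block reports sent by DataNodes.
    private final Map<String, Set<String>> blockToDataNodes = new HashMap<>();

    void addFile(String path, List<String> blockIds) {
        fileToBlocks.put(path, new ArrayList<>(blockIds));
    }

    // A DataNode reports which blocks it currently stores.
    void blockReport(String dataNode, List<String> blockIds) {
        for (String b : blockIds) {
            blockToDataNodes.computeIfAbsent(b, k -> new HashSet<>()).add(dataNode);
        }
    }

    // Resolve a file to the set of DataNodes holding each of its blocks.
    List<Set<String>> locate(String path) {
        List<Set<String>> result = new ArrayList<>();
        for (String b : fileToBlocks.getOrDefault(path, List.of())) {
            result.add(blockToDataNodes.getOrDefault(b, Set.of()));
        }
        return result;
    }
}
```

Note that `locate()` combines both mappings, which is exactly what the NameNode does when a client opens a file.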
Analysis of the File Read and Write Processes
1 File reading process
First, the client opens the file with the FileSystem.open() function. DistributedFileSystem makes an RPC call to the NameNode to obtain the block information of the file; for each block, the NameNode returns the addresses of the DataNodes that store it.
DistributedFileSystem then returns an FSDataInputStream to the client, and the client calls the stream's read() method to start reading data.
DFSInputStream connects to the nearest DataNode holding the first block of the file and streams its data back to the client. When a block is finished, DFSInputStream closes the connection to that DataNode and connects to the nearest DataNode for the next block. When all blocks have been read, the client calls the close() function of the FSDataInputStream.
If an error occurs while the client is communicating with a DataNode during the read, the client switches to the next nearest DataNode for that block and records the failed node so it is not tried again for later blocks.
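The failover behavior in the read path can be sketched in plain Java. This is a toy simulation with hypothetical names (ToyReader is not a Hadoop class; node failure is simulated by a `failing` set rather than real I/O errors): for each block the client tries DataNodes nearest first, skips nodes already known to be dead, and records new failures.

```java
import java.util.*;

// Toy sketch of the client read path (hypothetical names).
class ToyReader {
    private final Set<String> deadNodes = new HashSet<>();

    // nodesByDistance: DataNodes holding one block, nearest first.
    // failing: nodes that error out when contacted (simulated).
    // Returns the node the block was successfully read from.
    String readBlock(List<String> nodesByDistance, Set<String> failing) {
        for (String node : nodesByDistance) {
            if (deadNodes.contains(node)) continue; // skip known-bad nodes
            if (failing.contains(node)) {           // simulated I/O error
                deadNodes.add(node);                // remember the failure
                continue;                           // try the next nearest node
            }
            return node;                            // read succeeded here
        }
        throw new IllegalStateException("no live replica for block");
    }

    Set<String> deadNodes() { return deadNodes; }
}
```

Because `deadNodes` persists across calls, a node that failed while reading one block is not contacted again for later blocks, matching the behavior described above.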
2 File writing process
First, the client calls the create() method to create a file. DistributedFileSystem makes an RPC call to the NameNode to request the new file; the NameNode checks the namespace to make sure the file does not already exist and that the client has permission to create it, then records the new file. DistributedFileSystem returns a DFSOutputStream for the client to write data to.
DFSOutputStream splits the data into packets and writes them to an internal data queue. The DataStreamer consumes this queue and asks the NameNode to allocate new blocks, choosing a list of suitable DataNodes to store the replicas (three by default). The allocated DataNodes form a pipeline.
The DataStreamer writes packets to the first DataNode in the pipeline, which forwards them to the second DataNode, which in turn forwards them to the third. DFSOutputStream also keeps an ack queue of packets that have been sent out, waiting for the DataNodes in the pipeline to acknowledge that the data was written successfully. If a DataNode fails during the write, the pipeline is closed and the packets in the ack queue are moved back to the front of the data queue. The failed DataNode is removed from the pipeline, and the remaining data is written to the other two DataNodes in the pipeline. The NameNode notices that the block is under-replicated and arranges for a third replica to be created elsewhere.
When the client finishes writing data, it calls the close() function of the stream, which flushes the remaining packets and waits for acknowledgements before reporting completion to the NameNode.
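The data queue, ack queue, and failure handling above can be sketched in plain Java. This is a toy simulation with hypothetical names (ToyPipeline is not a Hadoop class; node failure is simulated by a `failingNodes` set): a packet moves from the data queue into the ack queue while it awaits acknowledgement, and on failure the unacked packets are pushed back to the front of the data queue and the bad node is dropped from the pipeline.

```java
import java.util.*;

// Toy sketch of the write pipeline and ack queue (hypothetical names).
class ToyPipeline {
    private final Deque<String> dataQueue = new ArrayDeque<>();
    private final Deque<String> ackQueue = new ArrayDeque<>();
    private final List<String> pipeline;

    ToyPipeline(List<String> dataNodes) {
        this.pipeline = new ArrayList<>(dataNodes);
    }

    void enqueue(String packet) { dataQueue.addLast(packet); }

    // Send the next packet through every node in the pipeline.
    // Returns true on success; on failure, requeues unacked packets
    // and removes the failed node from the pipeline.
    boolean sendNext(Set<String> failingNodes) {
        String packet = dataQueue.pollFirst();
        if (packet == null) return true;            // nothing left to send
        ackQueue.addLast(packet);                   // awaiting acks from all nodes
        for (String node : pipeline) {
            if (failingNodes.contains(node)) {      // simulated node failure
                // Move every unacked packet back to the front of the data queue.
                while (!ackQueue.isEmpty()) dataQueue.addFirst(ackQueue.pollLast());
                pipeline.remove(node);              // continue with healthy nodes
                return false;
            }
        }
        ackQueue.pollFirst();                       // all replicas acked this packet
        return true;
    }

    List<String> pipeline() { return pipeline; }
    int pending() { return dataQueue.size(); }
}
```

In real HDFS the NameNode would then notice the under-replicated block and schedule a new replica; that step is outside this sketch.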
Basic Java API Operations on Files
1 FileSystem class (open file system)
import java.io.InputStream;
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;

Configuration conf = new Configuration();              // load the Hadoop configuration
FileSystem fs = FileSystem.get(URI.create(uri), conf); // connect to the file system behind uri
InputStream input = null;
try {
    input = fs.open(new Path(uri));                    // open the file for reading
    IOUtils.copyBytes(input, System.out, 4096, false); // copy it to stdout with a 4 KB buffer
} finally {
    IOUtils.closeStream(input);                        // always release the stream
}

(The later snippets reuse the same imports and the fs handle opened here.)
2 FileStatus class (view file status)
// View the metadata of a file or directory in HDFS.
FileStatus fstus = fs.getFileStatus(new Path(uri));
fstus.getPath();             // path of the file
fstus.getLen();              // length in bytes
fstus.getModificationTime(); // last modification time
fstus.getReplication();      // replication factor
fstus.getOwner();            // owner of the file
3 Block Location (view the location of the block)
// Find where the blocks of a file are stored in the HDFS cluster.
FileStatus fstus = fs.getFileStatus(new Path(uri));
BlockLocation[] blocks = fs.getFileBlockLocations(fstus, 0, fstus.getLen());
4 Check if the file exists
// List all files under the given HDFS paths; to check whether a
// single path exists, use fs.exists(path).
Path[] paths = new Path[args.length];
for (int i = 0; i < args.length; i++) {
    paths[i] = new Path(args[i]);          // build a Path from each argument
}
FileStatus[] status = fs.listStatus(paths);
Path[] listedPaths = FileUtil.stat2Paths(status);
for (Path p : listedPaths) {
    System.out.println(p);
}

That covers what the NameNode is used for and the basic Java API operations on HDFS files. Thank you for reading!
© 2024 shulou.com SLNews company. All rights reserved.