Sample Analysis of Metadata in HDFS
This article presents a sample analysis of metadata in HDFS. It aims to be easy to understand and well organized, and I hope it helps clear up your doubts. Let's work through it together.
We all know that the foundation of Hadoop is HDFS, the Hadoop Distributed File System.
All operations work on HDFS files, and its core keyword is the master-slave pair: NameNode vs. DataNode.
The metadata lives on the NameNode; it is the meta information that describes the data files.
It exists in two forms: in-memory state plus files on the hard disk.
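To make this concrete, here is a minimal sketch (not from the original article) that asks the NameNode, through Hadoop's public Java client API, for the metadata it keeps about one file. The path /user/demo/data.txt is only a placeholder, and the cluster configuration is assumed to be available in core-site.xml / hdfs-site.xml on the classpath.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ShowFileMetadata {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();   // picks up core-site.xml / hdfs-site.xml
        try (FileSystem fs = FileSystem.get(conf)) {
            // All of these values are answered by the NameNode from its metadata,
            // without contacting any DataNode.
            FileStatus status = fs.getFileStatus(new Path("/user/demo/data.txt"));
            System.out.println("path        = " + status.getPath());
            System.out.println("length      = " + status.getLen());
            System.out.println("block size  = " + status.getBlockSize());
            System.out.println("replication = " + status.getReplication());
            System.out.println("owner:group = " + status.getOwner() + ":" + status.getGroup());
            System.out.println("permissions = " + status.getPermission());
        }
    }
}
```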
With that in mind, let's take a look at the metadata of HDFS and, through it, at the essence of the HDFS file system.
-
Just imagine: if we were to design a file system ourselves, what information would the metadata need to store?
In fact, that depends on what functions the information has to support.
Personally, I think the functions include:
1) The IP, port, folders, capacity and other information of the NameNode and of every DataNode. This amounts to an overall description of the file system framework.
2) A hierarchical description of the files on each DataNode and the file-directory relationships. This is more detailed than 1.
3) For a given file: how many blocks it has been split into, the size of each block, its replication status, and which DataNodes and which paths the blocks are distributed across.
From 1 we get the skeleton of the distributed file system, from 2 its flesh and blood, and from 3 the concrete way to access a particular file.
Together, these three already form a good part of the NameNode's metadata; the whole design can be derived from the requirements of the application.
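To illustrate point 3 above, here is a hedged sketch that uses the same public client API to ask the NameNode how a file has been split into blocks and which DataNodes hold each replica. Again, the path is only a placeholder.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.BlockLocation;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ShowBlockLayout {
    public static void main(String[] args) throws Exception {
        try (FileSystem fs = FileSystem.get(new Configuration())) {
            FileStatus status = fs.getFileStatus(new Path("/user/demo/data.txt"));
            BlockLocation[] blocks = fs.getFileBlockLocations(status, 0, status.getLen());
            for (BlockLocation block : blocks) {
                // offset/length describe the slice of the file this block covers;
                // getHosts() lists the DataNodes holding a replica of it.
                System.out.printf("offset=%d length=%d hosts=%s%n",
                        block.getOffset(), block.getLength(),
                        String.join(",", block.getHosts()));
            }
        }
    }
}
```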
-
At startup, the metadata is read from the hard disk into memory from the FSImage file.
When it is persisted, the metadata is written back to the hard disk as an FSImage.
At the same time, the hard disk also stores the operation log, the edits file. My current understanding is that replaying the logged operations on top of the snapshot yields the final metadata.
This is just like Redis with its snapshot plus append-only log; many databases work the same way, so there is nothing surprising here.
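As a toy model of this "snapshot plus replayable log" idea (this is not Hadoop's actual implementation; every class and method name below is made up for illustration), consider:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ToyNameNodeMetadata {
    // In-memory namespace: path -> file size (a stand-in for real INode metadata).
    private final Map<String, Long> namespace = new HashMap<>();
    // The "edits" log: every mutation recorded since the last checkpoint.
    private final List<String[]> editLog = new ArrayList<>();

    void create(String path, long size) {
        namespace.put(path, size);
        editLog.add(new String[] {"CREATE", path, Long.toString(size)});
    }

    void delete(String path) {
        namespace.remove(path);
        editLog.add(new String[] {"DELETE", path, ""});
    }

    List<String[]> edits() {
        return editLog;
    }

    // Checkpoint: persist the current state as a new "fsimage" and start a fresh edit log.
    Map<String, Long> checkpoint() {
        editLog.clear();
        return new HashMap<>(namespace);
    }

    // Startup: load the last fsimage, then replay the edits to rebuild the latest metadata.
    static Map<String, Long> recover(Map<String, Long> fsimage, List<String[]> edits) {
        Map<String, Long> state = new HashMap<>(fsimage);
        for (String[] op : edits) {
            if (op[0].equals("CREATE")) state.put(op[1], Long.parseLong(op[2]));
            else if (op[0].equals("DELETE")) state.remove(op[1]);
        }
        return state;
    }

    public static void main(String[] args) {
        ToyNameNodeMetadata nn = new ToyNameNodeMetadata();
        nn.create("/a.txt", 100L);
        Map<String, Long> fsimage = nn.checkpoint();   // snapshot taken here
        nn.create("/b.txt", 200L);                     // recorded only in the edit log
        nn.delete("/a.txt");
        // "Restart": fsimage + edits reproduce the final metadata {/b.txt=200}.
        System.out.println(recover(fsimage, nn.edits()));
    }
}
```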
-
HDFS also introduces the INode, which plays essentially the same role as the inode in the Linux file system. And because HDFS is a distributed file system, each file is split into shards, which in HDFS are called Blocks.
One point must be emphasized here: what problem is introduced by splitting blocks according to physical size rather than logical record boundaries?
A logical record may be split across two blocks, and those two blocks may even sit on different machines.
Hadoop solves all of this; we will talk about it later.
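The article leaves the solution for later, but the usual idea can be sketched here: a reader assigned a byte range skips the partial record it starts in and reads past the end of its range to finish the last record it began, so a record straddling a block boundary is produced exactly once. This is the idea behind Hadoop's LineRecordReader; the code below is only a simplified, self-contained sketch over a local file, not the real implementation.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

public class SplitLineReader {

    // Returns the whole lines "owned" by the byte range [start, end): the reader skips
    // the partial line it starts in (the previous split finishes it) and reads past
    // `end` to complete the last record it began.
    static List<String> readSplit(String file, long start, long end) throws IOException {
        List<String> lines = new ArrayList<>();
        try (RandomAccessFile raf = new RandomAccessFile(file, "r")) {
            raf.seek(start);
            if (start != 0) {
                raf.readLine();                  // discard the tail of a record begun in the previous split
            }
            while (raf.getFilePointer() <= end) {
                String line = raf.readLine();    // may read beyond `end` to finish the record
                if (line == null) break;         // end of file
                lines.add(line);
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("records", ".txt");
        Files.write(tmp, "first record\nsecond record\nthird record\n".getBytes(StandardCharsets.UTF_8));
        long middle = 20;   // an arbitrary boundary that falls inside "second record"
        System.out.println(readSplit(tmp.toString(), 0, middle));               // [first record, second record]
        System.out.println(readSplit(tmp.toString(), middle, Files.size(tmp))); // [third record]
    }
}
```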
That is all of "Sample Analysis of Metadata in HDFS". Thank you for reading! I hope this has given you a clearer understanding and proves helpful.