Shulou (Shulou.com), updated 2025-01-18
1. Data blocks (block)
The most basic unit of storage in HDFS (Hadoop Distributed File System) is the data block, which is 64 MB by default.
As in an ordinary file system, files in HDFS are split into blocks, here 64 MB each.
Unlike an ordinary file system, a file in HDFS that is smaller than a block does not occupy the whole block's storage space.
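The block arithmetic can be illustrated with a minimal sketch. It assumes the 64 MB default mentioned above; the function name split_into_blocks is hypothetical and is not part of any Hadoop API.

```python
BLOCK_SIZE = 64 * 1024 * 1024  # 64 MB, the default block size described in the text


def split_into_blocks(file_size: int, block_size: int = BLOCK_SIZE) -> list[int]:
    """Return the size in bytes of each block that a file of file_size bytes occupies."""
    if file_size < 0:
        raise ValueError("file_size must be non-negative")
    full_blocks, remainder = divmod(file_size, block_size)
    sizes = [block_size] * full_blocks
    if remainder:
        # The last block only takes up as much space as the data it holds.
        sizes.append(remainder)
    return sizes


if __name__ == "__main__":
    mb = 1024 * 1024
    for size in (10 * mb, 64 * mb, 150 * mb):
        blocks = split_into_blocks(size)
        print(f"{size // mb} MB file -> {len(blocks)} block(s)")
```

A 150 MB file, for instance, ends up as two full 64 MB blocks plus one 22 MB block, and a 10 MB file as a single 10 MB block.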
Metadata node (NameNode) and data node (DataNode)
The metadata node manages the namespace of the file system.
It keeps the metadata of all files and folders in a file system tree.
This information is also persisted on the hard disk in two files: the namespace image and the modification log (edit log).
It also records which data blocks make up each file and which data nodes hold those blocks. This location information is not stored on the hard disk; it is collected from the data nodes when the system starts.
The data node is where the file system actually stores data.
A client or the metadata node (NameNode) can ask a data node to write or read data blocks.
Each data node periodically reports the blocks it stores to the metadata node.
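The split between persisted metadata and block locations learned from reports can be sketched roughly as follows. The class and method names (MetadataNodeSketch, receive_block_report, locate) are hypothetical and only model the idea; they are not Hadoop's actual classes.

```python
from collections import defaultdict


class MetadataNodeSketch:
    """Toy model of the two kinds of state a metadata node keeps."""

    def __init__(self):
        # Persisted with the namespace: file path -> ordered list of block ids.
        self.file_blocks = {}
        # Not persisted: block id -> set of data nodes, rebuilt from block reports.
        self.block_locations = defaultdict(set)

    def add_file(self, path, block_ids):
        self.file_blocks[path] = list(block_ids)

    def receive_block_report(self, datanode, block_ids):
        # A report replaces whatever this data node reported previously.
        for nodes in self.block_locations.values():
            nodes.discard(datanode)
        for block_id in block_ids:
            self.block_locations[block_id].add(datanode)

    def locate(self, path):
        """Return (block id, sorted data node names) for each block of the file."""
        return [
            (block_id, sorted(self.block_locations.get(block_id, ())))
            for block_id in self.file_blocks[path]
        ]


if __name__ == "__main__":
    nn = MetadataNodeSketch()
    nn.add_file("/logs/app.log", ["blk_1", "blk_2"])
    nn.receive_block_report("datanode-a", ["blk_1"])
    nn.receive_block_report("datanode-b", ["blk_1", "blk_2"])
    print(nn.locate("/logs/app.log"))
```

Right after a restart the location map in this sketch is empty, and it fills in only as the data nodes send their reports, which mirrors the behavior described above.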
Secondary metadata node (SecondaryNameNode)
The secondary metadata node is not a standby that takes over when the metadata node fails; it has a different job from the metadata node.
Its main function is to periodically merge the metadata node's namespace image file with the modification log so that the log does not grow too large. This is described in more detail below.
The secondary metadata node also keeps a copy of the merged namespace image file, which can be used for recovery if the metadata node fails.
1. Metadata node folder structure
The VERSION file is a Java properties file that holds the HDFS version information.
layoutVersion is a negative integer that holds the format version number of HDFS's persistent on-disk data structures.
namespaceID is the unique identifier of the file system and is generated when the file system is first formatted.
cTime is 0 here.
storageType indicates that this folder holds the data structures of the metadata node.
namespaceID=1232737062
cTime=0
storageType=NAME_NODE
layoutVersion=-18
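Because VERSION is a plain key=value properties file, its fields can be read with a few lines of code. The sketch below is a simplified reader, not a full Java properties parser: it assumes one key=value pair per line with "=" as the separator, and the name parse_version is hypothetical.

```python
def parse_version(text: str) -> dict[str, str]:
    """Parse simple key=value lines, skipping blank lines and comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", "!")):
            continue  # comment or empty line
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


# Sample contents taken from the values shown in the text.
SAMPLE = """\
namespaceID=1232737062
cTime=0
storageType=NAME_NODE
layoutVersion=-18
"""

if __name__ == "__main__":
    props = parse_version(SAMPLE)
    print(props["storageType"], int(props["layoutVersion"]))
```

All values come back as strings, so numeric fields such as layoutVersion need an explicit int() conversion.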
2. File system namespace image file and modification log
When a file system client performs a write operation, the operation is first recorded in the modification log (edit log).
The metadata node keeps the file system metadata in memory. After the operation has been recorded in the modification log, the metadata node updates its in-memory data structures.
Before each write operation is reported as successful, the modification log is synced to the file system.
The fsimage file, that is, the namespace image file, is a checkpoint on the hard disk of the metadata held in memory. It is stored in a serialized format and cannot be modified in place on the hard disk.
Recovery works much as it does in a database: when the metadata node fails, the metadata of the latest checkpoint is loaded from fsimage into memory, and the operations in the modification log are then replayed one by one.
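The log-first write path and the recovery by replay can be sketched as a toy model. Everything here is an assumption made for illustration: the operation names (mkdir, create, delete), the JSON encoding, and the names NamespaceSketch and recover. The real fsimage and edit log use their own binary formats.

```python
import json


class NamespaceSketch:
    """Toy namespace: log each operation first, then apply it in memory."""

    def __init__(self):
        self.tree = {}      # in-memory metadata: path -> "dir" or "file"
        self.edit_log = []  # stands in for the on-disk modification log

    def apply(self, op):
        if op["op"] == "mkdir":
            self.tree[op["path"]] = "dir"
        elif op["op"] == "create":
            self.tree[op["path"]] = "file"
        elif op["op"] == "delete":
            self.tree.pop(op["path"], None)
        else:
            raise ValueError(f"unknown operation: {op['op']}")

    def write(self, op):
        self.edit_log.append(json.dumps(op))  # 1. record in the log first
        self.apply(op)                        # 2. then update memory

    def checkpoint(self) -> str:
        """Serialize the in-memory metadata, playing the role of fsimage."""
        return json.dumps(self.tree, sort_keys=True)


def recover(fsimage: str, edit_log: list[str]) -> NamespaceSketch:
    """Load the last checkpoint, then replay the logged operations in order."""
    ns = NamespaceSketch()
    ns.tree = json.loads(fsimage)
    for record in edit_log:
        ns.apply(json.loads(record))
    return ns


if __name__ == "__main__":
    ns = NamespaceSketch()
    ns.write({"op": "mkdir", "path": "/data"})
    image = ns.checkpoint()
    ns.edit_log.clear()  # the log starts over after a checkpoint
    ns.write({"op": "create", "path": "/data/a.txt"})
    restored = recover(image, ns.edit_log)
    print(restored.tree == ns.tree)
```

The point of the sketch is the ordering: since every change reaches the log before memory is updated, the checkpoint plus the log is always enough to rebuild the in-memory state.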
The secondary metadata node helps the metadata node checkpoint its in-memory metadata to the hard disk.
The checkpoint process is as follows:
The secondary metadata node tells the metadata node to start a new log file; subsequent log records are written to the new file.
The secondary metadata node uses HTTP GET to fetch the fsimage file and the old log file from the metadata node.
The secondary metadata node loads the fsimage file into memory, applies the operations in the log file, and generates a new fsimage file.
The secondary metadata node sends the new fsimage file back to the metadata node with HTTP POST.
The metadata node replaces the old fsimage file and the old log file with the new fsimage file and the new log file (created in the first step), and then updates the fstime file with the time of the checkpoint.
After this, the fsimage file on the metadata node holds the metadata of the latest checkpoint, and the log file starts afresh, so it does not grow very large.
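The checkpoint steps above can be sketched as a small in-process simulation. It is a simplification under stated assumptions: the HTTP GET and POST transfers are replaced by plain copies, the log holds (operation, path) tuples, and the names MetadataNode, merge, and run_checkpoint are hypothetical.

```python
class MetadataNode:
    """Toy metadata node holding a checkpoint image and its log files."""

    def __init__(self):
        self.fsimage = {}      # last checkpoint: path -> kind
        self.edits = []        # current log file
        self.edits_new = None  # new log file opened while a checkpoint runs
        self.fstime = None     # time of the last checkpoint

    def log(self, op, path):
        # While a checkpoint is running, records go to the new log file.
        target = self.edits_new if self.edits_new is not None else self.edits
        target.append((op, path))

    def roll_edit_log(self):
        self.edits_new = []

    def finish_checkpoint(self, new_fsimage, checkpoint_time):
        # Swap in the new image and the new log, then record the time.
        self.fsimage = new_fsimage
        self.edits = self.edits_new
        self.edits_new = None
        self.fstime = checkpoint_time


def merge(fsimage, edits):
    """Apply logged operations to a copy of the image (the secondary node's job)."""
    tree = dict(fsimage)
    for op, path in edits:
        if op == "create":
            tree[path] = "file"
        elif op == "delete":
            tree.pop(path, None)
    return tree


def run_checkpoint(node, now):
    node.roll_edit_log()                       # start a new log file
    image = dict(node.fsimage)                 # fetch fsimage (HTTP GET in HDFS)
    old_edits = list(node.edits)               # fetch the old log file
    new_image = merge(image, old_edits)        # load image and replay the log
    node.finish_checkpoint(new_image, now)     # send back (HTTP POST) and swap


if __name__ == "__main__":
    node = MetadataNode()
    node.log("create", "/a")
    node.log("create", "/b")
    node.log("delete", "/a")
    run_checkpoint(node, now=100)
    print(node.fsimage, node.edits, node.fstime)
```

After the checkpoint in this sketch, the image reflects all earlier operations and the active log is empty again, which is what keeps the log from growing without bound.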
3. Directory structure of the secondary metadata node
4. Directory structure of data nodes
The VERSION file format of the data node is as follows:
namespaceID=1232737062
storageID=DS-1640411682-127.0.1.1-50010-1254997319480
cTime=0
storageType=DATA_NODE
layoutVersion=-18
Files with the blk_ prefix hold the HDFS data blocks themselves, that is, the raw binary data.
Files with the blk_ prefix and a .meta suffix hold each block's attribute information: version information, type information, and checksums.
When the number of blocks in a directory reaches a certain threshold, a new subfolder is created to hold further blocks and their attribute information.