2025-01-21 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/03 Report--
1. The working mechanism of datanode
1. Basic process
1) After the datanode starts, it registers with the namenode at the namenode address specified in the configuration file.
2) The namenode returns a successful registration.
3) From then on, the datanode periodically reports all of its block information to the namenode (every 1 hour by default).
4) At the same time, the datanode sends a heartbeat to the namenode every 3 seconds; the heartbeat reply can carry commands from the namenode to the datanode, such as copying a block to another machine or deleting a block. If no heartbeat is received from a datanode for more than 10 minutes (by default), the node is considered unavailable.
5) Because of this, datanode machines can be safely added to or removed from the cluster while it is running.
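The registration and heartbeat bookkeeping above can be sketched in a few lines of Python. This is a minimal illustration of the mechanism, not Hadoop's actual implementation; the class and method names are invented:

```python
import time

HEARTBEAT_INTERVAL = 3     # seconds; dfs.heartbeat.interval default
RECHECK_INTERVAL = 5 * 60  # seconds; dfs.namenode.heartbeat.recheck-interval default

class NameNode:
    """Tracks the last heartbeat time of each registered datanode."""

    def __init__(self):
        self.last_heartbeat = {}

    def register(self, datanode_id):
        # Steps 1-2: the datanode registers; the namenode records it.
        self.last_heartbeat[datanode_id] = time.time()
        return "registered"

    def heartbeat(self, datanode_id, now=None):
        # Step 4: the datanode heartbeats every 3 seconds; the reply
        # may carry commands (copy a block, delete a block, ...).
        self.last_heartbeat[datanode_id] = now if now is not None else time.time()
        return []  # list of commands for the datanode

    def is_alive(self, datanode_id, now=None,
                 timeout=2 * RECHECK_INTERVAL + 10 * HEARTBEAT_INTERVAL):
        # A datanode silent for longer than `timeout` is considered dead.
        now = now if now is not None else time.time()
        return now - self.last_heartbeat[datanode_id] <= timeout

nn = NameNode()
nn.register("dn1")
nn.heartbeat("dn1", now=1000.0)
print(nn.is_alive("dn1", now=1000.0 + 600))  # True: within the 630 s timeout
print(nn.is_alive("dn1", now=1000.0 + 700))  # False: heartbeat too old
```

The 630-second timeout here comes from the formula discussed in section 4 below.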
2. Basic directory structure
The namenode directory structure must be created manually by running `hdfs namenode -format`, whereas the datanode directory structure is created automatically at startup; no manual formatting is needed. Even if you run the namenode format command on a datanode machine, the formatted directories are useless as long as no namenode is started there. By default the datanode directory is ${hadoop.tmp.dir}/dfs/data. Its structure looks like this:
data
├── current
│   ├── BP-473222668-192.168.50.121-1558262787574    (named after the block pool ID)
│   │   ├── current
│   │   │   ├── dfsUsed
│   │   │   ├── finalized
│   │   │   │   └── subdir0
│   │   │   │       └── subdir0
│   │   │   │           ├── blk_1073741825
│   │   │   │           ├── blk_1073741825_1001.meta
│   │   │   │           ├── blk_1073741826
│   │   │   │           ├── blk_1073741826_1002.meta
│   │   │   │           ├── blk_1073741827
│   │   │   │           └── blk_1073741827_1003.meta
│   │   │   ├── rbw
│   │   │   └── VERSION
│   │   ├── scanner.cursor
│   │   └── tmp
│   └── VERSION
└── in_use.lock
(1) The contents of /data/current/VERSION are as follows:
# storage id of the datanode; not globally unique, and rarely of interest
StorageID=DS-0cb8a268-16c9-452b-b1d1-3323a4b0df60
# cluster ID, globally unique
ClusterID=CID-c12b7022-0c51-49c5-942f-edc889d37fee
# creation time; generally 0 here and rarely of interest
CTime=0
# unique identifier of the datanode, globally unique
DatanodeUuid=085a9428-9732-4486-a0ba-d75e6ff28400
# storage type is datanode
StorageType=DATA_NODE
LayoutVersion=-57
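Since VERSION is a plain `key=value` properties file with `#` comments, it is easy to read programmatically. A small sketch (the parser function is invented; the sample values are the ones shown above):

```python
def parse_version(text):
    """Parse a VERSION file: skip blank lines and '#' comments, split on '='."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key] = value
    return props

sample = """\
# id of the datanode
StorageID=DS-0cb8a268-16c9-452b-b1d1-3323a4b0df60
ClusterID=CID-c12b7022-0c51-49c5-942f-edc889d37fee
CTime=0
DatanodeUuid=085a9428-9732-4486-a0ba-d75e6ff28400
StorageType=DATA_NODE
LayoutVersion=-57
"""

props = parse_version(sample)
print(props["ClusterID"])    # CID-c12b7022-0c51-49c5-942f-edc889d37fee
print(props["StorageType"])  # DATA_NODE
```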
(2) The contents of /data/current/POOL_ID/current/VERSION are as follows:
# namespace ID of the namenode this datanode serves
NamespaceID=983105879
# creation timestamp
CTime=1558262787574
# block pool ID in use
BlockpoolID=BP-473222668-192.168.50.121-1558262787574
LayoutVersion=-57
(3) /data/current/POOL_ID/current/finalized/subdir0/subdir0 is the directory that actually stores the blocks. Each block is stored as two files:
blk_${BLOCK-ID}
blk_${BLOCK-ID}_xxx.meta
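Following the naming pattern visible in the directory listing above, the two file names can be derived from the block ID and the generation stamp (the suffix in the `.meta` name). A small sketch; the helper name is invented:

```python
def block_file_names(block_id, gen_stamp):
    """Derive the on-disk names of a block's data file and metadata file,
    following the pattern seen in the finalized/ directory listing."""
    data_file = f"blk_{block_id}"
    meta_file = f"blk_{block_id}_{gen_stamp}.meta"
    return data_file, meta_file

print(block_file_names(1073741825, 1001))
# ('blk_1073741825', 'blk_1073741825_1001.meta')
```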
For directories:
blk_${BLOCK-ID}:
An XML-format file that records operation logs, similar to the edits file, for example:
<EDITS>
  <EDITS_VERSION>-63</EDITS_VERSION>
  <RECORD>
    <OPCODE>OP_START_LOG_SEGMENT</OPCODE>
    <DATA>
      <TXID>22</TXID>
    </DATA>
  </RECORD>
  <RECORD>
    <OPCODE>OP_MKDIR</OPCODE>
    <DATA>
      <TXID>23</TXID>
      <LENGTH>0</LENGTH>
      <INODEID>16386</INODEID>
      <PATH>/input</PATH>
      <TIMESTAMP>1558105166840</TIMESTAMP>
      <PERMISSION_STATUS>
        <USERNAME>root</USERNAME>
        <GROUPNAME>supergroup</GROUPNAME>
        <MODE>493</MODE>
      </PERMISSION_STATUS>
    </DATA>
  </RECORD>
</EDITS>
blk_${BLOCK-ID}_xxx.meta:
A binary file (the `file` command reports it as "raw G3 data, byte-padded") that mainly stores the inode records of the directory.
For files:
blk_${BLOCK-ID}:
Records the actual data of the block.
blk_${BLOCK-ID}_xxx.meta:
A CRC32 checksum file that stores the checksums of the data block.
3. Verify block integrity
1) When a datanode reads a block, it computes the block's checksum. If it differs from the checksum recorded when the block was created, the copy of the block on this datanode is corrupted, and the client will instead read the block from another datanode that stores it.
2) After a block is created, the datanode also periodically checks it for corruption, again by verifying the checksum.
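The per-chunk checksum idea can be sketched with Python's `zlib.crc32`. This is an illustration of the mechanism, not the real `.meta` file format (HDFS checksums every 512 bytes by default, per `dfs.bytes-per-checksum`):

```python
import zlib

BYTES_PER_CHECKSUM = 512  # dfs.bytes-per-checksum default

def chunk_checksums(data):
    """Compute one CRC32 per 512-byte chunk, as recorded at block creation."""
    return [zlib.crc32(data[i:i + BYTES_PER_CHECKSUM])
            for i in range(0, len(data), BYTES_PER_CHECKSUM)]

def verify(data, stored_checksums):
    """On read, recompute the checksums and compare with the stored ones."""
    return chunk_checksums(data) == stored_checksums

block = b"hello hdfs" * 200        # 2000 bytes -> 4 chunks
meta = chunk_checksums(block)      # what the .meta file conceptually stores

print(verify(block, meta))         # True: the block is intact

corrupted = b"X" + block[1:]       # flip one byte
print(verify(corrupted, meta))     # False: client falls back to another replica
```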
4. Set the datanode timeout parameter
If a datanode process dies, or a datanode cannot communicate with the namenode because of a network failure, the namenode does not immediately declare the datanode dead; it does so only after no heartbeat has arrived for a period of time. The timeout is calculated as:
Timeout = 2 * dfs.namenode.heartbeat.recheck-interval + 10 * dfs.heartbeat.interval
dfs.namenode.heartbeat.recheck-interval: the interval at which the namenode checks whether datanodes are alive. Default 5 minutes; the unit is milliseconds.
dfs.heartbeat.interval: the interval at which a datanode sends heartbeats. Default 3 seconds; the unit is seconds.
Both are set in hdfs-site.xml.
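Plugging the defaults into the formula gives a timeout of 10.5 minutes, which is why the earlier "more than 10 minutes" rule of thumb holds:

```python
# Defaults from hdfs-site.xml, converted to seconds
recheck_interval = 5 * 60   # dfs.namenode.heartbeat.recheck-interval: 5 minutes
heartbeat_interval = 3      # dfs.heartbeat.interval: 3 seconds

timeout = 2 * recheck_interval + 10 * heartbeat_interval
print(timeout)        # 630 seconds
print(timeout / 60)   # 10.5 minutes
```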
5. Multi-directory configuration of datanode
The multi-directory configuration of the datanode differs from that of the namenode: the directories do not hold identical copies of the data. Instead, block data is spread across the configured directories (two in the example below). The configuration is as follows:

<!-- hdfs-site.xml -->
<property>
  <name>dfs.datanode.data.dir</name>
  <value>file:///${hadoop.tmp.dir}/dfs/data1,file:///${hadoop.tmp.dir}/dfs/data2</value>
</property>
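With multiple data directories, the datanode must pick a volume for each new block; Hadoop's default policy is round-robin. A minimal sketch of that placement idea (the class name is invented; this is not Hadoop's actual `RoundRobinVolumeChoosingPolicy` code):

```python
class RoundRobinVolumes:
    """Place each new block on the next data directory in turn."""

    def __init__(self, dirs):
        self.dirs = dirs
        self.next = 0

    def choose(self):
        # Return the current directory and advance the cursor cyclically.
        chosen = self.dirs[self.next]
        self.next = (self.next + 1) % len(self.dirs)
        return chosen

vols = RoundRobinVolumes(["/dfs/data1", "/dfs/data2"])
print([vols.choose() for _ in range(4)])
# ['/dfs/data1', '/dfs/data2', '/dfs/data1', '/dfs/data2']
```

So consecutive blocks alternate between the two directories rather than being mirrored into both.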
6. About the actual size of a block
Although the block size is 128 MB (Hadoop 2.x), a block is counted as one block in HDFS metadata even if the data it stores is smaller than 128 MB. On disk, however, it occupies only the actual size of the data, not the full 128 MB: the physical disk allocates space in 4 KB blocks by default, so there is no need to reserve 128 MB up front.
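A quick worked example under the sizes stated above (128 MB logical block, 4 KB physical disk block), showing that a 1 MB file uses one HDFS block but only about 1 MB of disk:

```python
BLOCK_SIZE = 128 * 1024 * 1024  # 128 MB logical HDFS block size (Hadoop 2.x)
DISK_BLOCK = 4 * 1024           # 4 KB physical disk allocation unit

file_size = 1 * 1024 * 1024     # a 1 MB file

# Number of HDFS blocks (metadata-wise): ceil(file_size / BLOCK_SIZE)
hdfs_blocks = -(-file_size // BLOCK_SIZE)

# Bytes actually occupied on disk: rounded up to whole 4 KB disk blocks
disk_usage = -(-file_size // DISK_BLOCK) * DISK_BLOCK

print(hdfs_blocks)  # 1
print(disk_usage)   # 1048576  (1 MB, not 134217728)
```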