
II. HDFS Architecture

2025-02-24 Update From: SLTechnology News&Howtos


Shulou(Shulou.com)06/03 Report--

[TOC]

I. Overview of the HDFS system composition

HDFS is a distributed file system suited to write-once, read-many scenarios. It includes the following roles:

NameNode (nn): stores file metadata, such as the file name, directory structure, and file attributes, as well as the block list of each file and the DataNodes on which each block is stored. It responds to client read and write operations on HDFS, such as creating directories and uploading files, and keeps read/write logs.

DataNode (dn): stores block data, and the checksums of that block data, in the local file system.

SecondaryNameNode (snn): an auxiliary daemon that monitors the state of HDFS and takes snapshots of HDFS metadata at regular intervals, which amounts to backing up the NameNode's metadata (though, as discussed below, it is not a hot standby for the NameNode itself).

II. HDFS-NameNode

The NameNode's primary responsibility is to manage all the nodes of HDFS:

1. Responds to client requests to HDFS, such as create, delete, update, and query operations.

2. Manages and maintains HDFS metadata and its edit logs (these are edit logs, not ordinary runtime logs).

The NameNode creates a dfs/name/ directory under the hadoop.tmp.dir directory specified in core-site.xml. Let's take a look at the structure of this directory:

[root@bigdata121 tmp]# tree dfs/name
dfs/name
├── current
│   ├── edits_0000000000000000001-0000000000000000002
│   ├── edits_0000000000000000003-0000000000000000004
│   ├── edits_0000000000000000005-0000000000000000006
│   ├── edits_0000000000000000007-0000000000000000008
│   ├── edits_0000000000000000009-0000000000000000009
│   ├── edits_0000000000000000010-0000000000000000011
│   ├── edits_0000000000000000012-0000000000000000013
│   ├── edits_0000000000000000014-0000000000000000015
│   ├── edits_0000000000000000016-0000000000000000017
│   ├── edits_0000000000000000018-0000000000000000019
│   ├── edits_0000000000000000020-0000000000000000021
│   ├── edits_0000000000000000022-0000000000000000024
│   ├── edits_0000000000000000025-0000000000000000026
│   ├── edits_inprogress_0000000000000000027
│   ├── fsimage_0000000000000000024
│   ├── fsimage_0000000000000000024.md5
│   ├── fsimage_0000000000000000026
│   ├── fsimage_0000000000000000026.md5
│   ├── seen_txid
│   └── VERSION
└── in_use.lock

The functions of each file directory are as follows:

1. current

It mainly stores the metadata and edit logs for the data stored in HDFS.

(1) edits file

This is a binary file that records the add, delete, and modify operations performed on HDFS, similar to MySQL's binary log. edits_inprogress_xxxxx is the latest edits log, the one currently in use.

You can use the command to view the contents of the edits file:

// Format: hdfs oev -i <input file> -o <output file> (the output is XML)
[root@bigdata121 current]# hdfs oev -i edits_inprogress_0000000000000000038 -o /tmp/edits_inprogress.xml
[root@bigdata121 current]# cat /tmp/edits_inprogress.xml
<?xml version="1.0" encoding="UTF-8"?>
<EDITS>
  <EDITS_VERSION>-63</EDITS_VERSION>
  <RECORD>
    <OPCODE>OP_START_LOG_SEGMENT</OPCODE>
    <DATA>
      <TXID>38</TXID>
    </DATA>
  </RECORD>
  <RECORD>
    <OPCODE>OP_ADD_BLOCK</OPCODE>
    <DATA>
      <TXID>34</TXID>
      <PATH>/jdk-8u144-linux-x64.tar.gz._COPYING_</PATH>
      ...
    </DATA>
  </RECORD>
</EDITS>

OPCODE indicates the category of the operation: OP_START_LOG_SEGMENT means the log has started recording, and an OP_ADD_BLOCK record like the one for /jdk-8u144-linux-x64.tar.gz._COPYING_ represents a file-upload operation. TXID (38, 34) is the unique id of each operation.

(2) fsimage file

This is the metadata file for the data in HDFS. It records information about the individual blocks in the file system, but it is not up to date: the edits files must be merged into it regularly for it to become current. You can view the contents of a fsimage file with the command:

// Format: hdfs oiv -p <output format> -i <input file> -o <output file>
[root@bigdata121 current]# hdfs oiv -p XML -i fsimage_0000000000000000037 -o /tmp/fsimage37.xml
[root@bigdata121 current]# cat /tmp/fsimage37.xml
(output: an XML listing of every inode — directories such as / and /input with permissions root:supergroup:0755, and files such as jdk-8u144-linux-x64.tar.gz with permissions root:supergroup:0644, timestamps, block size 134217728, and block ids 1073741825 and 1073741826)

The information recorded here is more detailed: file metadata such as permissions and timestamps is included.

(3) seen_txid

A txid is similar to an event id: each operation gets one as its identity. This file records the txid after the latest one; that is, if the current last txid is 37, the file records 38.

(4) The relationship between fsimage and edits file naming

Edits file:

As we can see, the edits files are named in the form edits_00000xxx-00000xxx, which gives the range of txids recorded in that edits file. edits_inprogress_00000xxx is the segment currently being written; the number is its starting txid.

Fsimage file:

Named fsimage_000000xxx, where the number is the latest txid that has been merged into this fsimage file. Note that edits files are merged into fsimage only when the checkpoint condition is triggered; until then no merge happens. So in general, the txids in the edits file names are larger than those in the fsimage names.
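A consequence of these naming rules is that, on startup, the NameNode only needs to replay the edits segments whose end txid exceeds the newest fsimage's txid. A minimal sketch of that selection (the helper name and numbers are illustrative, not HDFS code):

```python
# Sketch: given the newest fsimage's txid and the txid ranges of the edits
# segments, pick the segments whose operations are not yet in the fsimage.
def segments_to_replay(fsimage_txid, edits_segments):
    """edits_segments: list of (start_txid, end_txid) tuples."""
    return [(s, e) for (s, e) in edits_segments if e > fsimage_txid]

# fsimage_...026 plus edits segments like those in the listing above:
segments = [(1, 24), (25, 26), (27, 38)]
print(segments_to_replay(26, segments))  # only the 27-38 segment is replayed
```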

2. in_use.lock

This file locks the current storage directory to prevent the machine from starting multiple NameNodes at the same time; only one NameNode can be started.
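The same single-instance effect can be sketched with an exclusive, non-blocking file lock on a Unix system; this only illustrates the idea and is not HDFS's actual locking code (the helper name and paths are made up):

```python
# Sketch: an exclusive lock on an in_use.lock file so that a second
# process (or a second attempt) cannot claim the same storage directory.
import fcntl
import os
import tempfile

def try_lock(dirpath):
    """Return the open lock file on success, or None if already locked."""
    f = open(os.path.join(dirpath, "in_use.lock"), "w")
    try:
        fcntl.flock(f, fcntl.LOCK_EX | fcntl.LOCK_NB)
        return f
    except BlockingIOError:
        f.close()
        return None

d = tempfile.mkdtemp()
first = try_lock(d)    # succeeds: this "NameNode" now owns the directory
second = try_lock(d)   # fails: the directory is already in use
print(first is not None, second is None)
```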

III. HDFS-DataNode

The DataNode mainly stores the block files that hold the data. A dfs/data directory is created under the specified directory; let's take a look at its structure:

[root@bigdata122 dfs]# tree data
data
├── current
│   ├── BP-1130553825-192.168.50.121-1557922928723
│   │   ├── current
│   │   │   ├── finalized
│   │   │   │   └── subdir0
│   │   │   │       └── subdir0
│   │   │   │           ├── blk_1073741825
│   │   │   │           ├── blk_1073741825_1001.meta
│   │   │   │           ├── blk_1073741826
│   │   │   │           └── blk_1073741826_1002.meta
│   │   │   ├── rbw
│   │   │   └── VERSION
│   │   ├── scanner.cursor
│   │   └── tmp
│   └── VERSION
└── in_use.lock
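Each blk_ file is accompanied by a .meta file holding checksums of the block's data, computed chunk by chunk (HDFS checksums fixed-size chunks, 512 bytes per checksum by default). A simplified sketch of that idea, not the real .meta on-disk format:

```python
# Sketch: per-chunk checksums over a block's bytes, as kept alongside each
# block in the blk_xxx_yyy.meta file (simplified illustration only).
import zlib

CHUNK = 512  # bytes covered by each checksum

def block_checksums(data):
    """One CRC32 per 512-byte chunk of the block."""
    return [zlib.crc32(data[i:i + CHUNK]) for i in range(0, len(data), CHUNK)]

data = b"x" * 1300                 # a tiny "block": chunks of 512, 512, 276
print(len(block_checksums(data)))  # 3 checksums
```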

In HDFS, files are split into blocks of a fixed size for storage.

The default block size in Hadoop 1.x is 64 MB.

The default block size in Hadoop 2.x is 128 MB.

Hadoop 3.x no longer relies solely on multi-replica storage; it also supports erasure coding.

Please see https://www.cnblogs.com/basenet855x/p/7889994.html

The blk_xxxxx files in the directory above are the actual block files; each is at most the specified block size.
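The splitting rule can be sketched as follows (assuming the Hadoop 2.x default of 128 MB; only the last block may be smaller; this is an illustration, not HDFS code):

```python
# Sketch: how a file's bytes map onto fixed-size HDFS blocks.
BLOCK_SIZE = 128 * 1024 * 1024  # Hadoop 2.x default, 134217728 bytes

def split_into_blocks(file_size, block_size=BLOCK_SIZE):
    """Return the size of each block a file of file_size bytes occupies."""
    sizes = []
    remaining = file_size
    while remaining > 0:
        sizes.append(min(block_size, remaining))
        remaining -= block_size
    return sizes

# A 300 MB file becomes two full 128 MB blocks plus one 44 MB block.
print(split_into_blocks(300 * 1024 * 1024))
```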

IV. HDFS-SecondaryNameNode

This is an auxiliary daemon that monitors the state of HDFS and assists the NameNode (it is not a backup of the NameNode). Its main job is to merge the edits files into the fsimage file:

1. Merges the edits files into fsimage when the checkpoint time interval elapses (3600 seconds by default) or when the edits file reaches 64 MB, whichever trigger condition is met first.

2. After the edits have been merged into fsimage, the edits file can be emptied.
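Both triggers are configurable. In Hadoop 2.x the checkpoint settings live in hdfs-site.xml; the property names below are from the stock hdfs-default.xml (note that in 2.x the size trigger is expressed as an uncheckpointed-transaction count rather than 64 MB of edits):

```xml
<!-- hdfs-site.xml: SecondaryNameNode checkpoint triggers (Hadoop 2.x names) -->
<property>
  <name>dfs.namenode.checkpoint.period</name>
  <value>3600</value>    <!-- seconds between checkpoints -->
</property>
<property>
  <name>dfs.namenode.checkpoint.txns</name>
  <value>1000000</value> <!-- checkpoint early once this many txns accumulate -->
</property>
```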
