
Hadoop Secondarynamenode principle

2025-04-27 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)06/03 Report--

File storage of namenode

The namenode stores its data in two files, the fsimage file and the edits file. The edits file records every operation performed on the namenode, so it works like a log. The fsimage file holds a snapshot of the namenode's data. When the namenode starts, it loads the fsimage data into memory and then parses the operations recorded in the edits file into memory as well. Merging the two yields the namenode's complete information.
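The startup sequence described above can be sketched in a few lines of Python. This is only a toy model: the dictionary layout, the operation names (`create`, `rename`, `delete`) and the helpers `apply_edit` and `load_namespace` are assumptions made for illustration and do not reflect Hadoop's real on-disk formats or API.

```python
import copy

def apply_edit(namespace, op):
    """Apply one logged operation to the in-memory namespace (path -> metadata)."""
    kind, path, *rest = op
    if kind == "create":
        namespace[path] = {"replication": rest[0]}
    elif kind == "delete":
        namespace.pop(path, None)
    elif kind == "rename":
        namespace[rest[0]] = namespace.pop(path)
    else:
        raise ValueError(f"unknown operation: {kind}")

def load_namespace(fsimage, edits):
    """Start from the fsimage snapshot, then replay the edits log in order."""
    namespace = copy.deepcopy(fsimage)  # leave the snapshot itself untouched
    for op in edits:
        apply_edit(namespace, op)
    return namespace

# Hypothetical snapshot and log contents.
fsimage = {"/data/a.txt": {"replication": 3}}
edits = [
    ("create", "/data/b.txt", 2),
    ("rename", "/data/a.txt", "/data/c.txt"),
    ("delete", "/data/b.txt"),
]

namespace = load_namespace(fsimage, edits)
print(namespace)  # {'/data/c.txt': {'replication': 3}}
```

The longer the edits list, the more work `load_namespace` has to do, which is the startup cost that the rest of the article is concerned with.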

The role of secondarynamenode

The secondarynamenode merges the edits file and the fsimage file according to configured rules. After a merge, the namenode switches to a new edits file. This keeps the edits file small, which in turn reduces the time the namenode needs to parse and load the edits file during startup.
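A minimal sketch of that merge, assuming a toy namenode that keeps its checkpointed paths in a set and its log in a list. The class `ToyNameNode` and the functions `roll_edits` and `merge` are hypothetical names for this illustration, not Hadoop classes.

```python
class ToyNameNode:
    """Toy stand-in for a namenode: a checkpointed image plus an edits log."""

    def __init__(self):
        self.fsimage = set()  # paths present in the last checkpoint
        self.edits = []       # operations logged since that checkpoint

    def log(self, kind, path):
        self.edits.append((kind, path))

    def roll_edits(self):
        """Hand over the current edits log and start a new, empty one."""
        old, self.edits = self.edits, []
        return old

def merge(fsimage, edits):
    """Fold the logged operations into a copy of the image (the secondary's job)."""
    merged = set(fsimage)
    for kind, path in edits:
        if kind == "create":
            merged.add(path)
        elif kind == "delete":
            merged.discard(path)
    return merged

nn = ToyNameNode()
nn.log("create", "/a")
nn.log("create", "/b")
nn.log("delete", "/a")

old_edits = nn.roll_edits()                # namenode switches to a new edits file
nn.log("create", "/c")                     # arrives during the merge, goes to the new log
nn.fsimage = merge(nn.fsimage, old_edits)  # merged image replaces the old one

print(sorted(nn.fsimage), nn.edits)  # ['/b'] [('create', '/c')]
```

After the merge, only the operation that arrived afterwards is left in the log, so a restart has one entry to replay instead of four.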

Secondarynamenode merge file rules

fs.checkpoint.period sets the interval between checkpoint merges. The default is 3600 s.

fs.checkpoint.size sets the edits file size that triggers a checkpoint merge. The default is 64 MB.

A merge is performed as soon as either of the two conditions is met.
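The either-or rule can be written as a small predicate. The function name `should_checkpoint` and its parameters are assumptions for this sketch, and whether the real comparison is strict or inclusive is not stated in the article; only the two default values come from the text.

```python
CHECKPOINT_PERIOD_S = 3600                # default of fs.checkpoint.period, in seconds
CHECKPOINT_SIZE_BYTES = 64 * 1024 * 1024  # default of fs.checkpoint.size, 64 MB

def should_checkpoint(seconds_since_last, edits_size_bytes,
                      period_s=CHECKPOINT_PERIOD_S,
                      size_bytes=CHECKPOINT_SIZE_BYTES):
    """Return True when either the time rule or the size rule is satisfied."""
    # Inclusive comparison is an assumption of this sketch.
    return seconds_since_last >= period_s or edits_size_bytes >= size_bytes

print(should_checkpoint(3600, 0))                   # True: the period has elapsed
print(should_checkpoint(10, 64 * 1024 * 1024))      # True: the edits file hit the size limit
print(should_checkpoint(10, 1024))                  # False: neither rule applies yet
```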

Schematic diagram of working principle

Architecture analysis

How do the fsimage and edits files differ, and why does the namenode store its data in two separate files?

The fsimage file stores the serialized information for all directories and files, while the edits file records every write or update. While the namenode is running, it writes only the relevant operation and file information to the edits file.

The data is split across two files because fsimage holds all of the namenode's information and is therefore usually large. Writing to such a large file consumes system resources and slows the system's response. The edits file, by contrast, is usually smaller than fsimage because the secondarynamenode merges it regularly, so writing updates to the edits file consumes fewer system resources.
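The difference in write cost can be made concrete with a rough comparison. The sketch below uses JSON as a stand-in serialization (an assumption; Hadoop uses its own binary formats) and a made-up namespace of 10,000 files to compare rewriting the full image against appending one log record.

```python
import json

def full_image_bytes(namespace):
    """Bytes written if the whole namespace were re-serialized for one change."""
    return len(json.dumps(namespace).encode("utf-8"))

def edit_record_bytes(op):
    """Bytes written if only the operation is appended to the edits log."""
    return len(json.dumps(op).encode("utf-8"))

# Hypothetical namespace and a single new-file operation.
namespace = {f"/data/file{i}": {"replication": 3} for i in range(10_000)}
op = ["create", "/data/new_file", 3]

rewrite_cost = full_image_bytes({**namespace, op[1]: {"replication": op[2]}})
append_cost = edit_record_bytes(op)

print(f"rewrite full image: {rewrite_cost} bytes")
print(f"append one edit:    {append_cost} bytes")
```

The rewrite cost grows with the size of the namespace, while the append cost depends only on the single operation being logged.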

Why was the secondarynamenode introduced, and what is the problem with using the namenode alone?

Because the namenode splits its data across two files and the edits file must not grow too large, the files have to be merged, and merging consumes resources such as system memory. If the namenode performed the merge itself, its ability to manage the file system would degrade while the merge was running. In addition, because the secondarynamenode is separate from the namenode, the two can be deployed on different machines, which improves the stability and security of the system. Finally, thanks to the secondarynamenode's checkpoint, system data can be recovered from that checkpoint if the namenode goes down completely, although any changes made after the checkpoint are lost.
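The recovery trade-off mentioned above can be illustrated with a toy example. The paths, the operations and the `replay` helper are hypothetical; the point is only that whatever was logged after the last checkpoint is missing from the recovered state.

```python
def replay(image, ops):
    """Apply logged operations to a copy of the image and return the result."""
    state = set(image)
    for kind, path in ops:
        if kind == "create":
            state.add(path)
        elif kind == "delete":
            state.discard(path)
    return state

# Last image held by the secondarynamenode (hypothetical contents).
checkpoint_image = {"/a", "/b"}
# Operations that existed only on the failed namenode, logged after the checkpoint.
edits_after_checkpoint = [("create", "/c"), ("delete", "/a")]

state_before_crash = replay(checkpoint_image, edits_after_checkpoint)
recovered = set(checkpoint_image)  # recovery can only restore the checkpoint

# Paths whose status differs between the real state and the recovered one.
lost_changes = state_before_crash ^ recovered
print(sorted(lost_changes))  # ['/a', '/c']
```

Here the recovered namespace still contains `/a`, which had been deleted, and is missing `/c`, which had been created, because both changes happened after the checkpoint.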

-Shi Longgang
