This article explains "how HDFS manages upgrades". The content is easy to follow and clearly organized, and we hope it resolves your doubts as we study HDFS upgrade management below.
Summary of the procedures and commands for upgrading HDFS
In the official Hadoop documentation, the recommended HDFS upgrade consists of three steps:
1. Stop the HDFS service, start it again so that HDFS merges the FsEditLog into the FsImage, and then stop the HDFS service once more.
2. Back up the NameNode meta files. In the configuration of the new HDFS installation directory, point the NameNode meta file directory at the old meta file directory, then start HDFS with the -upgrade option so that the HDFS service performs the upgrade.
3. After the upgrade progress reaches 100%, execute hadoop dfsadmin -finalizeUpgrade to tell the HDFS service that the upgrade is over. If the upgrade fails, start HDFS with the -rollback option and perform a rollback.
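A sketch of the corresponding commands, assuming a typical Hadoop 1.x/2.x layout with the start/stop scripts on the PATH (the backup paths here are illustrative, not prescribed by the documentation):

```bash
# Step 1: restart once so HDFS merges FsEditLog into FsImage, then stop.
stop-dfs.sh
start-dfs.sh
stop-dfs.sh

# Step 2: back up the NameNode meta directory (dfs.namenode.name.dir),
# point the new version's configuration at the old meta directory,
# then start HDFS with the upgrade option.
cp -r /data/dfs/name /backup/dfs-name-before-upgrade   # illustrative paths
start-dfs.sh -upgrade

# Step 3: once the upgrade has reached 100%, finalize it;
# on failure, roll back instead.
hadoop dfsadmin -finalizeUpgrade
# start-dfs.sh -rollback   # only on failure
```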
Judged by its own mission, the purpose of an upgrade is to protect the existing data. If data is lost during the upgrade, or the entire HDFS file system is damaged, then the upgrade has failed. After reading the upgrade process and commands in the official Hadoop documentation, I wondered whether such a simple description could guarantee that the upgrade would actually achieve this goal, or at least guarantee that the HDFS file system itself would not be damaged!
Doubt is doubt, but conclusions should wait until we understand the process. Since there is very little published information about HDFS upgrades, here we analyze the source code and strive to get as close as possible to the authors' intent in designing the upgrade.
Storage structure of NameNode
In the HDFS file system, one of NameNode's basic responsibilities is to allocate blocks to DataNodes and to tell clients which machine a given block's contents can be read from. Block information is therefore vital to the HDFS file system. To avoid losing block information, and to let clients and DataNodes quickly find out how to retrieve it, the HDFS developers borrowed from the write-ahead logging (WAL) technique used by databases when designing NameNode's storage structure, while at the same time adopting a rather unusual approach of their own.
To ensure that block information is not lost, when HDFS allocates a block it first writes a log record of the block information to the disk file system; in HDFS this log is called the fsEditLog. After the log has been written successfully, the block information is added to a Map in the JVM heap that can store entries with duplicate hash values. If the operation deletes a block, the corresponding entry is removed from this Map. This Map holds all of the block information in HDFS, rather than caching just part of the data as is usual: in NameNode's storage, the amount of block information on disk equals the amount held in the Map. To keep the Map within the JVM heap limit, when NameNode starts, 2% of the JVM -Xmx startup parameter value is used to calculate the maximum number of entries the Map can hold; once the Map reaches this maximum, no new blocks can be added. On the disk side, block information is stored mainly in the fsImage, and the block operation log in the fsEditLog. Operations on blocks since the last transaction point in the fsImage file are first appended sequentially to the fsEditLog rather than written directly to the fsImage. When the HDFS service starts, it launches a checkpointer component that periodically merges the current fsImage and fsEditLog into a new fsImage.
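To see these structures for yourself, Hadoop 2.x ships offline viewers for both files; a minimal sketch (the input file names are taken from the listing that follows, and exact flags can vary between releases):

```bash
# Dump an fsimage to XML with the Offline Image Viewer.
hdfs oiv -p XML -i fsimage_0000000000000000000 -o fsimage.xml

# Dump an edit-log segment to XML with the Offline Edits Viewer.
hdfs oev -i edits_0000000000000000001-0000000000000000007 -o edits.xml
```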
The internal physical structure of NameNode's meta directory is shown in the following listing:
```
current
|-- edits_0000000000000000001-0000000000000000007
|-- edits_inprogress_0000000000000000008
|-- fsimage_0000000000000000000
|-- fsimage_0000000000000000000.md5
|-- seen_txid
`-- VERSION
```
From a storage perspective, this structure can only load all block information from the fsImage into the in-memory Map when HDFS starts, or write the in-memory Map out to the disk side when needed. For operations such as adding, deleting, and changing blocks, the memory side and the disk side are independent, and the log ensures that the block information on both ends is eventually consistent. The advantage of this design is that it is simple and easy to implement; the disadvantage is that the memory side is limited in size and not easy to scale.
Storage structure of DataNode
Compared with NameNode's storage structure for blocks, the storage structure for blocks on DataNode is much simpler: DataNode only stores block/meta file pairs. After a block is written to DataNode, a meta file with the same name as the block is written alongside it, recording metadata about the block such as its size and generation stamp. Within DataNode, the life cycle of a block is divided into detailed stages. When a new block is being appended to a file, the block lives in the rbw directory (in 1.x, the BlockBeingWritten directory). When the block reaches the system-specified block size, or the file is closed, the block becomes finalized and DataNode moves it to the current/finalized directory.
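On disk, each block therefore appears as a data file plus a meta file; a hypothetical listing from a finalized directory (the block ID and generation stamp are invented for illustration):

```bash
$ ls finalized/
blk_1073741825             # the block's data
blk_1073741825_1001.meta   # per-block metadata file, named after the block
```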
In HDFS 2.x, data block storage is managed through BlockPoolSlice; logically, one BlockPoolSlice corresponds to one directory in the DataNode configuration item dfs.datanode.data.dir. Creating new blocks on DataNode, as well as moving and deleting blocks, are logical operations carried out through the BlockPoolSlice. The physical structure of a BlockPoolSlice is shown in the following listing:
```
|-- current
|   |-- BP-1134876178-10.1.101.51-1427874067591
|   |   |-- current
|   |   |   |-- dfsUsed
|   |   |   |-- finalized
|   |   |   |-- rbw
|   |   |   `-- VERSION
|   |   |-- dncp_block_verification.log.curr
|   |   |-- dncp_block_verification.log.prev
|   |   `-- tmp
|   `-- VERSION
|-- detach
|-- in_use.lock
|-- storage
`-- tmp
```

Upgrade of NameNode storage
For the upgrade of the NameNode meta files: when NameNode starts with the -upgrade command option, the current directory in the meta folder is upgraded. Assume for now that HDFS has just been restarted and no new blocks have been added, so the fsEditLog is empty. The following steps are performed on this directory (a shell sketch of the directory transitions follows the list).
If there is a previous directory in the meta folder, delete it
Rename current to previous.tmp
Create a new current directory
Upgrade the VERSION file under the previous.tmp directory to the currently installed version and write it to the current directory
Load the fsImage under previous.tmp, bringing its files and blocks into memory; the loading process is backward compatible, and the loaded files and blocks are converted to the current installed version's format.
The genStamp, clusterId, and numFiles information from the old fsImage remains unchanged, but the file names of the old fsImage and fsEditLog may change.
After the fsImage has been loaded successfully, write the in-memory files and blocks into the disk's current directory in the latest format; at this point the NameNode meta file upgrade operation is declared complete!
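A minimal shell sketch of the directory transitions just described (illustrative only; in reality the NameNode code performs these steps itself, and the fsImage conversion shown as a comment happens in memory):

```bash
cd /data/dfs/name            # dfs.namenode.name.dir (illustrative path)
rm -rf previous              # step 1: remove any leftover previous directory
mv current previous.tmp      # step 2: set the old meta aside
mkdir current                # step 3: fresh directory for the new layout
# steps 4-6: NameNode writes an upgraded VERSION file into current/,
# loads the old fsImage from previous.tmp/ into memory, and saves the
# files and blocks back into current/ in the new format.
mv previous.tmp previous     # on success, previous.tmp becomes previous
```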
From this process you can see that during the upgrade, NameNode does not operate directly on the old fsImage and fsEditLog; it creates a new folder instead, which is very safe. With that, the first half of the doubts raised at the beginning is resolved.
Upgrade of DataNode storage
The DataNode upgrade process is similar to the NameNode fsImage upgrade, but the concrete operations differ greatly. Data blocks are stored mainly on DataNodes; if upgrading meant copying data blocks to a new directory, copying PB- or even EB-scale data would make the process unthinkable. The designers found a suitable solution for the DataNode upgrade.
For the physical folder of a BlockPoolSlice, the upgrade performs the following steps (a hard-link sketch follows the list).
If a previous directory exists in the BlockPoolSlice's physical folder, delete it
Rename the current directory to previous.tmp
Create the current/{bpid}/current directory under the DataNode data directory, where bpid is the block pool ID, a new term introduced in HDFS 2.x to manage block pool slices
Traverse all the blocks under the previous.tmp directory, create hard links for them, and place the links in the current/{bpid}/current directory. During this process, the name of a block file may change depending on the version span. Recall the upgrade of block information in NameNode: because a block file's name is derived mainly from a generation stamp and a sequence number, as long as NameNode and DataNode rename block files by the same rules, HDFS remains consistent at the user level after the upgrade.
Rename the previous.tmp directory to previous; this tells HDFS that the block pool slice upgrade is complete
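The heart of step 4 can be sketched in shell, ignoring any renames between layouts (the real implementation does this in Java, per storage directory):

```bash
BPID=BP-1134876178-10.1.101.51-1427874067591   # block pool ID from the listing above
mkdir -p "current/$BPID/current"
# Hard-link every block and meta file from the old tree into the new one;
# no block data is copied, so this is fast and cheap on disk space.
( cd previous.tmp && find . -type f | while read -r f; do
    mkdir -p "../current/$BPID/current/$(dirname "$f")"
    ln "$f" "../current/$BPID/current/$f"
  done )
mv previous.tmp previous     # step 5: this slice's upgrade is complete
```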
Hard links are an advanced feature of many file systems. Such file systems have two kinds of on-disk structures: the INode, which stores a file's index information, and the DataBlock, which stores the file's actual data. A file name is a directory entry that points to an INode, and the INode in turn points to the file's DataBlocks. Creating a hard link adds another directory entry, possibly under a different name, that points to the same INode and increments the INode's link count; the original file name is itself just such a link. On these file systems, deleting a file only removes one directory entry. The DataBlocks can be reclaimed by the file system only when no link points to the INode any more; as long as another link remains, the data will not be recycled. INodes are also very small compared with data: in Ext4, for example, an INode occupies 512 bytes while a DataBlock is at least 4 KB, and when a disk is formatted the system allocates roughly one INode for every eight DataBlocks.
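The sharing can be observed directly with ls -i, which prints inode numbers (the file names here are hypothetical):

```bash
$ echo "block data" > blk_1073741825
$ ln blk_1073741825 blk_1073741825.lnk   # new name, same inode
$ ls -li blk_1073741825*                 # first column: inode; link count is 2
2883618 -rw-r--r-- 2 hdfs hdfs 11 Apr  5 10:00 blk_1073741825
2883618 -rw-r--r-- 2 hdfs hdfs 11 Apr  5 10:00 blk_1073741825.lnk
$ rm blk_1073741825                      # the data survives via the other link
```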
Therefore, upgrading DataNode via hard links requires far less disk space, and logically it cannot damage the file system.
Rollback and finalizeUpgrade
If the upgrade fails, you can use the -rollback command option to roll HDFS back to its pre-upgrade state. Since both versions of the HDFS file system information exist on disk at this point, the rollback is really just a sequence of folder renames, as shown in the following steps (a shell sketch follows the list):
Rename the current directory to removed.tmp
Rename the previous directory to current
Delete the removed.tmp directory
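Expressed as shell operations on one storage directory (an illustrative sketch):

```bash
mv current removed.tmp   # set the failed upgrade aside
mv previous current      # the pre-upgrade state becomes current again
rm -rf removed.tmp       # discard the upgraded data
```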
At this point, the details of the HDFS upgrade have been more or less covered, and we can be confident that HDFS can upgrade successfully; rolling back safely is likewise not a problem. Even after the HDFS program has correctly executed the upgrade steps, one command still needs to be run manually:
hadoop dfsadmin -finalizeUpgrade
This lets HDFS delete the pre-upgrade files; after this command has been executed, you can no longer roll back to the previous version of HDFS.
That is the full content of "how HDFS manages upgrades". Thank you for reading! We hope the article has been helpful; if you want to learn more, welcome to follow the industry information channel.