The method of managing Hadoop

Updated 2025-07-02. From: SLTechnology News&Howtos (Shulou.com), 06/02 report.

This article outlines how to administer a Hadoop cluster: the persistent data structures HDFS keeps on disk, safe mode, the administration tools, monitoring, and routine maintenance such as backups, adding and removing nodes, and upgrades.

HDFS persistent data structures

1. Namenode directory structure

The VERSION file in detail:

    # Thu Dec 15 10:07:46 CST 2016
    namespaceID=1277563549
    clusterID=CID-a4ff16ba-4427-4f8a-bbaf-4665b3ce714b
    cTime=0
    storageType=NAME_NODE
    blockpoolID=BP-1697576408-127.0.0.1-1481767666542
    layoutVersion=-63

layoutVersion: the version number of the HDFS metadata layout. It usually changes only when new features are added to HDFS.

namespaceID: the unique identifier of the file system namespace, created when the namenode is first formatted.

clusterID: the unique identifier of the HDFS cluster as a whole.

blockpoolID: the unique identifier of the block pool, which contains all the files in the namespace managed by one namenode.

cTime: the creation time of the namenode's storage. For newly formatted storage the value is 0.

storageType: indicates that this storage directory contains the data structures of a namenode.
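Because VERSION is a plain key=value properties file, it is easy to inspect from a script. The following is a minimal sketch, not a Hadoop API: the function name `parse_version_file` is made up for this example, and the sample text is the file shown above.

```python
def parse_version_file(text: str) -> dict[str, str]:
    """Parse key=value lines of a VERSION file, skipping blanks and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


# Sample content copied from the VERSION file shown in this article.
SAMPLE_VERSION = """\
# Thu Dec 15 10:07:46 CST 2016
namespaceID=1277563549
clusterID=CID-a4ff16ba-4427-4f8a-bbaf-4665b3ce714b
cTime=0
storageType=NAME_NODE
blockpoolID=BP-1697576408-127.0.0.1-1481767666542
layoutVersion=-63
"""

props = parse_version_file(SAMPLE_VERSION)
print(props["storageType"], int(props["layoutVersion"]))
```

On a real cluster the text would be read from the VERSION file in the namenode's storage directory instead of from a string.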

File system image and edit log

When a file system client performs a write operation, such as creating or moving a file, the transaction is first recorded in the edit log. The namenode keeps the file system metadata in memory; after the edit log has been modified, the in-memory metadata is updated to match.

Each fsimage file is a complete persistent checkpoint of the file system metadata.
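The relationship between the edit log and the in-memory metadata can be illustrated with a toy model. This is only a sketch of the log-first idea: `ToyNamenode` and `apply_edit` are hypothetical names, the namespace is reduced to a set of paths, and none of this mirrors Hadoop's actual classes or on-disk formats.

```python
def apply_edit(namespace: set, op: str, args: tuple) -> None:
    """Apply one logged operation to a namespace (a set of paths)."""
    if op == "create":
        namespace.add(args[0])
    elif op == "rename":
        src, dst = args
        namespace.remove(src)
        namespace.add(dst)
    else:
        raise ValueError(f"unknown operation: {op}")


class ToyNamenode:
    def __init__(self):
        self.namespace = set()  # in-memory metadata
        self.edit_log = []      # (txid, op, args) records
        self.next_txid = 1

    def _log_then_apply(self, op, *args):
        # Record the transaction first, then update the in-memory metadata.
        self.edit_log.append((self.next_txid, op, args))
        self.next_txid += 1
        apply_edit(self.namespace, op, args)

    def create(self, path):
        self._log_then_apply("create", path)

    def rename(self, src, dst):
        self._log_then_apply("rename", src, dst)


nn = ToyNamenode()
nn.create("/user/a.txt")
nn.rename("/user/a.txt", "/user/b.txt")

# Replaying the log on an empty namespace reconstructs the same metadata.
replayed = set()
for _txid, op, args in nn.edit_log:
    apply_edit(replayed, op, args)
print(replayed == nn.namespace)
```

The replay loop at the end is what makes a restart slow when the edit log has grown large, which motivates checkpointing.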

Speeding up a slow namenode restart

Run a secondary namenode, which creates checkpoints as follows:

1. The secondary namenode asks the namenode to stop using its current edits_inprogress file, so that new edits are recorded in a new edits_inprogress file. The namenode also updates the seen_txid file in all of its storage directories.

2. The secondary namenode retrieves the latest fsimage and edits files from the namenode using HTTP GET.

3. The secondary namenode loads the fsimage file into memory, applies the transactions in the edits files one by one, and creates a new, merged fsimage file.

4. The secondary namenode sends the new fsimage file back to the primary namenode using HTTP PUT, and the namenode saves it as a temporary .ckpt file.

5. The namenode renames the temporary fsimage file to make it available.
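Steps 3 to 5 above can be sketched as a merge followed by a write-to-temporary-then-rename. This is a toy illustration under stated assumptions: the function `checkpoint` is hypothetical, the image is stored as JSON (real fsimage files use a binary format), and the file names only loosely imitate Hadoop's naming.

```python
import json
import os
import tempfile


def checkpoint(storage_dir, fsimage_paths, edits, last_txid):
    """Merge an old image with edits, write a .ckpt file, then rename it."""
    merged = set(fsimage_paths)
    for op, args in edits:          # step 3: apply edits one by one
        if op == "create":
            merged.add(args[0])
        elif op == "delete":
            merged.discard(args[0])
        else:
            raise ValueError(f"unknown operation: {op}")

    suffix = f"{last_txid:019d}"
    ckpt_path = os.path.join(storage_dir, f"fsimage.ckpt_{suffix}")
    final_path = os.path.join(storage_dir, f"fsimage_{suffix}")
    with open(ckpt_path, "w") as f:  # step 4: save as a temporary file
        json.dump(sorted(merged), f)
    os.replace(ckpt_path, final_path)  # step 5: rename into place
    return final_path


with tempfile.TemporaryDirectory() as d:
    edits = [("create", ("/b",)), ("delete", ("/a",))]
    path = checkpoint(d, ["/a"], edits, last_txid=7)
    with open(path) as f:
        print(os.path.basename(path), json.load(f))
```

Writing to a temporary name first means a reader never sees a half-written image; only the rename makes the new checkpoint visible.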

Datanode directory structure

Safe mode

When the namenode starts, it first loads the image file (fsimage) into memory and replays the edits in the edit log. Once it has reconstructed a consistent in-memory image of the file system metadata, it creates a new fsimage file and an empty edit log. During this process the namenode runs in safe mode, which means that the file system is read-only to clients.
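Safe mode can be queried and controlled with `hdfs dfsadmin -safemode`, which accepts the actions `get`, `enter`, `leave` and `wait`. The helper below is a small sketch that only builds the command line; the function name `safemode_command` is made up for this example, and actually running the command requires a Hadoop installation.

```python
import shlex

SAFEMODE_ACTIONS = ("get", "enter", "leave", "wait")


def safemode_command(action: str) -> list[str]:
    """Return the argv list for an `hdfs dfsadmin -safemode` call."""
    if action not in SAFEMODE_ACTIONS:
        raise ValueError(f"unsupported safemode action: {action}")
    return ["hdfs", "dfsadmin", "-safemode", action]


# Print the command instead of executing it, so this runs without Hadoop.
print(shlex.join(safemode_command("get")))
```

A script that must write to HDFS right after a restart could pass the `wait` variant to `subprocess.run` so that it blocks until the namenode has left safe mode.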

Audit logging

Tools

Fixing uneven block distribution across datanodes

dfsadmin

fsck

Datanode block scanner, which periodically verifies all the blocks stored on the datanode
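The idea behind the block scanner is to recompute checksums for stored blocks and compare them with the values recorded when the data was written. The sketch below shows that idea only: the block IDs and function names are hypothetical, and `zlib.crc32` stands in for whatever checksum the datanode actually uses.

```python
import zlib


def checksum_blocks(blocks: dict[str, bytes]) -> dict[str, int]:
    """Record a checksum per block at write time."""
    return {block_id: zlib.crc32(data) for block_id, data in blocks.items()}


def scan(blocks: dict[str, bytes], expected: dict[str, int]) -> list[str]:
    """Return the IDs of blocks whose current checksum no longer matches."""
    return sorted(
        block_id
        for block_id, data in blocks.items()
        if zlib.crc32(data) != expected[block_id]
    )


blocks = {"blk_1": b"hello", "blk_2": b"world"}
expected = checksum_blocks(blocks)

blocks["blk_2"] = b"w0rld"  # simulate on-disk corruption of one block
print(scan(blocks, expected))
```

In HDFS a block found to be corrupt is reported to the namenode so that it can be re-replicated from a healthy copy.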

Balancer
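The balancer moves blocks from over-utilized datanodes to under-utilized ones until each datanode's utilization is within a threshold percentage of the cluster's utilization (10% by default). The check it aims to satisfy can be sketched as follows; the function name and the node figures are invented for this example.

```python
def unbalanced_nodes(used, capacity, threshold_pct=10.0):
    """Return cluster utilization (%) and each node's deviation beyond the threshold."""
    cluster_util = 100.0 * sum(used.values()) / sum(capacity.values())
    deviations = {}
    for node in used:
        node_util = 100.0 * used[node] / capacity[node]
        diff = node_util - cluster_util
        if abs(diff) > threshold_pct:
            deviations[node] = round(diff, 1)
    return cluster_util, deviations


# Hypothetical cluster: three datanodes with 100 units of capacity each.
capacity = {"dn1": 100, "dn2": 100, "dn3": 100}
used = {"dn1": 90, "dn2": 50, "dn3": 40}

cluster_util, deviations = unbalanced_nodes(used, capacity)
print(cluster_util, deviations)
```

Here dn2 sits exactly on the threshold and is treated as balanced, while dn1 and dn3 would be a source and a target for block moves.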

Monitoring

The master daemons are the most important to monitor: the primary namenode, the secondary namenode and the resource manager

Logging

Metrics and JMX (Java Management Extensions)

The Hadoop daemons collect information about events and measurements, collectively referred to as "metrics"
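Hadoop daemons also expose their JMX beans as JSON over HTTP at the `/jmx` path of their web interface, with the beans listed under a top-level `beans` key. The sketch below extracts a few values from such a payload. The sample payload is hand-written for illustration, and the attribute names in it are assumptions; a real daemon returns many more beans and attributes.

```python
import json

# Hand-written sample in the shape of a /jmx response (illustrative values).
SAMPLE_JMX = """
{"beans": [
  {"name": "Hadoop:service=NameNode,name=FSNamesystem",
   "MissingBlocks": 0,
   "UnderReplicatedBlocks": 3,
   "CapacityUsed": 1024}
]}
"""


def read_metrics(payload: str, bean_name: str, keys: list[str]) -> dict:
    """Pick the requested attributes from the named bean, if present."""
    for bean in json.loads(payload)["beans"]:
        if bean.get("name") == bean_name:
            return {k: bean[k] for k in keys if k in bean}
    return {}


metrics = read_metrics(
    SAMPLE_JMX,
    "Hadoop:service=NameNode,name=FSNamesystem",
    ["MissingBlocks", "UnderReplicatedBlocks"],
)
print(metrics)
```

A monitoring job could fetch the payload with `urllib.request.urlopen` from the daemon's web address and alert when a value such as the missing-block count is non-zero.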

Maintenance

Metadata backup

Data backup

Prioritize which data to back up

distcp is an ideal backup tool.
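A typical backup run copies a directory tree to a second cluster with `hadoop distcp`, using `-update` to copy only changed files, `-delete` to remove files that no longer exist at the source, and `-p` to preserve file attributes. The helper below is a sketch that only assembles the command; the function name and the two cluster URIs are hypothetical.

```python
import shlex


def distcp_backup_command(src: str, dst: str) -> list[str]:
    """Build an incremental, mirroring distcp command line."""
    return ["hadoop", "distcp", "-update", "-delete", "-p", src, dst]


# Hypothetical source and backup clusters.
cmd = distcp_backup_command(
    "hdfs://namenode1/data", "hdfs://namenode2/backup/data"
)
print(shlex.join(cmd))
```

Because `-delete` removes files at the destination, it is worth double-checking the source and destination order before running the command for real.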

File system check (fsck)

File system balancer

Commissioning and decommissioning nodes

It is not secure to allow any machine to connect to the namenode and act as a datanode.

To add a new node:

1. Add the network address of the new node to the include file.

2. Update the namenode with the new set of permitted datanodes:

    hdfs dfsadmin -refreshNodes

3. Update the resource manager with the new set of permitted node managers:

    yarn rmadmin -refreshNodes

4. Add the new node to the slaves file, so that the Hadoop control scripts include it in future operations.

5. Start the new datanode and node manager.

6. Check that the new datanode and node manager appear in the web interface.
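The first steps of commissioning lend themselves to scripting. The sketch below assumes the include file is a plain text file with one host per line; the function name `add_to_include` and the host name are hypothetical. It adds a host only if it is not already listed, and lists the two refresh commands to run afterwards.

```python
import tempfile
from pathlib import Path

# Commands to run after the include file has changed.
REFRESH_COMMANDS = [
    ["hdfs", "dfsadmin", "-refreshNodes"],
    ["yarn", "rmadmin", "-refreshNodes"],
]


def add_to_include(include_file: Path, host: str) -> bool:
    """Append host to the include file; return False if already present."""
    hosts = include_file.read_text().split() if include_file.exists() else []
    if host in hosts:
        return False
    hosts.append(host)
    include_file.write_text("\n".join(hosts) + "\n")
    return True


with tempfile.TemporaryDirectory() as d:
    include = Path(d) / "include"
    print(add_to_include(include, "dn4.example.com"))  # newly added
    print(add_to_include(include, "dn4.example.com"))  # already listed
```

On a real cluster the path would be the include file named in the cluster configuration, and each entry of `REFRESH_COMMANDS` would be passed to `subprocess.run`.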

To remove (decommission) a node:

1. Add the network address of the node to be decommissioned to the exclude file. Do not update the include file at this point.

2. Update the namenode with the new set of permitted datanodes:

    hdfs dfsadmin -refreshNodes

3. Update the resource manager with the new set of permitted node managers:

    yarn rmadmin -refreshNodes

4. In the web interface, check that the admin state of the datanodes being removed has changed to "Decommission In Progress". At this stage these datanodes copy their blocks to other datanodes.

5. When all the datanodes report their state as "Decommissioned", all their blocks have been replicated. Shut down the decommissioned nodes.

6. Remove the nodes from the include file and run the following commands:

    hdfs dfsadmin -refreshNodes
    yarn rmadmin -refreshNodes
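How the include and exclude files combine can be summarized in a small decision function. This is a sketch of the commonly documented HDFS behavior, stated here as an assumption: a datanode must be in the include file to connect, an empty include file is treated as allowing every node, and a node listed in both files may connect and will be decommissioned. The function name and host names are hypothetical.

```python
def datanode_status(host: str, include: set, exclude: set) -> str:
    """Classify a datanode from the include and exclude lists."""
    # Assumption: an empty include list means all nodes are included.
    included = not include or host in include
    if not included:
        return "may not connect"
    if host in exclude:
        return "may connect and will be decommissioned"
    return "may connect"


include = {"dn1", "dn2"}
exclude = {"dn2", "dn3"}

for host in ("dn1", "dn2", "dn3"):
    print(host, "->", datanode_status(host, include, exclude))
```

This explains why the procedure leaves the include file untouched until the end: removing the node from it too early would cut the node off before its blocks have been copied elsewhere.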

Removing nodes

Commissioning

Remove the nodes from the slaves file

Upgrades

Whether an upgrade can be rolled back
