How does HDFS Namenode work

2025-02-24 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)05/31 Report--

This article explains how the HDFS NameNode works, why it is a single point of failure, and how Facebook's AvatarNode addresses that problem.

How does HDFS Namenode work?

HDFS clients perform file system metadata operations through a single server node called the NameNode. DataNodes, meanwhile, communicate with one another and replicate data blocks for redundancy, so that the failure of a single DataNode does not cause data loss in the cluster.

A NameNode failure, however, cannot be tolerated in the same way. The NameNode's main responsibility is to track how each file is split into blocks, which nodes store those blocks, and whether the distributed file system as a whole is healthy. If the NameNode stops running, the DataNodes can no longer communicate with it and clients can no longer read from or write to HDFS. In effect, the whole system stops working.
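The bookkeeping described above can be pictured with a small sketch. This is a toy model, not the real HDFS code: the class name ToyNameNode, the 4-byte block size, the replication factor of 2, and the round-robin placement are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ToyNameNode:
    """Toy in-memory model of NameNode metadata; not the real HDFS classes."""

    block_size: int = 4      # bytes per block, tiny on purpose (assumption)
    replication: int = 2     # replicas per block (assumption)
    file_to_blocks: dict = field(default_factory=dict)  # path -> [block ids]
    block_to_nodes: dict = field(default_factory=dict)  # block id -> set of DataNode names
    next_block_id: int = 0

    def add_file(self, path, length, datanodes):
        """Split `length` bytes into blocks and place replicas round-robin."""
        if self.replication > len(datanodes):
            raise ValueError("not enough DataNodes for the replication factor")
        n_blocks = max(1, -(-length // self.block_size))  # ceiling division
        blocks = []
        for _ in range(n_blocks):
            block_id = self.next_block_id
            self.next_block_id += 1
            replicas = {
                datanodes[(block_id + r) % len(datanodes)]
                for r in range(self.replication)
            }
            self.block_to_nodes[block_id] = replicas
            blocks.append(block_id)
        self.file_to_blocks[path] = blocks
        return blocks

    def locate(self, path):
        """Answer a client lookup: which DataNodes hold each block of `path`?"""
        return [(b, sorted(self.block_to_nodes[b])) for b in self.file_to_blocks[path]]

    def replicas_after_failure(self, failed_node):
        """Remaining replica locations per block if one DataNode is lost."""
        return {b: nodes - {failed_node} for b, nodes in self.block_to_nodes.items()}


namenode = ToyNameNode()
namenode.add_file("/logs/a", length=10, datanodes=["dn1", "dn2", "dn3"])
locations = namenode.locate("/logs/a")
survivors = namenode.replicas_after_failure("dn2")
```

In this sketch, losing dn2 still leaves at least one replica of every block, which mirrors the DataNode redundancy described earlier. Losing the ToyNameNode object itself, by contrast, loses the only record of which blocks make up which file.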

The HDFS Namenode is a single point of failure (SPOF)

Facebook is well aware of how serious the "NameNode as SPOF" problem is, and set out to build a system that removes it. Before looking at that system, let's look at the problems Facebook ran into while deploying and using HDFS.

HDFS in Facebook's data warehouse

HDFS clusters are deployed in Facebook's data warehouse, which runs traditional Hadoop MapReduce workloads: MapReduce batch jobs on a small number of very large clusters.

Because the clusters are so large, clients and the many DataNodes send huge volumes of metadata traffic to the NameNode, which places a very heavy load on it. Pressure on CPU, memory, disk, and network makes a heavily loaded NameNode a common sight in the data warehouse clusters. In practice, Facebook found that failures caused by HDFS accounted for 41% of all failures in its data warehouse.

The HDFS NameNode is an important part not only of HDFS but of the whole data warehouse. Although a highly available NameNode would prevent only 10% of the data warehouse's unplanned downtime, eliminating the NameNode as a SPOF is still a major win, because it allows Facebook to carry out planned hardware and software upgrades. In fact, Facebook estimates that 50% of the cluster's planned downtime could be eliminated if the NameNode problem were solved.

So what does a highly available NameNode look like, and how does it work? Its structure is as follows.

In this structure, a client can communicate with both the Primary NameNode and the Standby NameNode, as well as with the many DataNodes. The DataNodes, in turn, can send block reports to both the Primary NameNode and the Standby NameNode. In essence, AvatarNode, developed by Facebook, is a high-availability solution for the NameNode.
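A minimal sketch of that dual reporting follows. The class names ToyNameNodeView and ToyDataNode and the block ids are hypothetical, and the real block report protocol is considerably richer. The point is only that every report goes to both NameNodes, so the Standby's view of block locations matches the Primary's.

```python
class ToyNameNodeView:
    """Block-location table kept by one NameNode (Primary or Standby)."""

    def __init__(self, role):
        self.role = role
        self.block_locations = {}  # block id -> set of DataNode names

    def receive_block_report(self, datanode, block_ids):
        # A full report supersedes whatever this DataNode reported before.
        for block_id in list(self.block_locations):
            self.block_locations[block_id].discard(datanode)
            if not self.block_locations[block_id]:
                del self.block_locations[block_id]
        for block_id in block_ids:
            self.block_locations.setdefault(block_id, set()).add(datanode)


class ToyDataNode:
    def __init__(self, name, block_ids, namenodes):
        self.name = name
        self.block_ids = set(block_ids)
        self.namenodes = namenodes  # both the Primary and the Standby

    def send_block_report(self):
        # The same report goes to every NameNode, so the Standby stays current.
        for namenode in self.namenodes:
            namenode.receive_block_report(self.name, sorted(self.block_ids))


primary = ToyNameNodeView("primary")
standby = ToyNameNodeView("standby")
datanodes = [
    ToyDataNode("dn1", [0, 2], [primary, standby]),
    ToyDataNode("dn2", [0, 1], [primary, standby]),
]
for dn in datanodes:
    dn.send_block_report()
```

Because the Standby already knows where every block lives, it does not have to wait for a fresh round of block reports before it can take over, which is what makes a hot failover plausible in this design.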

AvatarNode: a NameNode failover solution

To address the design flaw of having a single NameNode, Facebook began working on AvatarNode internally about two years ago.

AvatarNode provides a highly available NameNode with hot failover and failback, and Facebook has contributed it to the open source community. After extensive testing and bug fixes, AvatarNode now runs stably in Facebook's Hadoop data warehouse, thanks in large part to Facebook engineer Dmytro Molkov.

In the event of a failure, AvatarNode's two highly available NameNode nodes can be failed over manually. AvatarNode wraps the existing NameNode code in a ZooKeeper layer.

The basic concepts of AvatarNode are as follows:

There is a Primary NameNode and a Standby NameNode.

The hostname of the current primary is stored in ZooKeeper.

A modified DataNode sends block reports to both the Primary NameNode and the Standby NameNode.

A modified HDFS client checks ZooKeeper before starting each transaction and checks it again if a transaction fails partway through. As a result, a write can still complete even if an AvatarNode failover occurs in the middle of it.
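The client-side part of these concepts can be sketched as follows. This is an illustration under stated assumptions, not AvatarNode's actual code: ToyRegistry is an in-memory stand-in for the ZooKeeper entry (no real ZooKeeper client is used), and the hostnames nn-a and nn-b, the class names, and the manual_failover helper are all hypothetical.

```python
class ToyRegistry:
    """In-memory stand-in for the ZooKeeper entry that names the current primary."""

    def __init__(self, primary_host):
        self.primary_host = primary_host


class ToyNameNodeServer:
    def __init__(self, hostname):
        self.hostname = hostname
        self.alive = True
        self.applied = []  # operations this node has accepted

    def apply(self, operation):
        if not self.alive:
            raise ConnectionError(f"{self.hostname} is not responding")
        self.applied.append(operation)


class ToyClient:
    def __init__(self, registry, servers):
        self.registry = registry
        self.servers = servers  # hostname -> ToyNameNodeServer

    def run_transaction(self, operation, max_attempts=2):
        last_error = None
        for _ in range(max_attempts):
            # Look up the primary before every attempt, never cache it.
            host = self.registry.primary_host
            try:
                self.servers[host].apply(operation)
                return host
            except ConnectionError as error:
                last_error = error  # the next attempt re-reads the registry
        raise last_error


def manual_failover(registry, standby_host):
    """Operator action: point the registry at the standby."""
    registry.primary_host = standby_host


servers = {"nn-a": ToyNameNodeServer("nn-a"), "nn-b": ToyNameNodeServer("nn-b")}
registry = ToyRegistry("nn-a")
client = ToyClient(registry, servers)

first = client.run_transaction("mkdir /logs")
servers["nn-a"].alive = False
manual_failover(registry, "nn-b")
second = client.run_transaction("create /logs/a")
```

Here the first transaction is served by nn-a and the second by nn-b, without the client being reconfigured. The toy does not model how the Standby catches up on operations the Primary accepted earlier, and it retries a failed operation from scratch instead of resuming a partial write.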

Some readers may be curious about the name. Facebook Hadoop engineer Dhruba Borthakur came up with it around the time James Cameron's film "Avatar" was released. Had it been 1998, we might be calling it TitanicNode.

AvatarNode has held up under the most demanding workloads inside Facebook, and Facebook plans to keep improving both AvatarNode's reliability and the management of its HDFS clusters. Integration with a general high-availability framework should also make unattended, automated, and safe failover possible.

Facebook has published its own version of Hadoop, together with AvatarNode, on GitHub, where interested readers can download and study them. Facebook is not alone in trying to address this shortcoming of Hadoop: products from MapR and Cloudera offer similar capabilities.
