
What is the solution to the NameNode single point of failure in Hadoop, and what is the principle of AvatarNode?


This article introduces solutions to the NameNode single point of failure in Hadoop and explains the principle of AvatarNode in detail. I hope it will be a useful reference.

As we all know, the NameNode is a single point of failure in a Hadoop system, which has long been a weakness for a platform that prides itself on high availability. Below are several solutions that address this problem.

1. Secondary NameNode

Principle: the Secondary NN periodically reads the editlog from the NN and merges it with its own stored fsimage to form a new metadata image.

Advantages: built into earlier versions of Hadoop; easy to configure; requires essentially no extra resources (it can share a machine with a DataNode).

Disadvantages: slow recovery, and some data is lost.

2. Backup NameNode

Principle: the Backup NN receives the editlog in real time. When the NN goes down, you switch over to the Backup NN manually.

Advantages: provided starting from Hadoop 0.21; no data loss.

Disadvantages: because it must obtain block location information from the DataNodes, switching to the Backup NN is slow (depending on the amount of data).

3. Avatar NameNode

Principle: an HA scheme contributed by Facebook. The editlog produced by clients accessing the hadoop cluster is placed on NFS, so the Standby NN can read it; the DataNodes report blocks to both the Active NN and the Standby NN in real time.

Advantages: no information is lost; fast recovery (seconds).

Cons: Facebook's code is based on Hadoop 0.20 and is slightly cumbersome to deploy; additional machine resources are required, and NFS becomes another single point (although its failure rate is low).

4. Hadoop 2.0 directly supports a Standby NN; it draws on Facebook's Avatar design and makes some improvements.

Advantages: no information is lost; fast recovery (seconds); easy to deploy.
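
For comparison, here is a minimal sketch of what the built-in Hadoop 2.x HA configuration looks like in hdfs-site.xml. The nameservice name mycluster, the host names, and the NFS-mounted shared edits directory are placeholder assumptions, not values from the original deployment:

<property>
  <name>dfs.nameservices</name>
  <value>mycluster</value>
</property>
<property>
  <name>dfs.ha.namenodes.mycluster</name>
  <value>nn1,nn2</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.mycluster.nn1</name>
  <value>nn1-host:8020</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.mycluster.nn2</name>
  <value>nn2-host:8020</value>
</property>
<property>
  <!-- shared storage for the editlog (for example an NFS mount), echoing the Avatar design -->
  <name>dfs.namenode.shared.edits.dir</name>
  <value>file:///mnt/nfs/ha-edits</value>
</property>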

---

What follows is a detailed introduction to AvatarNode, one of the solutions to the Hadoop NameNode single point problem.

Requirement:

Back up the NameNode metadata so that the cluster does not become unavailable when the single NameNode goes down.

Scenario description:

When the NameNode server goes down, we can use the metadata backed up from the NameNode to quickly rebuild a new NameNode and put it into service.

1. Hadoop itself provides a solution: recover the NameNode metadata from the SecondaryNameNode's backup. However, because of how checkpointing works (the SecondaryNameNode merges and synchronizes the NameNode data only at each checkpoint), the SecondaryNameNode's backup cannot stay synchronized with the NameNode at all times; when the NameNode goes down, the data written since the last checkpoint may be lost, depending on the checkpoint interval. We can shorten the checkpoint interval to reduce the amount of lost data, but each checkpoint consumes a lot of performance, and the scheme still cannot fundamentally solve the data-loss problem. So if this kind of data loss is not acceptable for your requirements, this scheme can be ruled out directly (see the configuration sketch after scheme 2).

2. The other solution Hadoop provides is NFS, which backs up the NameNode metadata instantly. It configures multiple name directories (including an NFS directory) so that the NameNode writes its persistent metadata to all of them at the same time. Compared with the first scheme, the advantage is that it avoids data loss (we will not discuss the possibility of NFS itself losing data here; the probability is very small). Since it solves the data-loss problem, this scheme is feasible in principle.
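
To make schemes 1 and 2 concrete, here are hedged configuration sketches of the two knobs they rely on; the 600-second checkpoint period and the /mnt/nfs/hdfs/name mount point are illustrative assumptions:

<!-- scheme 1: shorten the checkpoint interval (default 3600 seconds) to bound the data loss -->
<property>
  <name>fs.checkpoint.period</name>
  <value>600</value>
</property>

<!-- scheme 2: write the name table to several directories at once, one of them on NFS -->
<property>
  <name>dfs.name.dir</name>
  <value>/data/hadoop/hdfs/name,/mnt/nfs/hdfs/name</value>
</property>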

Download the source code

https://github.com/facebook/hadoop-20

Deployment environment

4 machines

Hadoop1 - 192.168.64.41 AvatarNode (Primary)

Hadoop2 - 192.168.64.42 AvatarDataNode

Hadoop3 - 192.168.64.43 AvatarDataNode

Hadoop4 - 192.168.64.67 AvatarNode (Standby)
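
The commands later in this article assume the nodes can reach each other by host name; a hedged /etc/hosts mapping derived from the inventory above would be:

192.168.64.41 hadoop1
192.168.64.42 hadoop2
192.168.64.43 hadoop3
192.168.64.67 hadoop4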

Related resources and description

The following is a brief introduction to deploying the Avatar scheme.

1. First, note that the Avatar scheme only backs up the single point of DFS; it does not cover MapRed, because Hadoop itself has no mechanism for handling a JobTracker single point of failure.

2. AvatarNode inherits from NameNode rather than modifying it, and the same holds for AvatarDataNode. The Avatar startup mechanism is therefore independent of Hadoop's own.

3. In the Avatar scheme, the SecondaryNamenode's responsibilities are folded into the Standby node, so there is no need to start a separate SecondaryNamenode.

4. AvatarNode requires NFS support in order to share the transaction log (editlog) between the two nodes.

5. For now, the Avatar source code provided by FB cannot switch automatically between Primary and Standby; automatic switching can be implemented with the help of Zookeeper's lease mechanism.

6. Switching between Primary and Standby only covers promoting Standby to Primary; switching from the Primary state to the Standby state is not supported.

7. AvatarDataNode does not use a VIP to communicate with AvatarNode; it communicates directly with the Primary and the Standby. A VIP-drift scheme is therefore needed to hide the IP change during a switch between the two nodes. As for integration with Zookeeper, the developers say it will be released in a later version.

For a more detailed introduction to AvatarNode, please refer to http://blog.csdn.net/rzhzhz/article/details/7235789

III. Compilation

1. First, modify build.xml in the hadoop root directory, commenting out lines 996 and 1000.

2. Run ant jar in the root directory to compile hadoop (see the build.xml code for the packaging targets). The compiled jar (hadoop-0.20.3-dev-core.jar) will be under the build directory; copy it to the hadoop root directory to replace the original jar. (One note: hadoop loads classes under the build directory first at startup, so if you patch the jar by replacing classes, temporarily remove the build directory.)

3. Go to the src/contrib/highavailability directory and compile Avatar; copy the compiled jar (hadoop-${version}-highavailability.jar) from the build/contrib/highavailability directory to the lib directory.

4. Distribute the jars compiled in steps 2 and 3 to the corresponding directories on all machines in the cluster; a sketch of the whole flow follows.
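
Putting steps 1 through 4 together, here is a hedged sketch of the build-and-distribute flow. The install path /opt/hadoop-20, the use of scp, and the contrib build invocation are assumptions; host names follow the deployment list above:

cd /opt/hadoop-20
ant jar                                    # step 2: compile the core jar into build/
cp build/hadoop-0.20.3-dev-core.jar .      # replace the original core jar
(cd src/contrib/highavailability && ant)   # step 3: compile the Avatar contrib jar
cp build/contrib/highavailability/hadoop-*-highavailability.jar lib/
rm -rf build                               # hadoop loads classes from build/ first
# step 4: push both jars to every other node
for h in hadoop2 hadoop3 hadoop4; do
  scp hadoop-0.20.3-dev-core.jar $h:/opt/hadoop-20/
  scp lib/hadoop-*-highavailability.jar $h:/opt/hadoop-20/lib/
done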

IV. Configuration

1. Configure hdfs-site.xml

<property>
  <name>dfs.name.dir</name>
  <value>/data/hadoop/hdfs/name</value>
  <description>Determines where on the local filesystem the DFS name node should store the name table. If this is a comma-delimited list of directories then the name table is replicated in all of the directories, for redundancy.</description>
</property>
<property>
  <name>dfs.data.dir</name>
  <value>/data/hadoop/facebook_hadoop_data/hdfs/data</value>
</property>
<property>
  <name>dfs.datanode.address</name>
  <value>0.0.0.0:50011</value>
  <description>Default is 50010; the listening port of the DataNode.</description>
</property>
<property>
  <name>dfs.datanode.http.address</name>
  <value>0.0.0.0:50076</value>
  <description>Default is 50075; the HTTP server port of the DataNode.</description>
</property>
<property>
  <name>dfs.datanode.ipc.address</name>
  <value>0.0.0.0:50021</value>
  <description>Default is 50020; the IPC server port of the DataNode.</description>
</property>
<property>
  <name>dfs.http.address0</name>
  <value>192.168.64.41:50070</value>
</property>
<property>
  <name>dfs.http.address1</name>
  <value>192.168.64.67:50070</value>
</property>
<property>
  <name>dfs.name.dir.shared0</name>
  <value>/data/hadoop/share/shared0</value>
</property>
<property>
  <name>dfs.name.dir.shared1</name>
  <value>/data/hadoop/share/shared1</value>
</property>
<property>
  <name>dfs.name.edits.dir.shared0</name>
  <value>/data/hadoop/share/shared0</value>
</property>
<property>
  <name>dfs.name.edits.dir.shared1</name>
  <value>/data/hadoop/share/shared1</value>
</property>
<property>
  <name>dfs.replication</name>
  <value>2</value>
  <description>Default block replication. The actual number of replications can be specified when the file is created. The default is used if replication is not specified at create time.</description>
</property>

Parameter description:

1) dfs.name.dir.shared0

Metadata storage directory for AvatarNode (Primary); note that it must not be the same directory as dfs.name.dir.

2) dfs.name.dir.shared1

Metadata storage directory for AvatarNode (Standby); note that it must not be the same directory as dfs.name.dir.

3) dfs.name.edits.dir.shared0

Edits file storage directory for AvatarNode (Primary); the same as dfs.name.dir.shared0 by default.

4) dfs.name.edits.dir.shared1

Edits file storage directory for AvatarNode (Standby); the same as dfs.name.dir.shared1 by default.

5) dfs.http.address0

HTTP monitoring address of AvatarNode (Primary).

6) dfs.http.address1

HTTP monitoring address of AvatarNode (Standby).

7) dfs.namenode.dn-address0 / dfs.namenode.dn-address1

These appear in the Avatar source code but are not used for now.

2. Configure core-site.xml

<property>
  <name>hadoop.tmp.dir</name>
  <value>/home/hadoop/tmp</value>
  <description>A base for other temporary directories.</description>
</property>
<property>
  <name>fs.default.name</name>
  <value>hdfs://192.168.64.41:9600</value>
  <description>The name of the default file system. Either the literal string "local" or a host:port for DFS.</description>
</property>
<property>
  <name>fs.default.name0</name>
  <value>hdfs://192.168.64.41:9600</value>
  <description>The name of the default file system. Either the literal string "local" or a host:port for DFS.</description>
</property>
<property>
  <name>fs.default.name1</name>
  <value>hdfs://192.168.64.67:9600</value>
  <description>The name of the default file system. Either the literal string "local" or a host:port for DFS.</description>
</property>

Parameter description:

1) fs.default.name

The IP address and port of the current AvatarNode; that is, Primary and Standby each configure their own address here.

2) fs.default.name0

IP address and port of AvatarNode (Primary).

3) fs.default.name1

IP address and port of AvatarNode (Standby).

3. Since MapRed is not involved, mapred-site.xml needs no changes; keep the original cluster configuration.

4. Distribute the modified configuration files to the cluster nodes, and create the directories referenced in the configuration on the Primary and Standby nodes.

5. Set up NFS to share the shared0 directory between Primary and Standby (a command sketch follows after step 6). For NFS configuration, refer to http://blog.csdn.net/rzhzhz/article/details/7056732

6. Format the Primary and Standby. You can use either hadoop's own format command or AvatarNode's (bin/hadoop org.apache.hadoop.hdfs.AvatarShell -format), but the latter requires that the shared1 directory not be empty at that point, which is a bit cumbersome. The recommendation is to format on the Primary with hadoop's own format command and copy the files in the name directory to the shared0 directory; then, on the Standby, copy the files from the shared0 directory to the shared1 directory.
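
Here is a hedged sketch of steps 5 and 6 as shell commands; the NFS export options and the plain hadoop format command are assumptions, with paths taken from the hdfs-site.xml above:

# step 5, on the Primary (192.168.64.41): export shared0 over NFS
echo '/data/hadoop/share/shared0 192.168.64.67(rw,sync,no_root_squash)' >> /etc/exports
exportfs -ra
# step 5, on the Standby (192.168.64.67): mount the exported directory
mkdir -p /data/hadoop/share/shared0
mount -t nfs 192.168.64.41:/data/hadoop/share/shared0 /data/hadoop/share/shared0

# step 6, on the Primary: format with hadoop's own command, then seed shared0
bin/hadoop namenode -format
cp -r /data/hadoop/hdfs/name/* /data/hadoop/share/shared0/
# step 6, on the Standby: seed shared1 from shared0 (visible through the NFS mount)
cp -r /data/hadoop/share/shared0/* /data/hadoop/share/shared1/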

V. Startup

1. Since the JobTracker single point is not involved, we only start the HDFS-related processes here: two NameNodes, Primary and Standby (where Standby also takes on the SecondaryNamenode responsibilities), and three AvatarDataNode data nodes.

2. Start AvatarNode (Primary) from the hadoop root directory on the Primary node

bin/hadoop org.apache.hadoop.hdfs.server.namenode.AvatarNode -zero

3. Start AvatarNode (Standby) from the hadoop root directory on the Standby node

bin/hadoop org.apache.hadoop.hdfs.server.namenode.AvatarNode -one -standby

4. Start AvatarDataNode from the hadoop root directory on each data node

bin/hadoop org.apache.hadoop.hdfs.server.datanode.AvatarDataNode

5. Other related commands

bin/hadoop org.apache.hadoop.hdfs.server.namenode.AvatarNode accepts the following optional parameters:

[-standby] | [-sync] | [-zero] | [-one] | [-format] | [-upgrade] | [-rollback] | [-finalize] | [-importCheckpoint]

## View the status of the current AvatarNode

1) bin/hadoop org.apache.hadoop.hdfs.AvatarShell -showAvatar

## Promote the current Standby node to Primary (or set a node's state)

2) bin/hadoop org.apache.hadoop.hdfs.AvatarShell -setAvatar primary

3) bin/hadoop org.apache.hadoop.hdfs.AvatarShell -setAvatar standby

VI. Cluster testing

1. Access the web pages of the cluster

(Primary) http://hadoop1-virtual-machine:50070

(Standby) http://hadoop5-virtual-machine:50070

You can see that all AvatarDataNodes are registered with both NameNodes; the Primary is in the normal state, while the Standby stays in Safemode, which allows reads but no writes. You can check whether the current AvatarNode is Primary or Standby with the AvatarShell command.

2. Store data in the cluster; the cluster works normally.

3. Kill the AvatarNode process on the Primary node and promote the current Standby to Primary. No data is lost and the cluster keeps working (at this point the web UI cannot access the file system properly, but you can inspect the cluster data through the shell commands). However, because of Avatar's conversion limitation, only Standby-to-Primary conversion is possible; after a failure, the node promoted from Standby to Primary cannot be demoted back to Standby, so free switching in the style of Master/Slave is not possible.
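
A hedged sketch of this failover test; the jps-based process lookup is an assumption, and the setAvatar command follows the shell reference in section V:

# on the failed Primary: kill the AvatarNode process
kill -9 $(jps | awk '/AvatarNode/ {print $1}')
# on the Standby: promote it to Primary, then verify its state
bin/hadoop org.apache.hadoop.hdfs.AvatarShell -setAvatar primary
bin/hadoop org.apache.hadoop.hdfs.AvatarShell -showAvatar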

That is all on the solutions to the Hadoop NameNode single point of failure and the principle of AvatarNode. I hope the above content is helpful.
