How to configure namenode ha in hadoop2.0

2025-01-17 Update From: SLTechnology News&Howtos


Shulou(Shulou.com)05/31 Report--

This article explains in detail how to configure NameNode HA in Hadoop 2.0. The editor finds it very practical and shares it here for reference; I hope you get something out of it.

The main problem of HA in HDFS is synchronizing metadata between the active and standby NameNodes. Earlier solutions include AvatarNode and others. The shared storage can be NFS, BookKeeper, or similar; here we use JournalNodes for shared storage, mainly because the configuration is simple.

Virtual machine preparation: three machines, listed below:

Machine name  Function                                      IP
Master1       NameNode (active), JournalNode, ZooKeeper     192.168.6.171
Master2       NameNode (standby), JournalNode, ZooKeeper    192.168.6.172
Datanode1     DataNode, JournalNode, ZooKeeper              192.168.6.173

Software versions: Hadoop 2.4.1, ZooKeeper 3.4.6

After downloading, extract both Hadoop 2.4.1 and ZooKeeper.

The first step is to configure the ZooKeeper cluster.

In the conf directory under the extracted ZooKeeper folder, rename zoo_sample.cfg to zoo.cfg.

Modify the configuration:

dataDir=/cw/zookeeper/

(I use /cw/zookeeper/ here; make sure the directory exists.)

Add the cluster configuration at the end of the file:

server.1=192.168.6.171:2888:3888
server.2=192.168.6.172:2888:3888
server.3=192.168.6.173:2888:3888

Distribute the modified ZooKeeper folder to the other two machines:

scp -r zookeeper-3.4.6 root@192.168.6.172:/cw/
scp -r zookeeper-3.4.6 root@192.168.6.173:/cw/

Configure the myid file for each machine (it must match the server.N ids above).

On the 192.168.6.171 machine, execute:

echo "1" > /cw/zookeeper/myid

On the 192.168.6.172 machine, execute:

echo "2" > /cw/zookeeper/myid

On the 192.168.6.173 machine, execute:

echo "3" > /cw/zookeeper/myid
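The zoo.cfg edits and per-machine myid setup above can be sketched as one small script. BASE is an illustrative stand-in for /cw/zookeeper so the sketch can run anywhere; on the real machines you would write to that path directly:

```shell
# Sketch of the ZooKeeper config steps above, run against a temp dir.
# BASE stands in for /cw/zookeeper; "1" is the id for the 192.168.6.171 box.
BASE="$(mktemp -d)"

# dataDir plus the three-server cluster section of zoo.cfg:
cat > "$BASE/zoo.cfg" <<EOF
dataDir=$BASE/
server.1=192.168.6.171:2888:3888
server.2=192.168.6.172:2888:3888
server.3=192.168.6.173:2888:3888
EOF

# myid must match this machine's server.N entry.
echo "1" > "$BASE/myid"

grep -c '^server\.' "$BASE/zoo.cfg"   # prints 3
```

On the other two machines only the myid value changes (2 and 3 respectively).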

Start ZooKeeper; on each machine execute:

./zkServer.sh start

After all have started, check the logs to confirm startup succeeded, or execute ./zkServer.sh status to see the state of each node.

---- Hadoop begins ---- Configure the Hadoop-related parameters.

hadoop-env.sh mainly configures the JAVA_HOME path.

The configuration of core-site.xml is as follows:

<property>
  <name>fs.defaultFS</name>
  <value>hdfs://myhadoop</value>
  <description>myhadoop is the id of the nameservice</description>
</property>
<property>
  <name>io.file.buffer.size</name>
  <value>131072</value>
</property>
<property>
  <name>ha.zookeeper.quorum</name>
  <value>192.168.6.171,192.168.6.172,192.168.6.173</value>
</property>
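The nameservice id inside fs.defaultFS must match dfs.nameservices in hdfs-site.xml below, or clients cannot resolve the HA pair. A quick local sanity check, sketched against minimal stand-in fragments (the temp files and sed patterns are illustrative, not part of the real cluster setup):

```shell
# Sketch: confirm the nameservice in fs.defaultFS matches dfs.nameservices.
DIR="$(mktemp -d)"

printf '<name>fs.defaultFS</name><value>hdfs://myhadoop</value>\n' > "$DIR/core-site.frag"
printf '<name>dfs.nameservices</name><value>myhadoop</value>\n'    > "$DIR/hdfs-site.frag"

# Extract the id after hdfs:// and the value of dfs.nameservices.
ns_core="$(sed -n 's/.*hdfs:\/\/\([^<]*\)<.*/\1/p' "$DIR/core-site.frag")"
ns_hdfs="$(sed -n 's/.*<value>\([^<]*\)<.*/\1/p' "$DIR/hdfs-site.frag")"

[ "$ns_core" = "$ns_hdfs" ] && echo "nameservice ids match: $ns_core"
```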

Modify hdfs-site.xml:

<property>
  <name>dfs.nameservices</name>
  <value>myhadoop</value>
  <description>
    Comma-separated list of nameservices. Corresponds to the previous
    namespace id; must be the same as fs.defaultFS in core-site.xml.
  </description>
</property>
<property>
  <name>dfs.ha.namenodes.myhadoop</name>
  <value>nn1,nn2</value>
  <description>
    The prefix for a given nameservice contains a comma-separated list of
    namenodes for that nameservice (e.g. EXAMPLENAMESERVICE). nn1 and nn2
    are the ids of the two NameNodes.
  </description>
</property>
<property>
  <name>dfs.namenode.rpc-address.myhadoop.nn1</name>
  <value>192.168.6.171:8020</value>
  <description>RPC address for namenode1</description>
</property>
<property>
  <name>dfs.namenode.rpc-address.myhadoop.nn2</name>
  <value>192.168.6.172:8020</value>
  <description>RPC address for namenode2</description>
</property>
<property>
  <name>dfs.namenode.http-address.myhadoop.nn1</name>
  <value>192.168.6.171:50070</value>
  <description>The address and base port where the namenode1 web UI listens.</description>
</property>
<property>
  <name>dfs.namenode.http-address.myhadoop.nn2</name>
  <value>192.168.6.172:50070</value>
  <description>The address and base port where the namenode2 web UI listens.</description>
</property>
<property>
  <name>dfs.namenode.servicerpc-address.myhadoop.nn1</name>
  <value>192.168.6.171:53310</value>
</property>
<property>
  <name>dfs.namenode.servicerpc-address.myhadoop.nn2</name>
  <value>192.168.6.172:53310</value>
</property>

The next part is the corresponding file storage directory configuration:

<property>
  <name>dfs.namenode.name.dir</name>
  <value>file:///cw/hadoop/name</value>
  <description>
    Determines where on the local filesystem the DFS name node should store
    the name table (fsimage). If this is a comma-delimited list of
    directories, the name table is replicated in all of them for redundancy.
  </description>
</property>
<property>
  <name>dfs.namenode.shared.edits.dir</name>
  <value>qjournal://192.168.6.171:8485;192.168.6.172:8485;192.168.6.173:8485/hadoop-journal</value>
  <description>
    A directory on shared storage between the multiple namenodes in an HA
    cluster. This directory is written by the active and read by the standby
    in order to keep the namespaces synchronized. It does not need to be
    listed in dfs.namenode.edits.dir, and should be left empty in a non-HA
    cluster.
  </description>
</property>
<property>
  <name>dfs.datanode.data.dir</name>
  <value>file:///cw/hadoop/data</value>
  <description>
    Determines where on the local filesystem a DFS data node should store its
    blocks. If this is a comma-delimited list of directories, data is stored
    in all named directories, typically on different devices. Directories
    that do not exist are ignored.
  </description>
</property>
<property>
  <name>dfs.ha.automatic-failover.enabled</name>
  <value>true</value>
  <description>
    Whether automatic failover is enabled. See the HDFS High Availability
    documentation for details on automatic HA configuration.
  </description>
</property>
<property>
  <name>dfs.journalnode.edits.dir</name>
  <value>/cw/hadoop/journal/</value>
</property>
<property>
  <name>dfs.client.failover.proxy.provider.myhadoop</name>
  <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
<property>
  <name>dfs.ha.fencing.methods</name>
  <value>sshfence</value>
</property>
<property>
  <name>dfs.ha.fencing.ssh.private-key-files</name>
  <value>/home/yarn/.ssh/id_rsa</value>
  <description>The location of the stored SSH key.</description>
</property>
<property>
  <name>dfs.ha.fencing.ssh.connect-timeout</name>
  <value>1000</value>
</property>
<property>
  <name>dfs.namenode.handler.count</name>
  <value>8</value>
</property>

The directories referenced above must be created manually; if they do not exist, exceptions will occur.

After the configuration is complete, distribute the configured Hadoop to all cluster nodes, and create the required directories on each node.
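Creating the required directories can be sketched as a short loop. BASE is an illustrative prefix so the sketch runs anywhere; on the real nodes the paths are /cw/hadoop/... and /cw/zookeeper as configured above:

```shell
# Sketch: create the local directories the configuration above refers to.
# BASE stands in for the filesystem root on each node.
BASE="$(mktemp -d)"

for d in cw/hadoop/name cw/hadoop/data cw/hadoop/journal cw/zookeeper; do
    mkdir -p "$BASE/$d"
done

ls "$BASE/cw/hadoop"
```

Run on every node (e.g. via ssh in a loop over the three IPs), since dfs.namenode.name.dir, dfs.datanode.data.dir, and dfs.journalnode.edits.dir are all local paths.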

Start by formatting the ZK node, executing:

./hdfs zkfc -formatZK

After that completes, start the ZKFailoverController to monitor the state of the active and standby nodes. It is usually started on the active/standby NameNode nodes:

./hadoop-daemon.sh start zkfc

The next step is to start the shared storage system, the JournalNodes. On each JN node run:

./hadoop-daemon.sh start journalnode

Next, on the primary NN, execute ./hdfs namenode -format to format the file system.

After that, start the primary NN:

./hadoop-daemon.sh start namenode

On the standby NN node, first synchronize the NN metadata by executing:

./hdfs namenode -bootstrapStandby

After the synchronization is complete, start the standby NN:

./hadoop-daemon.sh start namenode

Since ZK automatically elects one node as active, there is no need to set it manually here. If you do want to set the active/standby NN manually, you can execute:

./hdfs haadmin -transitionToActive nn1

Start all the DataNodes.

Open 192.168.6.171:50070 and 192.168.6.172:50070 respectively.

You can execute the relevant hdfs shell commands to verify that the cluster is working properly.

Next, kill the NN process on the active node:

kill -9 135415

(135415 is the NameNode pid on my machine.) You can see that the failover succeeds.

This is the end of the article on "how to configure namenode ha in hadoop2.0". I hope the content above is of some help to you. If you think the article is good, please share it so more people can see it.
