2025-04-06 Update From: SLTechnology News&Howtos shulou
Shulou(Shulou.com)06/01 Report--
This article shows how to configure an HDFS high-availability (HA) environment in the Hadoop framework. The steps are organized in order and should be easy to follow; I hope it helps resolve your doubts about HDFS high availability.
I. High availability of HDFS

1. Basic description
With HDFS high availability, the cluster can continue to provide service when a single node (or a small number of nodes) fails. The HA mechanism eliminates the NameNode single point of failure by configuring two NameNodes in an Active/Standby pair, giving the cluster a hot backup for the NameNode. If the Active node fails, the NameNode role can be switched quickly to the other node.
2. Detailed explanation of the mechanism.
High availability is based on two NameNodes and depends on a shared edits directory and a ZooKeeper cluster:
Each NameNode node runs a ZKFailoverController (ZKFC) process, which is responsible for monitoring the state of its NameNode.
Each NameNode maintains a persistent session with the ZooKeeper cluster.
If the Active node fails, ZooKeeper notifies the NameNode in the Standby state.
After the ZKFC process detects and confirms that the failed node is no longer working,
it notifies the Standby NameNode to switch to the Active state and continue providing service.
ZooKeeper is very important in big-data systems: it coordinates the work of different components and maintains and transfers state. For example, the automatic failover described here depends on the ZooKeeper component.
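The failover decision above can be sketched in shell terms. Everything here is illustrative: nn_healthy is a stub standing in for the ZKFC health probe (in a real cluster, ZKFC talks to the NameNode over RPC), and the host names just mirror the example cluster.

```shell
# Illustrative stub: pretend the Active NameNode (hop01) has failed
# and only hop02 still responds to health checks.
nn_healthy() {
    [ "$1" = "hop02" ]
}

active="hop01"; standby="hop02"
if ! nn_healthy "$active"; then
    # ZKFC confirms the failure, fences the old Active, then promotes Standby
    active="$standby"
fi
echo "current Active NameNode: $active"
```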
II. HDFS high availability configuration

1. Overall service list

Host     HDFS file    YARN scheduling    Single service       Shared file    ZK cluster
hop01    DataNode     NodeManager        NameNode             JournalNode    ZK-hop01
hop02    DataNode     NodeManager        ResourceManager      JournalNode    ZK-hop02
hop03    DataNode     NodeManager        SecondaryNameNode    JournalNode    ZK-hop03

2. Configure JournalNode
Create a directory
[root@hop01 opt]# mkdir hopHA
Copy the Hadoop directory
cp -r /opt/hadoop2.7/ /opt/hopHA/
Configure core-site.xml
<property>
    <name>fs.defaultFS</name>
    <value>hdfs://mycluster</value>
</property>
<property>
    <name>hadoop.tmp.dir</name>
    <value>/opt/hopHA/hadoop2.7/data/tmp</value>
</property>
Configure hdfs-site.xml and add the following
<property>
    <name>dfs.nameservices</name>
    <value>mycluster</value>
</property>
<property>
    <name>dfs.ha.namenodes.mycluster</name>
    <value>nn1,nn2</value>
</property>
<property>
    <name>dfs.namenode.rpc-address.mycluster.nn1</name>
    <value>hop01:9000</value>
</property>
<property>
    <name>dfs.namenode.rpc-address.mycluster.nn2</name>
    <value>hop02:9000</value>
</property>
<property>
    <name>dfs.namenode.http-address.mycluster.nn1</name>
    <value>hop01:50070</value>
</property>
<property>
    <name>dfs.namenode.http-address.mycluster.nn2</name>
    <value>hop02:50070</value>
</property>
<property>
    <name>dfs.namenode.shared.edits.dir</name>
    <value>qjournal://hop01:8485;hop02:8485;hop03:8485/mycluster</value>
</property>
<property>
    <name>dfs.ha.fencing.methods</name>
    <value>sshfence</value>
</property>
<property>
    <name>dfs.ha.fencing.ssh.private-key-files</name>
    <value>/root/.ssh/id_rsa</value>
</property>
<property>
    <name>dfs.journalnode.edits.dir</name>
    <value>/opt/hopHA/hadoop2.7/data/jn</value>
</property>
<property>
    <name>dfs.permissions.enabled</name>
    <value>false</value>
</property>
<property>
    <name>dfs.client.failover.proxy.provider.mycluster</name>
    <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
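As a quick sanity check after editing a config file, a property value can be read back with plain sed. This is only a sketch: the temporary file path and the minimal fragment below are for illustration, not part of the real cluster configuration.

```shell
# Write a minimal configuration fragment to a scratch file.
cat > /tmp/hdfs-site-check.xml <<'EOF'
<configuration>
  <property>
    <name>dfs.nameservices</name>
    <value>mycluster</value>
  </property>
</configuration>
EOF

# Print the value on the line following the matching name element.
value=$(sed -n '/<name>dfs.nameservices<\/name>/{n;s:.*<value>\(.*\)</value>.*:\1:p;}' /tmp/hdfs-site-check.xml)
echo "$value"
```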
Start the journalnode service in turn
[root@hop01 hadoop2.7]# pwd
/opt/hopHA/hadoop2.7
[root@hop01 hadoop2.7]# sbin/hadoop-daemon.sh start journalnode
Delete the old data and logs directories under hopHA
[root@hop01 hadoop2.7]# rm -rf data/ logs/
NN1 formats and starts NameNode
[root@hop01 hadoop2.7]# pwd
/opt/hopHA/hadoop2.7
[root@hop01 hadoop2.7]# bin/hdfs namenode -format
[root@hop01 hadoop2.7]# sbin/hadoop-daemon.sh start namenode
NN2 synchronizes NN1 data
[root@hop02 hadoop2.7]# bin/hdfs namenode -bootstrapStandby
NN2 starts NameNode
[root@hop02 hadoop2.7]# sbin/hadoop-daemon.sh start namenode
View current status
Start all DataNode on NN1
[root@hop01 hadoop2.7]# sbin/hadoop-daemons.sh start datanode
NN1 switches to Active state
[root@hop01 hadoop2.7]# bin/hdfs haadmin -transitionToActive nn1
[root@hop01 hadoop2.7]# bin/hdfs haadmin -getServiceState nn1
active
3. Failover configuration
Configure hdfs-site.xml, add the following content, and synchronize it across the cluster
<property>
    <name>dfs.ha.automatic-failover.enabled</name>
    <value>true</value>
</property>
Configure core-site.xml, add the following content, and synchronize it across the cluster
<property>
    <name>ha.zookeeper.quorum</name>
    <value>hop01:2181,hop02:2181,hop03:2181</value>
</property>
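The ha.zookeeper.quorum value is just a comma-separated host:port list. A small sketch of building it from the host names (the hosts variable is illustrative; adjust it to your cluster):

```shell
# Build the quorum string "hop01:2181,hop02:2181,hop03:2181" from a host list.
hosts="hop01 hop02 hop03"
quorum=$(printf '%s:2181,' $hosts)   # printf repeats the format per host
quorum=${quorum%,}                   # drop the trailing comma
echo "$quorum"
```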
Shut down all HDFS services
[root@hop01 hadoop2.7]# sbin/stop-dfs.sh
Start the Zookeeper cluster
/opt/zookeeper3.4/bin/zkServer.sh start
hop01 initializes the HA state in ZooKeeper
[root@hop01 hadoop2.7]# bin/hdfs zkfc -formatZK
hop01 starts the HDFS service
[root@hop01 hadoop2.7]# sbin/start-dfs.sh
NameNode node starts ZKFailover
Whichever of hop01 and hop02 starts the ZKFC first becomes Active; here hop02 is started first.
[hadoop2.7]# sbin/hadoop-daemon.sh start zkfc
End the NameNode process of hop02
kill -9 14422
Wait a moment, then check the status of hop01
[root@hop01 hadoop2.7]# bin/hdfs haadmin -getServiceState nn1
active

III. High availability of YARN

1. Basic description
The basic flow and ideas are similar to the HDFS mechanism and also rely on the ZooKeeper cluster. When the Active node fails, the Standby node switches to the Active state and continues to provide service.
2. Detailed explanation of configuration
The environment is also demonstrated based on hop01 and hop02.
Configure yarn-site.xml to synchronize services under the cluster
<property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
</property>
<property>
    <name>yarn.resourcemanager.ha.enabled</name>
    <value>true</value>
</property>
<property>
    <name>yarn.resourcemanager.cluster-id</name>
    <value>cluster-yarn01</value>
</property>
<property>
    <name>yarn.resourcemanager.ha.rm-ids</name>
    <value>rm1,rm2</value>
</property>
<property>
    <name>yarn.resourcemanager.hostname.rm1</name>
    <value>hop01</value>
</property>
<property>
    <name>yarn.resourcemanager.hostname.rm2</name>
    <value>hop02</value>
</property>
<property>
    <name>yarn.resourcemanager.zk-address</name>
    <value>hop01:2181,hop02:2181,hop03:2181</value>
</property>
<property>
    <name>yarn.resourcemanager.recovery.enabled</name>
    <value>true</value>
</property>
<property>
    <name>yarn.resourcemanager.store.class</name>
    <value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
</property>
Restart the journalnode node
sbin/hadoop-daemon.sh start journalnode
Format and start in the NN1 service
[root@hop01 hadoop2.7]# bin/hdfs namenode -format
[root@hop01 hadoop2.7]# sbin/hadoop-daemon.sh start namenode
Synchronize NN1 metadata on NN2
[root@hop02 hadoop2.7]# bin/hdfs namenode -bootstrapStandby
Start DataNode under the cluster
[root@hop01 hadoop2.7]# sbin/hadoop-daemons.sh start datanode
NN1 is set to the Active status
Start hop01 first, and then start hop02.
[root@hop01 hadoop2.7]# sbin/hadoop-daemon.sh start zkfc
hop01 starts YARN
[root@hop01 hadoop2.7]# sbin/start-yarn.sh
hop02 starts the ResourceManager
[root@hop02 hadoop2.7]# sbin/yarn-daemon.sh start resourcemanager
View status
[root@hop01 hadoop2.7]# bin/yarn rmadmin -getServiceState rm1
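With two ResourceManagers, both states can be checked in one loop. This sketch only prints the commands so it can run anywhere; on a real node, drop the echo and run it from the Hadoop home directory (rm1 and rm2 match the ha.rm-ids configured above).

```shell
# Print the status-check command for each configured ResourceManager id.
for id in rm1 rm2; do
    echo "bin/yarn rmadmin -getServiceState $id"
done
```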
That is all the content of "how to configure HDFS high availability environment in Hadoop framework". Thank you for reading! I hope this walkthrough is helpful.