2025-01-17 Update From: SLTechnology News&Howtos
Shulou (Shulou.com) 06/03 Report --
1. Why build HA?
Before Hadoop 2.x, the NameNode was a single point of failure (SPOF) in an HDFS cluster. In a cluster with only one NameNode, if the NameNode machine fails (for example, it goes down, or needs a software or hardware upgrade), the entire cluster becomes unavailable and cannot serve requests until the NameNode is restarted. This is unacceptable in a production environment.
HDFS HA solves this problem by configuring two NameNodes in an Active/Standby pair, giving the cluster a hot backup of the NameNode. If a failure occurs, such as a machine crash or planned maintenance, the NameNode role can quickly switch over to the other machine.
2. How does HA work?
Explanation (data consistency and persistence issues):
While the active NameNode is running and serving HDFS clients, it writes its edits log (the record of namespace operations) to a JournalNode (JN) cluster. A write is considered successful once more than half of the JournalNodes acknowledge it. The standby NameNode then pulls these edits from the JN cluster, merges them with the current fsimage to produce a new checkpoint, and pushes the result back to the active NameNode. When the standby requests data from the JNs, it first checks which JournalNodes are alive and applies the same majority rule: as long as more than half of the JournalNodes are available, the cluster keeps serving data; otherwise it stops. For block locations, each DataNode reports its block information to both NameNodes, so the standby's view of the blocks stays current.
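The majority rule above can be made concrete with a tiny sketch: an edit is durable once floor(N/2) + 1 of the N JournalNodes acknowledge it (the helper name below is made up for illustration):

```shell
# Majority (quorum) size for an ensemble of N JournalNodes:
# an edit is durable once floor(N/2) + 1 nodes have acknowledged it,
# so an ensemble of N nodes tolerates floor((N-1)/2) failures.
quorum() { echo $(( $1 / 2 + 1 )); }

quorum 3   # prints 2 -> a 3-node JN cluster tolerates 1 failure
quorum 5   # prints 3 -> a 5-node JN cluster tolerates 2 failures
```

This is also why JournalNode (and ZooKeeper) ensembles use an odd number of nodes: going from 3 to 4 raises the quorum from 2 to 3 without tolerating any additional failures.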
Explanation (solving master/standby switching): this is where a second cluster comes in, ZooKeeper, which is itself highly available. Its mechanism works as follows: one server acts as leader while the others stand by; when the leader goes down, the ZooKeeper ensemble votes (based on the logical clock, server ID, and data freshness) to elect a new leader. On top of this, each ZKFC (ZKFailoverController) process holds on to the ZooKeeper cluster with one hand and a NameNode with the other. The ZKFC next to the active NameNode monitors its health and reports its state to ZooKeeper in real time; the ZKFC next to the standby NameNode listens for notifications from ZooKeeper, and if a notification says the active NameNode has stopped, it immediately invokes a callback that promotes the standby NameNode to active so service continues. The role is similar to keepalived, but ZooKeeper also handles a subtle failure mode: if a ZKFC process itself exits abnormally and the standby were simply promoted, the old NameNode might still be alive, leaving two active NameNodes. ZooKeeper-based failover resolves this with fencing: when a ZKFC exits abnormally, the other side connects to the old active NameNode and, within a bounded time, forcibly demotes it to standby before turning its own NameNode into the active one.
3. How to build a HA cluster?
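Before walking through the build, here is a minimal sketch of the election idea behind ZKFC described in section 2, simulated with a local lock directory standing in for the ephemeral ZooKeeper znode (all names and paths are illustrative; this is not the real ZKFC code):

```shell
# Whichever contender creates the lock first becomes active; mkdir is
# atomic, mirroring how only one ZKFC can create the ephemeral znode.
LOCK=/tmp/ha-demo-lock
become_active() {
  if mkdir "$LOCK" 2>/dev/null; then
    echo "$1 acquired lock -> active"
  else
    echo "$1 lock held -> standby"
  fi
}

rm -rf "$LOCK"
become_active nn1   # prints "nn1 acquired lock -> active"
become_active nn2   # prints "nn2 lock held -> standby"
```

In the real implementation the znode is ephemeral, so if the active side's ZooKeeper session dies the lock is released automatically and the standby's watcher fires; that is the notification described above.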
Preparation before setting up a cluster: https://blog.51cto.com/14048416/2341450
Construction of zookeeper Cluster: https://blog.51cto.com/14048416/2336178
1) Cluster planning
2) Specific installation steps:
1) Upload the installation package hadoop-2.6.5-centos-6.7.tar.gz
2) Decompress it into the installation directory
[hadoop@hadoop01] $ tar -zxvf hadoop-2.6.5-centos-6.7.tar.gz -C /home/hadoop/apps/
3) modify the configuration file
hadoop-env.sh:
Add: export JAVA_HOME=/usr/local/jdk1.8.0_73
core-site.xml:
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://myha01/</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/home/hadoop/data/hadoopdata/</value>
  </property>
  <property>
    <name>ha.zookeeper.quorum</name>
    <value>hadoop01:2181,hadoop02:2181,hadoop03:2181</value>
  </property>
</configuration>
hdfs-site.xml:
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
  <property>
    <name>dfs.nameservices</name>
    <value>myha01</value>
  </property>
  <property>
    <name>dfs.ha.namenodes.myha01</name>
    <value>nn1,nn2</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.myha01.nn1</name>
    <value>hadoop01:9000</value>
  </property>
  <property>
    <name>dfs.namenode.http-address.myha01.nn1</name>
    <value>hadoop01:50070</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.myha01.nn2</name>
    <value>hadoop02:9000</value>
  </property>
  <property>
    <name>dfs.namenode.http-address.myha01.nn2</name>
    <value>hadoop02:50070</value>
  </property>
  <property>
    <name>dfs.namenode.shared.edits.dir</name>
    <value>qjournal://hadoop01:8485;hadoop02:8485;hadoop03:8485/myha01</value>
  </property>
  <property>
    <name>dfs.journalnode.edits.dir</name>
    <value>/home/hadoop/data/journaldata</value>
  </property>
  <property>
    <name>dfs.ha.automatic-failover.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>dfs.client.failover.proxy.provider.myha01</name>
    <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
  </property>
  <property>
    <name>dfs.ha.fencing.methods</name>
    <value>sshfence
shell(/bin/true)</value>
  </property>
  <property>
    <name>dfs.ha.fencing.ssh.private-key-files</name>
    <value>/home/hadoop/.ssh/id_rsa</value>
  </property>
  <property>
    <name>dfs.ha.fencing.ssh.connect-timeout</name>
    <value>30000</value>
  </property>
</configuration>
mapred-site.xml:
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.address</name>
    <value>hadoop02:10020</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>hadoop02:19888</value>
  </property>
</configuration>
yarn-site.xml:
<configuration>
  <property>
    <name>yarn.resourcemanager.ha.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>yarn.resourcemanager.cluster-id</name>
    <value>yrc</value>
  </property>
  <property>
    <name>yarn.resourcemanager.ha.rm-ids</name>
    <value>rm1,rm2</value>
  </property>
  <property>
    <name>yarn.resourcemanager.hostname.rm1</name>
    <value>hadoop01</value>
  </property>
  <property>
    <name>yarn.resourcemanager.hostname.rm2</name>
    <value>hadoop02</value>
  </property>
  <property>
    <name>yarn.resourcemanager.zk-address</name>
    <value>hadoop01:2181,hadoop02:2181,hadoop03:2181</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.log-aggregation-enable</name>
    <value>true</value>
  </property>
  <property>
    <name>yarn.log-aggregation.retain-seconds</name>
    <value>86400</value>
  </property>
  <property>
    <name>yarn.resourcemanager.recovery.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>yarn.resourcemanager.store.class</name>
    <value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
  </property>
</configuration>
slaves:
hadoop01
hadoop02
hadoop03
4) Distribute the installation package to the other machines
[hadoop@hadoop01 apps] $ scp -r hadoop-2.6.5 hadoop@hadoop02:$PWD
[hadoop@hadoop01 apps] $ scp -r hadoop-2.6.5 hadoop@hadoop03:$PWD
5) configure environment variables separately
[hadoop@hadoop01 apps] $ vi ~/.bashrc
Add two lines:
export HADOOP_HOME=/home/hadoop/apps/hadoop-2.6.5
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
[hadoop@hadoop01 apps] $ source ~/.bashrc
6) Cluster initialization operation
Start the zookeeper cluster first:
Launch (on each ZooKeeper node): zkServer.sh start
Check whether it started normally: zkServer.sh status
Start the journalnode process:
[hadoop@hadoop01 ~] $ hadoop-daemon.sh start journalnode
[hadoop@hadoop02 ~] $ hadoop-daemon.sh start journalnode
[hadoop@hadoop03 ~] $ hadoop-daemon.sh start journalnode
Then use the jps command on each JournalNode host to check that the journalnode process has started
Perform a format operation on the first namenode:
[hadoop@hadoop01 ~] $ hadoop namenode -format
Formatting generates the cluster metadata under the temporary directory configured in core-site.xml (hadoop.tmp.dir = /home/hadoop/data/hadoopdata/). Copy this directory to the same path on the second namenode, so that both namenode nodes have an identical directory structure:
[hadoop@hadoop01] $ scp -r ~/data/hadoopdata/ hadoop02:~/data
Or run on the other namenode node: hadoop namenode -bootstrapStandby
Format ZKFC (run once, on one of the namenode machines):
[hadoop@hadoop01 ~] $ hdfs zkfc -formatZK
Start HDFS:
[hadoop@hadoop01 ~] $ start-dfs.sh
Start YARN:
[hadoop@hadoop01 ~] $ start-yarn.sh
If the resourcemanager of the standby node is not started, start it manually:
[hadoop@hadoop02 ~] $ yarn-daemon.sh start resourcemanager
7) Additionally:
View the status of each master node:
HDFS:
hdfs haadmin -getServiceState nn1
hdfs haadmin -getServiceState nn2
YARN:
yarn rmadmin -getServiceState rm1
yarn rmadmin -getServiceState rm2
4. Testing the HA cluster after setup
1. Manually kill the active namenode and observe the cluster status.
2. Manually kill the active resourcemanager and observe the cluster status.
3. While uploading a file, kill the active namenode and check the cluster status.
4. While running a task, kill the active resourcemanager and check the cluster status.
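Drill 1 can be scripted roughly as follows. The jps, kill and hdfs commands are stubbed here so the flow is readable and runnable anywhere; on a real cluster node, delete the three stub functions and run the last three lines as-is:

```shell
# Stubs standing in for the real commands (illustrative output only):
jps()  { echo "4721 NameNode"; echo "4850 DFSZKFailoverController"; }
kill() { shift; echo "killed $1"; }
hdfs() { echo "active"; }

# The drill itself: find the active NameNode's pid, kill it, then ask
# the other NameNode for its state -- after failover it should be active.
pid=$(jps | awk '$2 == "NameNode" {print $1}')
kill -9 "$pid"
hdfs haadmin -getServiceState nn2
```

If fencing is configured correctly (dfs.ha.fencing.methods above), the standby should report "active" within the failover timeout.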