
How to install a Hadoop true distributed cluster on a Linux system


Many people are unfamiliar with the material in "how to install a Hadoop true distributed cluster on a Linux system", so this article lays it out with detailed content and clear steps. I hope you get something out of it.

This is a full Hadoop true distributed cluster installation, based on version 2.7.2, with the Hadoop master and slave nodes installed on two separate Linux machines.

1. Installation instructions

The installation user name must be the same on every node, whether it is a NameNode or a DataNode.

The only difference between master and slave is the configured hostname.

The machines listed by hostname in the slaves configuration file are the slaves.

Using hostnames is optional; configuring IP addresses works just as well.

In this cluster, you need to create a namenode directory on the master node

and format it with the command hdfs namenode -format.

Then create a datanode directory on the slave node, paying attention to the directory permissions; a sketch follows below.
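As a minimal sketch of preparing those directories, assuming the /home/ywmaster/dfs paths configured in hdfs-site.xml later in this guide (the 755 permission is a conventional assumption, not from the original article):

# On the master node (NameNode); path matches dfs.namenode.name.dir below
mkdir -p /home/ywmaster/dfs/name
chmod 755 /home/ywmaster/dfs/name

# On the slave node (DataNode); path matches dfs.datanode.data.dir below
mkdir -p /home/ywmaster/dfs/data
chmod 755 /home/ywmaster/dfs/data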

2. Configure hosts

If the entries already exist, this step is unnecessary; otherwise perform the same operation on each machine.

10.43.156.193 zdh293 ywmaster/fish master
10.43.156.194 zdh294 ywmaster/fish slave
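The ywmaster/fish and master/slave columns above appear to be user/password and role annotations; only the IP-to-hostname mapping belongs in /etc/hosts. A hedged sketch of appending it (run as root, on every machine):

echo "10.43.156.193 zdh293" >> /etc/hosts
echo "10.43.156.194 zdh294" >> /etc/hosts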

3. Create a user

The user names on the cluster must all be the same, otherwise the Hadoop cluster will fail to start.

To add the same user to each machine, refer to the following command:

useradd ywmaster
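If the account needs a password for the initial SSH exchanges, a small sketch follows; the password fish from the table in section 2 is an assumption:

passwd ywmaster    # run as root; set the password interactively, e.g. fish as noted above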

4. Install JDK

JDK 1.7 is installed here.

scp yuwen@10.43.156.193:/home/yuwen/backup/jdk-7u80-linux-x64.tar.gz .
tar -zxvf jdk-7u80-linux-x64.tar.gz
vi .bash_profile
export JAVA_HOME=~/jdk1.7.0_80
export PATH=$JAVA_HOME/bin:$PATH
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
source .bash_profile

Verify the JDK:

java -version

5. Set up cluster password-free login

5.1 Set up local password-free login

ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys

The permissions must be modified, otherwise password-free login will not work:

chmod 600 ~/.ssh/authorized_keys

Verify password-free login:

ssh localhost

5.2 Set up remote password-free login

This machine's public key must be appended to the other machine's authorized_keys before password-free login to that machine will work.

Enter ywmaster's .ssh directory:

scp ~/.ssh/authorized_keys ywmaster@10.43.156.194:~/.ssh/authorized_keys_from_zdh293

Enter ywslave's .ssh directory, and back up the existing file first; otherwise the following step may append a duplicate ywmaster public key.

cat authorized_keys_from_zdh293 >> authorized_keys
ssh zdh294

5.3 Set up password-free login for other machines

Set up the other machines in the same way by following the steps above. After this configuration, password-free login to zdh293 works as well:

scp ~/.ssh/authorized_keys ywmaster@10.43.156.193:~/.ssh/authorized_keys_from_zdh294
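As an aside, on systems where OpenSSH ships ssh-copy-id, the public-key distribution above can be collapsed into one step; this is a convenience alternative, not the exact method used in this article:

ssh-copy-id -i ~/.ssh/id_dsa.pub ywmaster@10.43.156.193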

6. Install Hadoop

Upload and extract the Hadoop files:

scp pub@10.43.156.193:/home/pub/hadoop/source/hadoop-2.7.2-src/hadoop-dist/target/hadoop-2.7.2.tar.gz .
tar -zxvf hadoop-2.7.2.tar.gz

7. Configure the environment variables

export HADOOP_HOME=~/hadoop-2.7.2
export PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop

Configure an alias for quick access to the configuration path:

alias conf='cd /home/ywmaster/hadoop-2.7.2/etc/hadoop'

8. Review and modify the Hadoop configuration files

8.1 hadoop-env.sh

Environment variables involved: JAVA_HOME, HADOOP_HOME, HADOOP_CONF_DIR.
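A minimal sketch of the corresponding etc/hadoop/hadoop-env.sh edits, with install paths assumed from sections 4 and 7 of this guide:

# etc/hadoop/hadoop-env.sh (paths are assumptions based on earlier steps)
export JAVA_HOME=/home/ywmaster/jdk1.7.0_80
export HADOOP_HOME=/home/ywmaster/hadoop-2.7.2
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop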

8.2 yarn-env.sh

Environment variables involved: JAVA_HOME, HADOOP_YARN_USER, HADOOP_YARN_HOME, YARN_CONF_DIR.

8.3 slaves

This file lists all slave nodes. Comment out localhost and add zdh294 as a slave node, as shown below.
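The resulting etc/hadoop/slaves file would look like this, one hostname per line (to my understanding, Hadoop's startup scripts strip # comments):

# localhost
zdh294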

8.4 core-site.xml

The property settings below go inside the <configuration> element of each file; they are shown here in compact form.

<property><name>fs.defaultFS</name><value>hdfs://10.43.156.193:29080</value></property>
<!-- fs.default.name is the deprecated alias of fs.defaultFS -->
<property><name>fs.default.name</name><value>hdfs://10.43.156.193:29080</value></property>
<property><name>io.file.buffer.size</name><value>131072</value></property>
<property><name>hadoop.tmp.dir</name><value>file:/home/ywmaster/tmp</value></property>

8.5 hdfs-site.xml

<property><name>dfs.namenode.rpc-address</name><value>10.43.156.193:29080</value></property>
<property><name>dfs.namenode.http-address</name><value>10.43.156.193:20070</value></property>
<property><name>dfs.namenode.secondary.http-address</name><value>10.43.156.193:29001</value></property>
<property><name>dfs.namenode.name.dir</name><value>file:/home/ywmaster/dfs/name</value></property>
<property><name>dfs.datanode.data.dir</name><value>file:/home/ywmaster/dfs/data</value></property>
<property><name>dfs.replication</name><value>1</value></property>
<property><name>dfs.webhdfs.enabled</name><value>true</value></property>

8.6 mapred-site.xml

<property><name>mapreduce.framework.name</name><value>yarn</value></property>
<property><name>mapreduce.shuffle.port</name><value>23562</value></property>
<property><name>mapreduce.jobhistory.address</name><value>10.43.156.193:20020</value></property>
<property><name>mapreduce.jobhistory.webapp.address</name><value>10.43.156.193:29888</value></property>

8.7 yarn-site.xml

<!-- the old aux-service name mapreduce.shuffle is out of date; use mapreduce_shuffle -->
<property><name>yarn.nodemanager.aux-services</name><value>mapreduce_shuffle</value></property>
<property><name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name><value>org.apache.hadoop.mapred.ShuffleHandler</value></property>
<property><name>yarn.resourcemanager.address</name><value>10.43.156.193:28032</value></property>
<property><name>yarn.resourcemanager.scheduler.address</name><value>10.43.156.193:28030</value></property>
<property><name>yarn.resourcemanager.resource-tracker.address</name><value>10.43.156.193:28031</value></property>
<property><name>yarn.resourcemanager.admin.address</name><value>10.43.156.193:28033</value></property>
<property><name>yarn.resourcemanager.webapp.address</name><value>10.43.156.193:28088</value></property>
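To spot-check that a value has been picked up, Hadoop's getconf tool can query a single key; for example:

hdfs getconf -confKey fs.defaultFS
# expected to print: hdfs://10.43.156.193:29080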

8.8 Get the default configuration files of Hadoop

Select the corresponding version of Hadoop, download and decompress it, and search for *.xml files.

Find core-default.xml, hdfs-default.xml, and mapred-default.xml.

These are the default configurations; you can refer to the descriptions in them.

Override these defaults to configure your own Hadoop cluster.

find . -name "*-default.xml"
./hadoop-2.7.1/share/doc/hadoop/hadoop-project-dist/hadoop-hdfs/hdfs-default.xml
./hadoop-2.7.1/share/doc/hadoop/hadoop-project-dist/hadoop-common/core-default.xml
./hadoop-2.7.1/share/doc/hadoop/hadoop-mapreduce-client/hadoop-mapreduce-client-core/mapred-default.xml
./hadoop-2.7.1/share/doc/hadoop/hadoop-yarn/hadoop-yarn-common/yarn-default.xml
./hadoop-2.7.1/share/hadoop/httpfs/tomcat/webapps/webhdfs/WEB-INF/classes/httpfs-default.xml

9. Copy the configured Hadoop to the other node

scp -r ~/hadoop-2.7.2 ywmaster@10.43.156.194:~/

Alternatively, copy only the configuration files, which is faster:

scp -r ~/hadoop-2.7.2/etc/hadoop ywmaster@10.43.156.194:~/hadoop-2.7.2/etc

Create the name and data directories:

mkdir -p ./dfs/name
mkdir -p ./dfs/data

10. Start and verify Hadoop

Format the NameNode:

hdfs namenode -format

The following results indicate success:

16-09-13 23:57:16 INFO common.Storage: Storage directory /home/ywmaster/dfs/name has been successfully formatted.

Start HDFS:

start-dfs.sh

Start YARN:

start-yarn.sh

Note: after modifying the configuration, be sure to copy it to the other nodes again, otherwise startup will run into problems.
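A hedged one-liner for re-syncing just the configuration directory after each change, assuming rsync is installed on both machines:

rsync -av ~/hadoop-2.7.2/etc/hadoop/ ywmaster@10.43.156.194:~/hadoop-2.7.2/etc/hadoop/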

11. Check the startup result

Running jps on the NameNode should show the following processes:

15951 ResourceManager
13294 SecondaryNameNode
12531 NameNode
16228 Jps

Running jps on the DataNode should show the following processes:

3713 NodeManager
1329 DataNode
3907 Jps
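Beyond jps, the standard HDFS admin report gives a cluster-level view; with this setup it should list one live DataNode:

hdfs dfsadmin -report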

View the HDFS service:

http://10.43.156.193:20070

View SecondaryNameNode:

http://10.43.156.193:29001/

Refer to hdfs-site.xml for the specific IP and port:

dfs.namenode.http-address: the address and base port where the DFS NameNode web UI will listen.

View the ResourceManager (RM):

http://10.43.156.193:28088

Refer to yarn-site.xml for the specific IP and port:

yarn.resourcemanager.webapp.address: 10.43.156.193:28088

12. Other references

Stop commands:

stop-yarn.sh
stop-dfs.sh

Verify with common file system and job commands; the wordcount job below needs an input file first.
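A minimal sketch of preparing that input; the file contents are illustrative, not from the original article:

echo "hello world hello hadoop" > wordcount.txt
hadoop fs -mkdir -p /user
hadoop fs -put wordcount.txt /user/wordcount.txt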

hadoop fs -ls /usr
hadoop fs -mkdir usr/yuwen
hadoop fs -copyFromLocal wordcount /user
hadoop fs -rm -r /user/wordresult
hadoop jar ~/hadoop-2.7.2/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.2.jar wordcount /user/wordcount.txt /user/wordresult_001
hadoop fs -text /user/wordresult_001/part-r-00000

That is the content of this article on "how to install a Hadoop true distributed cluster on a Linux system"; I hope it is helpful.
