This article explains how to install Hadoop 2.7 on a three-node CentOS 7 cluster, covering passwordless SSH, JDK installation, Hadoop configuration, and cluster startup.
The general idea is to prepare the master and slave servers, configure passwordless SSH login from the master to the slaves, unpack and install the JDK, unpack and install Hadoop, and configure the master-slave relationships for HDFS, MapReduce, and so on.
1. Environment: three 64-bit CentOS 7 machines (Hadoop 2.7 requires 64-bit Linux). The CentOS 7 Minimal ISO is only about 600 MB, and the operating system can be installed in a dozen minutes or so.
Master 192.168.0.182
Slave1 192.168.0.183
Slave2 192.168.0.184
2. Passwordless SSH login. Hadoop needs to log in to each node via SSH to operate on it. I use the root user: each server generates a key pair, and the public keys are then merged into authorized_keys.
(1) CentOS does not enable public key SSH login by default. On every server, edit /etc/ssh/sshd_config and uncomment these two lines:
RSAAuthentication yes
PubkeyAuthentication yes
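After changing sshd_config, the SSH service has to be restarted for the new settings to take effect; on CentOS 7 that is done with systemd (a minimal sketch, assuming the default service name):
systemctl restart sshd.service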
(2) Enter the command ssh-keygen -t rsa to generate a key pair; do not enter a passphrase, just press Enter at every prompt. A .ssh folder will be generated under /root. Do this on every server.
(3) Merge the public keys into the authorized_keys file. On the Master server, enter the /root/.ssh directory and merge them via SSH:
cat id_rsa.pub >> authorized_keys
ssh root@192.168.0.183 cat ~/.ssh/id_rsa.pub >> authorized_keys
ssh root@192.168.0.184 cat ~/.ssh/id_rsa.pub >> authorized_keys
(4) Copy the authorized_keys and known_hosts files from the Master server to the /root/.ssh directory of each Slave server (a scp sketch follows).
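A minimal sketch of that copy, assuming root logins and the slave IPs listed above:
scp /root/.ssh/authorized_keys /root/.ssh/known_hosts root@192.168.0.183:/root/.ssh/
scp /root/.ssh/authorized_keys /root/.ssh/known_hosts root@192.168.0.184:/root/.ssh/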
(5) Once this is done, ssh root@192.168.0.183 and ssh root@192.168.0.184 no longer require a password.
3. Install the JDK. Hadoop 2.7 requires JDK 7. Since my CentOS installation is minimal, there is no OpenJDK; just extract the downloaded JDK and configure the environment variables.
(1) download "jdk-7u79-linux-x64.gz" and put it in the / home/java directory
(2) decompress, enter the command, tar-zxvf jdk-7u79-linux-x64.gz
(3) Editing / etc/profile
Export JAVA_HOME=/home/java/jdk1.7.0_79
Export CLASSPATH=.:$JAVA_HOME/jre/lib/rt.jar:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
Export PATH=$PATH:$JAVA_HOME/bin
(4) to make the configuration effective, enter the command, source / etc/profile
(5) enter the command, java-version, complete
4. Install Hadoop 2.7. Extract it on the Master server, then copy the configured installation to the Slave servers later.
(1) Download "hadoop-2.7.0.tar.gz" and put it in the /home/hadoop directory.
(2) Decompress it: tar -xzvf hadoop-2.7.0.tar.gz
(3) Create the data storage folders under /home/hadoop: tmp, hdfs, hdfs/data, hdfs/name (see the sketch below).
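A minimal sketch of step (3), using the /home/hadoop base directory from above (mkdir -p also creates the intermediate hdfs folder):
mkdir -p /home/hadoop/tmp /home/hadoop/hdfs/data /home/hadoop/hdfs/name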
5. Configure core-site.xml under the /home/hadoop/hadoop-2.7.0/etc/hadoop directory:
<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://192.168.0.182:9000</value>
    </property>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>file:/home/hadoop/tmp</value>
    </property>
    <property>
        <name>io.file.buffer.size</name>
        <value>131702</value>
    </property>
</configuration>
6. Configure hdfs-site.xml under the /home/hadoop/hadoop-2.7.0/etc/hadoop directory (the name and data directories point at the hdfs/name and hdfs/data folders created in step 4):
<configuration>
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>file:/home/hadoop/hdfs/name</value>
    </property>
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>file:/home/hadoop/hdfs/data</value>
    </property>
    <property>
        <name>dfs.replication</name>
        <value>2</value>
    </property>
    <property>
        <name>dfs.namenode.secondary.http-address</name>
        <value>192.168.0.182:9001</value>
    </property>
    <property>
        <name>dfs.webhdfs.enabled</name>
        <value>true</value>
    </property>
</configuration>
7. Configure mapred-site.xml under the /home/hadoop/hadoop-2.7.0/etc/hadoop directory:
<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
    <property>
        <name>mapreduce.jobhistory.address</name>
        <value>192.168.0.182:10020</value>
    </property>
    <property>
        <name>mapreduce.jobhistory.webapp.address</name>
        <value>192.168.0.182:19888</value>
    </property>
</configuration>
8. Configure yarn-site.xml under the /home/hadoop/hadoop-2.7.0/etc/hadoop directory:
<configuration>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <property>
        <name>yarn.nodemanager.auxservices.mapreduce.shuffle.class</name>
        <value>org.apache.hadoop.mapred.ShuffleHandler</value>
    </property>
    <property>
        <name>yarn.resourcemanager.address</name>
        <value>192.168.0.182:8032</value>
    </property>
    <property>
        <name>yarn.resourcemanager.scheduler.address</name>
        <value>192.168.0.182:8030</value>
    </property>
    <property>
        <name>yarn.resourcemanager.resource-tracker.address</name>
        <value>192.168.0.182:8031</value>
    </property>
    <property>
        <name>yarn.resourcemanager.admin.address</name>
        <value>192.168.0.182:8033</value>
    </property>
    <property>
        <name>yarn.resourcemanager.webapp.address</name>
        <value>192.168.0.182:8088</value>
    </property>
    <property>
        <name>yarn.nodemanager.resource.memory-mb</name>
        <value>768</value>
    </property>
</configuration>
9. Configure JAVA_HOME in hadoop-env.sh and yarn-env.sh under the /home/hadoop/hadoop-2.7.0/etc/hadoop directory. If you do not set it, the cluster will fail to start.
export JAVA_HOME=/home/java/jdk1.7.0_79
10. Configure the slaves file under the /home/hadoop/hadoop-2.7.0/etc/hadoop directory: delete the default localhost and add the 2 slave nodes:
192.168.0.183
192.168.0.184
11. Copy the configured Hadoop directory to the corresponding location on each node, transferring it with scp:
scp -r /home/hadoop 192.168.0.183:/home/
scp -r /home/hadoop 192.168.0.184:/home/
12. Start Hadoop on the Master server; the slave nodes are started automatically. Enter the /home/hadoop/hadoop-2.7.0 directory.
(1) Initialize: bin/hdfs namenode -format
(2) Start everything with sbin/start-all.sh, or start HDFS and YARN separately with sbin/start-dfs.sh and sbin/start-yarn.sh
(3) To stop, enter the command: sbin/stop-all.sh
(4) Enter the command jps to see the running processes (see the example below).
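For reference, a healthy start with this configuration typically shows processes like the following on the Master (the numeric process IDs here are made up for illustration), while the slaves show DataNode and NodeManager:
2287 NameNode
2348 SecondaryNameNode
2603 ResourceManager
2915 Jps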
13. To access the web UIs, open the required ports or turn off the firewall directly.
(1) Enter the command: systemctl stop firewalld.service (or open only the ports, as sketched below)
(2) Open http://192.168.0.182:8088/ (YARN ResourceManager) in a browser
(3) Open http://192.168.0.182:50070/ (HDFS NameNode) in a browser
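If you prefer to keep the firewall running instead of stopping it, opening just the two web UI ports is an alternative; a minimal sketch with CentOS 7 firewalld, using the ports configured above:
firewall-cmd --zone=public --permanent --add-port=8088/tcp
firewall-cmd --zone=public --permanent --add-port=50070/tcp
firewall-cmd --reload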
14. Installation complete. This is only the beginning of a big data application; the follow-up work is to write programs that call Hadoop's interfaces and put HDFS and MapReduce to work for your own scenario.
Hadoop is a distributed system infrastructure that enables users to develop distributed programs without knowing the underlying details of the distribution.
The core components of Hadoop are HDFS and MapReduce: HDFS is responsible for storage, and MapReduce is responsible for computation.
Here are the key points for installing Hadoop:
In fact, installing Hadoop is not troublesome; it mainly requires the following prerequisites. Once they are in place, it is easy to follow the configuration on the official website.
1. A Java runtime environment; the Sun (Oracle) distribution is recommended.
2. SSH public key authentication
With the above environment in place, all that is left is the Hadoop configuration. The configuration may differ between versions; please refer to the official documentation for details.
Environment
Virtual machine: VMWare10.0.1 build-1379776
Operating system: CentOS7 64 bit
Install the Java environment
Download address: http://www.Oracle.com/technetwork/cn/java/javase/downloads/jdk8-downloads-2133151-zhs.html
Select the appropriate download package for your operating system version. If your system supports rpm packages, download the rpm directly or install straight from the rpm URL:
rpm -ivh http://download.oracle.com/otn-pub/java/jdk/8u20-b26/jdk-8u20-linux-x64.rpm
The JDK is constantly updated, so to install the latest version you need to get the rpm URL of the latest installation package from the official website.
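Once the rpm has installed, a quick check confirms the runtime is available (the exact version string depends on the JDK build you downloaded):
java -version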
Configure SSH public key (passwordless) authentication
CentOS comes with openssh-server, openssh-clients, and rsync by default. If they are missing on your system, please install them yourself.
Create a common account
Create a hadoop account (the name is up to you) on all machines and set its password to hadoop:
useradd -d /home/hadoop -s /usr/bin/bash -g wheel hadoop
passwd hadoop
SSH configuration
vim /etc/ssh/sshd_config
Find the following three configuration items and change them to the settings below. If they are commented out, remove the leading # so the configuration takes effect.
RSAAuthentication yes
PubkeyAuthentication yes
# The default is to check both .ssh/authorized_keys and .ssh/authorized_keys2
# but this is overridden so installations will only check .ssh/authorized_keys
AuthorizedKeysFile .ssh/authorized_keys
.ssh/authorized_keys is the path where the public keys are stored.
Key pair generation
Log in with the hadoop account.
cd ~
ssh-keygen -t rsa -P ''
Save the generated ~/.ssh/id_rsa.pub file as ~/.ssh/authorized_keys:
cp ~/.ssh/id_rsa.pub ~/.ssh/authorized_keys
Use the scp command to copy the .ssh directory to the other machines; the lazy approach is to give all machines the same key pair so they share the same public key.
scp ~/.ssh/* hadoop@slave1:~/.ssh/
Make sure the permissions on ~/.ssh/id_rsa are 600, which forbids other users from accessing it.
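A minimal sketch of tightening those permissions; the 700 on ~/.ssh and 600 on authorized_keys are conventional sshd expectations added here as an assumption, not something the text above spells out:
chmod 700 ~/.ssh
chmod 600 ~/.ssh/id_rsa ~/.ssh/authorized_keys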
That is the whole of "how to install Hadoop 2.7 in CentOS 7". Thank you for reading, and I hope the content shared here is helpful to you.