Introduction
In the previous article, Big Data Learning Series Part 5 - Hive Integration with HBase (http://www.panchengming.com/2017/12/18/pancm62/), Hive was integrated with HBase and tested successfully. Earlier, in Big Data Learning Series Part 1 - Hadoop Environment Setup (stand-alone) (http://www.panchengming.com/2017/11/26/pancm55/), the Hadoop environment was built successfully. This article mainly covers setting up the Hadoop+Spark environment. Although only the stand-alone version is built here, changing it to a cluster version is quite easy; setting up Hadoop+Spark+HBase+Hive+Zookeeper and other clusters will be covered in a future article.
I. Environment selection
1. Server selection
Local virtual machine
Operating system: Linux CentOS 7
CPU: 2 cores
Memory: 2 GB
Hard disk: 40 GB
2. Configuration selection
JDK: 1.8 (jdk-8u144-linux-x64.tar.gz)
Hadoop: 2.8.2 (hadoop-2.8.2.tar.gz)
Scala: 2.12.2 (scala-2.12.2.tgz)
Spark: 1.6.3 (spark-1.6.3-bin-hadoop2.4-without-hive.tgz)
3. Download addresses
Official website addresses:
JDK: http://www.oracle.com/technetwork/java/javase/downloads
Hadoop: http://www.apache.org/dyn/closer.cgi/hadoop/common
Spark: http://spark.apache.org/downloads.html
Hive on Spark (the Spark build integrated with Hive): http://mirror.bit.edu.cn/apache/spark/
Scala: http://www.scala-lang.org/download
Baidu Cloud:
Link: https://pan.baidu.com/s/1geT3A8N password: f7jb
II. Server configuration
Before configuring the Hadoop+Spark integration, some preliminary configuration should be done.
For convenience, all of these configuration steps are done with root privileges.
1. Change the hostname
Change the hostname first to make management easier.
View the current hostname
Enter:
hostname
Change the machine's hostname
Enter:
hostnamectl set-hostname master
Note: the hostname change only takes effect after the machine is restarted (reboot).
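To confirm the change after the restart, a quick check (an illustrative extra, using CentOS 7's hostnamectl):
hostnamectl status
The "Static hostname" line of the output should read master.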
2. Map the hostname to the IP
Modify the hosts file to set up the mapping
Input:
vim /etc/hosts
Add the IP and hostname of the host:
192.168.219.128 master
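To verify that the mapping works, a quick check (an illustrative extra):
ping -c 3 master
The replies should come from 192.168.219.128.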
3. Turn off the firewall
Turn off the firewall to make external access easier.
For versions below CentOS 7, enter:
service iptables stop
For CentOS 7 and above, enter:
systemctl stop firewalld.service
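Note that the commands above only stop the firewall for the current session. To keep it off across reboots as well (an optional extra, not part of the original steps):
systemctl disable firewalld.service
or, for versions below CentOS 7:
chkconfig iptables off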
4. Time setting
Enter:
date
Check whether the server time is correct; if it is not, change it.
Command to change the time:
date -s 'MMDDhhmmYYYY.ss'
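For example, to set the clock to an arbitrary (purely illustrative) timestamp and write it to the hardware clock so it survives a reboot:
date -s '2017-12-20 10:00:00'
hwclock -w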
III. Scala environment configuration
Because the configuration of Spark depends on Scala, Scala needs to be configured first.
1. File preparation
Extract the downloaded Scala file
Input:
tar -xvf scala-2.12.2.tgz
Then move it to /opt/scala and rename it scala2.1.
Input:
mv scala-2.12.2 /opt/scala
mv scala-2.12.2 scala2.1
2. Environment configuration
Edit the /etc/profile file
Enter:
export SCALA_HOME=/opt/scala/scala2.1
export PATH=.:${JAVA_HOME}/bin:${SCALA_HOME}/bin:$PATH
Enter:
source /etc/profile
Make the configuration effective
Enter scala -version to check whether the installation succeeded
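If the installation succeeded, the output should look roughly like the following (the exact copyright line varies by build):
Scala code runner version 2.12.2 -- Copyright 2002-2017, LAMP/EPFL and Lightbend, Inc.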
IV. Spark environment configuration
1. File preparation
Two kinds of Spark package are offered at the download addresses above: a plain Spark build, and a build for integrating with Hadoop and Hive (Hive on Spark). The second one is used in this article.
Extract the downloaded Spark file
Input:
tar -xvf spark-1.6.3-bin-hadoop2.4-without-hive.tgz
Then move it to /opt/spark and rename it
Input:
mv spark-1.6.3-bin-hadoop2.4-without-hive /opt/spark
mv spark-1.6.3-bin-hadoop2.4-without-hive spark1.6-hadoop2.4-hive
2. Environment configuration
Edit the /etc/profile file
Enter:
export SPARK_HOME=/opt/spark/spark1.6-hadoop2.4-hive
export PATH=.:${JAVA_HOME}/bin:${SCALA_HOME}/bin:${SPARK_HOME}/bin:$PATH
Enter:
source /etc/profile
Make the configuration effective
3. Change the configuration files
Switch directories
Enter:
cd /opt/spark/spark1.6-hadoop2.4-hive/conf
4.3.1 Modify spark-env.sh
In the conf directory, modify the spark-env.sh file; if there is no spark-env.sh, copy the spark-env.sh.template file and rename it to spark-env.sh.
Modify the newly created spark-env.sh file and add the configuration:
export SCALA_HOME=/opt/scala/scala2.1
export JAVA_HOME=/opt/java/jdk1.8
export HADOOP_HOME=/opt/hadoop/hadoop2.8
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export SPARK_HOME=/opt/spark/spark1.6-hadoop2.4-hive
export SPARK_MASTER_IP=master
export SPARK_EXECUTOR_MEMORY=1G
Note: adjust the paths above to your own setup. SPARK_MASTER_IP is the master host, and SPARK_EXECUTOR_MEMORY is the memory allocated for execution.
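Once the cluster is up (see section VI below), these defaults can also be overridden per session; for example, a hypothetical spark-shell launch with a smaller executor memory:
cd /opt/spark/spark1.6-hadoop2.4-hive
./bin/spark-shell --master spark://master:7077 --executor-memory 512M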
V. Hadoop environment configuration
The specific configuration of Hadoop is described in detail in Big Data Learning Series Part 1 - Hadoop Environment Setup (stand-alone): http://www.panchengming.com/2017/11/26/pancm55, so this article only gives a brief overview.
Note: adapt the specific configuration to your own environment.
1. Environment variable settings
Edit the /etc/profile file:
vim /etc/profile
Configuration file:
export HADOOP_HOME=/opt/hadoop/hadoop2.8
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib"
export PATH=.:${JAVA_HOME}/bin:${HADOOP_HOME}/bin:$PATH
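After editing, reload the profile and sanity-check the installation (assuming Hadoop is already unpacked at the HADOOP_HOME path above):
source /etc/profile
hadoop version
The first line of the output should read Hadoop 2.8.2.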
2. Configuration file changes
First change to the /home/hadoop/hadoop2.8/etc/hadoop/ directory
5.2.1 Modify core-site.xml
Enter:
vim core-site.xml
Add the following inside the <configuration> node:
<property>
    <name>hadoop.tmp.dir</name>
    <value>/root/hadoop/tmp</value>
    <description>Abase for other temporary directories.</description>
</property>
<property>
    <name>fs.default.name</name>
    <value>hdfs://master:9000</value>
</property>
5.2.2 Modify hadoop-env.sh
Enter:
vim hadoop-env.sh
Change ${JAVA_HOME} to your own JDK path:
export JAVA_HOME=${JAVA_HOME}
modified to:
export JAVA_HOME=/home/java/jdk1.8
5.2.3 Modify hdfs-site.xml
Enter:
vim hdfs-site.xml
Add the following inside the <configuration> node:
<property>
    <name>dfs.name.dir</name>
    <value>/root/hadoop/dfs/name</value>
    <description>Path on the local filesystem where the NameNode stores the namespace and transactions logs persistently.</description>
</property>
<property>
    <name>dfs.data.dir</name>
    <value>/root/hadoop/dfs/data</value>
    <description>Comma separated list of paths on the local filesystem of a DataNode where it should store its blocks.</description>
</property>
<property>
    <name>dfs.replication</name>
    <value>2</value>
</property>
<property>
    <name>dfs.permissions</name>
    <value>false</value>
    <description>need not permissions</description>
</property>
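Note that on a stand-alone setup with a single DataNode, a replication factor of 2 cannot actually be met; HDFS will simply report blocks as under-replicated. Also, since this tutorial runs as root, the directories above can be pre-created as an optional precaution (Hadoop normally creates them itself):
mkdir -p /root/hadoop/tmp /root/hadoop/dfs/name /root/hadoop/dfs/data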
5.2.4 Modify mapred-site.xml
If there is no mapred-site.xml file, copy the mapred-site.xml.template file and rename it to mapred-site.xml.
Enter:
vim mapred-site.xml
Modify the new mapred-site.xml file and add the configuration inside the <configuration> node:
<property>
    <name>mapred.job.tracker</name>
    <value>master:9001</value>
</property>
<property>
    <name>mapred.local.dir</name>
    <value>/root/hadoop/var</value>
</property>
<property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
</property>
3. Hadoop startup
The NameNode needs to be formatted before the first start.
Note: if it has already been formatted successfully, this step is not necessary again.
Change to the /home/hadoop/hadoop2.8/bin directory
Enter:
./hadoop namenode -format
After the format succeeds, change to the /home/hadoop/hadoop2.8/sbin directory
Start hdfs and yarn
Enter:
start-dfs.sh
start-yarn.sh
After the startup succeeds, enter jps to check whether everything started
Enter ip:8088 and ip:50070 in the browser to see whether the pages can be accessed.
If they can be accessed correctly, the startup was successful.
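On this stand-alone setup, jps would typically report something like the following (process IDs omitted, order may vary):
NameNode
DataNode
SecondaryNameNode
ResourceManager
NodeManager
Jps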
VI. Spark startup
Before starting Spark, make sure that Hadoop has started successfully; first use the jps command to check which processes are running. After Spark has started successfully, check again with jps.
Change to the Spark sbin directory
Enter:
cd /opt/spark/spark1.6-hadoop2.4-hive/sbin
Then start Spark
Enter:
start-all.sh
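When jps is run again at this point, two Spark processes should have been added to the Hadoop list (illustrative):
Master
Worker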
Then enter the following in the browser:
http://192.168.219.128:8080/
If the page displays correctly, the startup was successful.
Note: if Spark starts successfully but the page cannot be accessed, first check whether the firewall is turned off, and then use jps to check the processes. If neither shows a problem, the page can generally be accessed; if it still does not work, check the hadoop, scala, and spark configurations.
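As a final smoke test (an optional extra, not part of the original steps), the SparkPi example bundled with Spark can be submitted to the cluster; the MASTER variable here is assumed to point at the master started above:
cd /opt/spark/spark1.6-hadoop2.4-hive
MASTER=spark://master:7077 ./bin/run-example SparkPi 10
Near the end of the output there should be a line like "Pi is roughly 3.14...".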
This is the end of this article. Thank you for reading!
If you found it helpful, please like or recommend it.
Copyright notice:
Author: xuwujing
cnblogs: http://www.cnblogs.com/xuwujing
CSDN: http://blog.csdn.net/qazwsxpcm
Personal blog: http://www.panchengming.com
Original writing is not easy; please indicate the source when reprinting. Thank you!