
How to Build a Spark 1.5.2 Cluster on Hadoop 2.6.0


This article explains in detail how to build a Spark 1.5.2 cluster on Hadoop 2.6.0. The steps are practical and are shared here as a reference; I hope you get something out of reading it.

I. Prerequisites for Spark installation

A Hadoop cluster must be installed before Spark. Since Hadoop was already installed, Spark is installed directly on the existing Hadoop cluster. However, because there was not enough memory on the machines, only master and slave01 were chosen for the Spark cluster, leaving out slave02.

II. Spark installation steps:

1. Download scala-2.11.7.tgz

http://www.scala-lang.org/download/2.11.7.html

2. Download spark-1.5.2-bin-hadoop2.6.tgz (the previously installed Hadoop is 2.6.0)

http://www.apache.org/dyn/closer.lua/spark/spark-1.5.2/spark-1.5.2-bin-hadoop2.6.tgz
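If the master node has network access, the two packages can also be fetched directly from the command line. This is only a sketch; the mirror URLs below are assumptions and may have changed since the original links were posted:

wget https://downloads.lightbend.com/scala/2.11.7/scala-2.11.7.tgz -O /root/scala-2.11.7.tgz

wget https://archive.apache.org/dist/spark/spark-1.5.2/spark-1.5.2-bin-hadoop2.6.tgz -O /root/spark-1.5.2-bin-hadoop2.6.tgz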

3. Install Scala (on master):

mkdir /application/scala

cp /root/scala-2.11.7.tgz /application/scala/

cd /application/scala/

tar -zxvf scala-2.11.7.tgz

Create a soft link:

ln -s /application/scala/scala-2.11.7 /application/scala/scala

Modify the environment variable, add SCALA_HOME, and modify PATH:

vi /etc/profile.d/java.sh

export SCALA_HOME=/application/scala/scala-2.11.7

export PATH=$JAVA_HOME/bin:$HADOOP_HOME/bin:$SCALA_HOME/bin:$PATH

Make the configuration effective immediately:

source /etc/profile

Verify that the installation is successful

scala -version

The display is as follows:
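The original screenshot is missing here; for Scala 2.11.7 the output should look roughly like the line below (the copyright years may differ):

Scala code runner version 2.11.7 -- Copyright 2002-2013, LAMP/EPFL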

4. Copy /application/scala from master to the other machine, slave01.

scp -r /application/scala root@slave01:/application/

5. Copy /etc/profile.d/java.sh to slave01 as well.

Then perform the following command on slave01 to make the configuration effective:

source /etc/profile

6. Install Spark (on master):

mkdir /application/spark

cp /root/spark-1.5.2-bin-hadoop2.6.tgz /application/spark/

cd /application/spark/

tar -zxvf spark-1.5.2-bin-hadoop2.6.tgz

Modify the environment variable: add SPARK_HOME and modify PATH.

vi /etc/profile.d/java.sh

export SPARK_HOME=/application/spark/spark-1.5.2-bin-hadoop2.6

export PATH=$JAVA_HOME/bin:$HADOOP_HOME/bin:$SCALA_HOME/bin:$SPARK_HOME/bin:$PATH

Make the configuration take effect immediately:

source /etc/profile

7. Modify the configuration files

7.1 Modify the spark-env.sh configuration file:

cd /application/spark/spark-1.5.2-bin-hadoop2.6/conf

cp spark-env.sh.template spark-env.sh

vi spark-env.sh

Append the following at the end:

# jdk dir
export JAVA_HOME=/usr/local/jdk
### scala dir
export SCALA_HOME=/application/scala/scala
### the ip of master node of spark
export SPARK_MASTER_IP=192.168.10.1
### the max memory size of worker
export SPARK_WORKER_MEMORY=512m
### hadoop configuration file dir
export HADOOP_CONF_DIR=/application/hadoop/hadoop/etc/hadoop

7.2 Modify the slaves file:

cp slaves.template slaves

vi slaves

Add the following (there may be a default localhost, change it to master):

master

slave01

8. Copy /application/spark and the environment variable file to slave01, then use the source command to make them take effect immediately.

scp -r /application/spark root@slave01:/application/

scp -r /etc/profile.d/java.sh root@slave01:/etc/profile.d/java.sh
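Then, on slave01, make the new variables take effect, just as in step 5:

source /etc/profile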

Change the owner and group:

chown -R hadoop:hadoop /application/spark

9. At this point, the Spark cluster has been built.

10. Start the Spark cluster:

Before starting Spark, you need to start Hadoop's DFS and YARN.
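For example (the Hadoop installation path below is inferred from the HADOOP_CONF_DIR set in spark-env.sh and may differ on your cluster):

/application/hadoop/hadoop/sbin/start-dfs.sh

/application/hadoop/hadoop/sbin/start-yarn.sh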

/ application/spark/spark-1.5.2-bin-hadoop2.6/sbin/start-all.sh

After all the services have started, run jps on the command line; the output is as follows:
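The original screenshot is missing here. On master, jps should list roughly the following process names (each preceded by its process ID), with the exact Hadoop daemons depending on how the Hadoop cluster is laid out:

NameNode
SecondaryNameNode
ResourceManager
Master
Worker
Jps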

Compared with the output when only the Hadoop cluster was running, the additional processes are Master and Worker.

Then enter the following command:

/application/spark/spark-1.5.2-bin-hadoop2.6/bin/spark-shell

When the scala> prompt appears, the shell has started successfully.
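As an optional sanity check before the WordCount example, the bundled SparkPi example can also be submitted from the command line (run-example ships with the binary distribution; the argument 10 is just the number of slices):

/application/spark/spark-1.5.2-bin-hadoop2.6/bin/run-example SparkPi 10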

When you open 192.168.10.1:8080 in the browser, you will see the two Workers, as shown below.

Enter 192.168.10.1:4040 in the browser.

The Spark application UI appears as shown in the figure:
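If no browser is available, the same pages can be probed from the command line (assuming curl is installed); note that port 4040 only responds while an application such as spark-shell is running:

curl http://192.168.10.1:8080

curl http://192.168.10.1:4040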

III. Run the WordCount example:
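The example assumes the input file already exists in HDFS at /data/words2. If it does not, it can be uploaded first; a minimal sketch, assuming a local file named words2 in the current directory:

hdfs dfs -mkdir -p /data

hdfs dfs -put words2 /data/words2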

scala> var textcount = sc.textFile("hdfs://master:9000/data/words2").filter(line => line.contains("")).count()

The computed count is then displayed in the shell.

This concludes the article on "How to Build a Spark 1.5.2 Cluster on Hadoop 2.6.0". I hope the content above is helpful and that you learn something new from it. If you think the article is good, please share it so more people can see it.
