This article covers how to install and configure Spark on CentOS 7. Many people run into trouble with these steps in practice, so the walkthrough below goes through them one by one; I hope you read it carefully and get something out of it.
Environment description:
Operating system: CentOS 7 64-bit, 3 machines
centos7-1 192.168.190.130 master
centos7-2 192.168.190.129 slave1
centos7-3 192.168.190.131 slave2
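The slaves file configured later refers to the nodes by host name, so each machine needs to be able to resolve master, slave1 and slave2. A minimal sketch of the /etc/hosts entries, assuming no DNS is available (add them on all three machines):
192.168.190.130 master
192.168.190.129 slave1
192.168.190.131 slave2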
Installing Spark also requires the following to be installed:
JDK
Scala
1. Install the JDK and configure the JDK environment variables
How to install and configure the JDK is not covered in detail here; search for it yourself if needed (a minimal sketch follows below).
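Since the article skips this step, here is a sketch for CentOS 7, assuming the OpenJDK 1.8 packages from the base repository are acceptable; the exact directory under /usr/lib/jvm varies by build, so check it before setting JAVA_HOME:
# yum install -y java-1.8.0-openjdk java-1.8.0-openjdk-devel
# vim /etc/profile    // add the following, pointing JAVA_HOME at the actual JDK directory
export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk
export PATH=$PATH:$JAVA_HOME/bin
# source /etc/profile
# java -version    // verify the installation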
2. Install Scala
Download the Scala installation package (pick a version that meets your requirements), upload it to the server with a client tool, and decompress it:
# tar -zxvf scala-2.13.0-M4.tgz
Then edit /etc/profile and add the following:
export SCALA_HOME=$WORK_SPACE/scala-2.13.0-M4
export PATH=$PATH:$SCALA_HOME/bin
# source /etc/profile    // make it take effect immediately
# scala -version    // check whether Scala is installed correctly
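The $WORK_SPACE variable above comes from the author's environment and is not defined in this article; a sketch of the same two lines, assuming Scala was unpacked to /usr/scala (the path used later in spark-env.sh):
export SCALA_HOME=/usr/scala/scala-2.13.0-M4
export PATH=$PATH:$SCALA_HOME/bin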
3. Install Spark
Note: several package variants are available for download; choose the one that fits your setup.
Source code: the Spark source code, which must be compiled before it can be used. In addition, using it with Scala 2.11 requires building from source.
Pre-built with user-provided Hadoop: the "Hadoop free" build, which can be used with any Hadoop version.
Pre-built for Hadoop 2.7 and later: a build pre-compiled against Hadoop 2.7, which should match the Hadoop version installed locally (a Hadoop 2.6 build is also available). Because the Hadoop installed here is 3.1.0, I install the "for Hadoop 2.7 and later" package directly.
Note: for the Hadoop installation, see my earlier post; it is not repeated here.
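Before choosing a package, it is worth confirming which Hadoop version is actually installed; a quick check, assuming hadoop is already on the PATH:
# hadoop version    // the first line of output shows the installed version, which is 3.1.0 in this setup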
Install and configure Spark under CentOS 7:
# mkdir /usr/spark
# cd /usr/spark
# tar -zxvf spark-2.3.1-bin-hadoop2.7.tgz
# vim /etc/profile    // add the Spark environment variables below PATH and export them (see the sketch after this block)
# source /etc/profile
Enter the conf directory, copy spark-env.sh.template and rename the copy spark-env.sh:
# cd /usr/spark/spark-2.3.1-bin-hadoop2.7/conf
# cp spark-env.sh.template spark-env.sh
# vim spark-env.sh
export SCALA_HOME=/usr/scala/scala-2.13.0-M4
export JAVA_HOME=/usr/lib/jvm/jre-1.8.0-openjdk-1.8.0.171-8.b10.el7_5.x86_64
export HADOOP_HOME=/usr/hadoop/hadoop-3.1.0
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export SPARK_HOME=/usr/spark/spark-2.3.1-bin-hadoop2.7
export SPARK_MASTER_IP=master
export SPARK_EXECUTOR_MEMORY=1g
Still in the conf directory, copy slaves.template, rename the copy slaves, and add the node host names to it:
# cd /usr/spark/spark-2.3.1-bin-hadoop2.7/conf
# cp slaves.template slaves
# vim slaves
master    // the host name of centos7-1
slave1    // the host name of centos7-2
slave2    // the host name of centos7-3
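The /etc/profile additions for Spark are only mentioned in passing above; a minimal sketch, assuming Spark was unpacked to /usr/spark/spark-2.3.1-bin-hadoop2.7 as shown:
export SPARK_HOME=/usr/spark/spark-2.3.1-bin-hadoop2.7
export PATH=$PATH:$SPARK_HOME/bin:$SPARK_HOME/sbin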
Start Spark
Start the Hadoop nodes before starting Spark:
# cd /usr/hadoop/hadoop-3.1.0/
# sbin/start-all.sh
# jps    // check whether the Hadoop processes have started
# cd /usr/spark/spark-2.3.1-bin-hadoop2.7
# sbin/start-all.sh
Note: Spark must also be installed on the slave1 and slave2 nodes in the same way as above, or you can copy the directory straight to slave1 and slave2 (replace slave1ip with the node's IP or host name):
# scp -r /usr/spark root@slave1ip:/usr/spark
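After both start-all.sh scripts have run, jps can be used to confirm that the Spark daemons are up alongside the Hadoop ones. A sketch of what to expect (the standalone daemons appear as Master and Worker; since master is also listed in the slaves file, it runs a Worker too):
# jps    // on master: a Master process plus a Worker
# jps    // on slave1 and slave2: a Worker process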
The startup information is as follows:
starting org.apache.spark.deploy.master.Master, logging to /usr/spark/logs/spark-root-org.apache.spark.deploy.master.Master-1-master.out
slave2: starting org.apache.spark.deploy.worker.Worker, logging to /usr/spark/logs/spark-root-org.apache.spark.deploy.worker.Worker-1-slave2.com.cn.out
slave1: starting org.apache.spark.deploy.worker.Worker, logging to /usr/spark/logs/spark-root-org.apache.spark.deploy.worker.Worker-1-slave1.com.cn.out
master: starting org.apache.spark.deploy.worker.Worker, logging to /usr/spark/logs/spark-root-org.apache.spark.deploy.worker.Worker-1-master.out
Test the Spark cluster:
Open the Spark cluster's web UI on the master node in a browser.
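The article does not give the URL; by default the standalone master web UI listens on port 8080, so a sketch of the check, assuming the host name master resolves from the machine running the browser:
Open http://master:8080/ (or http://192.168.190.130:8080/) in a browser; the page should list the three workers. A quick functional test is to attach a shell to the cluster (7077 is the default master port):
# /usr/spark/spark-2.3.1-bin-hadoop2.7/bin/spark-shell --master spark://master:7077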
That concludes "how to install and configure Spark under CentOS 7". Thank you for reading.