
Spark + Scala installation and configuration in Docker


1. Scala installation

First, download the Scala package:

wget https://downloads.lightbend.com/scala/2.11.7/scala-2.11.7.tgz

Decompress it:

tar -zxvf scala-2.11.7.tgz

Move the directory:

mv scala-2.11.7 /usr/local/

Rename it:

cd /usr/local/

mv scala-2.11.7 scala

Configure environment variables

vim /etc/profile

export SCALA_HOME=/usr/local/scala

export PATH=$PATH:$SCALA_HOME/bin

Make the environment variables take effect:

source /etc/profile

Check the Scala version:

scala -version

Distribute Scala to the other hosts:

scp -r /usr/local/scala/ root@Master:/usr/local/

scp -r /usr/local/scala/ root@Slave2:/usr/local/

2. Spark installation

Copy the Spark package into the container:

docker cp /root/spark-2.1.2-bin-hadoop2.4.tgz b0c77:/

View the package inside the container and decompress it (a sketch of the likely commands follows).
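The original does not list these commands; a minimal sketch, assuming the archive sits at / inside the container and Spark should end up at /usr/local/spark (the path used by the rest of this guide):

ls / | grep spark

tar -zxvf /spark-2.1.2-bin-hadoop2.4.tgz -C /usr/local/

mv /usr/local/spark-2.1.2-bin-hadoop2.4 /usr/local/spark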

Add the Spark environment variables to /etc/profile, for example:
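The exact lines are not shown in the original; a minimal sketch, assuming Spark lives at /usr/local/spark as above:

export SPARK_HOME=/usr/local/spark

export PATH=$PATH:$SPARK_HOME/bin:$SPARK_HOME/sbin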

Make the environment variables take effect:

source /etc/profile

Edit spark-env.sh

vim /usr/local/spark/conf/spark-env.sh

JAVA_HOME: Java installation directory
SCALA_HOME: Scala installation directory
HADOOP_HOME: Hadoop installation directory
HADOOP_CONF_DIR: configuration file directory of the Hadoop cluster
SPARK_MASTER_IP: IP address of the Master node of the Spark cluster
SPARK_WORKER_MEMORY: maximum amount of memory each worker node can allocate to executors
SPARK_WORKER_CORES: number of CPU cores per worker node
SPARK_WORKER_INSTANCES: number of worker instances started on each machine
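For illustration, a minimal spark-env.sh might look like the following; the JDK and Hadoop paths here are assumptions, the Master IP matches the 172.17.0.2 address used later in this article, and the memory/core settings should be tuned to your machines:

export JAVA_HOME=/usr/local/jdk1.8.0
export SCALA_HOME=/usr/local/scala
export HADOOP_HOME=/usr/local/hadoop
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export SPARK_MASTER_IP=172.17.0.2
export SPARK_WORKER_MEMORY=1g
export SPARK_WORKER_CORES=1
export SPARK_WORKER_INSTANCES=1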

Modify the slaves file

cp slaves.template slaves

vi conf/slaves
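The slaves file simply lists the worker hostnames, one per line; a sketch assuming the Master/Slave1/Slave2 naming used elsewhere in this article:

Slave1
Slave2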

Distribute Spark to the other hosts:

scp -r /usr/local/spark/ Master:/usr/local

scp -r /usr/local/spark/ Slave2:/usr/local

The other two nodes also need to make the same changes to /etc/profile.

Start Spark:

./sbin/start-all.sh

After a successful start, running jps shows the newly started Master and Worker processes on the Master, Slave1, and Slave2 nodes, respectively.
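For illustration only (process IDs are arbitrary), jps output might look like this on the Master node:

1234 Master

and like this on Slave1 and Slave2:

5678 Worker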

Once the Spark cluster is up, you can open Spark's web UI at:

SparkMaster_IP:8080

Port Mapping:

iptables -t nat -A DOCKER -p tcp --dport 8080 -j DNAT --to-destination 172.17.0.2:8080

At this point we can access the UI through the port mapped onto the host and see that two Worker nodes are running.

Open Spark-shell

Launch the shell with:

spark-shell

The command to exit spark-shell is ":quit".
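As a quick check (not part of the original article), you can run a trivial job inside spark-shell so that the 4040 UI mentioned below has something to display; sc is the SparkContext that spark-shell creates automatically:

scala> sc.parallelize(1 to 1000).reduce(_ + _)
res0: Int = 500500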

Because the shell is running, we can also use

SparkMaster_IP:4040 (172.17.0.2:4040)

to visit the web UI and view the tasks currently being executed.

Do the port mapping first:

iptables -t nat -A DOCKER -p tcp --dport 4040 -j DNAT --to-destination 172.17.0.2:4040
