Today I will talk about how to deploy a Spark cluster. Many people may not know much about this topic, so I have summarized the following steps for you; I hope you find them useful.
1. Installation environment overview
Hardware environment: two virtual machines, each with a quad-core CPU, 4 GB of memory and a 500 GB hard disk.
Software environment: 64-bit Ubuntu 12.04 LTS. The host names are spark1 and spark2 (IP addresses are 1**.1*.***/**), and the JDK version is 1.7. Hadoop 2.2 has already been deployed successfully on the cluster; for details, see the Yarn installation and deployment article.
2. Install Scala 2.9.3
1) In the /home/test/spark directory, run wget http://www.scala-lang.org/downloads/distribut/files/scala-2.9.3.tgz to download the Scala binary package.
2) Extract the downloaded file and configure the environment variables: edit the /etc/profile file and add the following:
export SCALA_HOME=/home/test/spark/scala/scala-2.9.3
export PATH=$SCALA_HOME/bin:$PATH
3) Run source /etc/profile so the environment variable changes take effect immediately. Repeat the same steps on spark2 to install Scala there as well.
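A quick way to confirm the installation on each node before moving on (a minimal sketch; the version banner is what Scala 2.9.3 reports):

# should print the Scala 2.9.3 version banner on both spark1 and spark2
scala -version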
3. Download the pre-built Spark package from http://d3kbcqa49mib13.cloudfront.net/spark-0.8.1-incubating-bin-hadoop2.tgz, then extract it.
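A minimal sketch of this step, assuming the same /home/test/spark working directory used for Scala:

cd /home/test/spark
# download the pre-built Spark 0.8.1 package for Hadoop 2
wget http://d3kbcqa49mib13.cloudfront.net/spark-0.8.1-incubating-bin-hadoop2.tgz
# extract it in place, producing the spark-0.8.1-incubating-bin-hadoop2 directory
tar -xzf spark-0.8.1-incubating-bin-hadoop2.tgz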
4. Configure environment variables in conf/spark-env.sh by adding the following:
export SCALA_HOME=/home/test/spark/scala/scala-2.9.3
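Only SCALA_HOME is strictly required for this tutorial; the template shipped with Spark 0.8.x also lists optional settings such as JAVA_HOME and worker resource limits. A sketch of a slightly fuller conf/spark-env.sh, with illustrative values you would adapt to your own machines:

export SCALA_HOME=/home/test/spark/scala/scala-2.9.3
# optional: point Spark at the JDK used by the cluster (path is an example)
export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-amd64
# optional: cap the memory each worker offers to executors (example value)
export SPARK_WORKER_MEMORY=2g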
5. Configure the SPARK_EXAMPLES_JAR and Spark environment variables in /etc/profile by adding the following:
export SPARK_EXAMPLES_JAR=/home/test/spark/spark-0.8.1-incubating-bin-hadoop2/examples/target/scala-2.9.3/spark-examples_2.9.3-assembly-0.8.1-incubating.jar
export SPARK_HOME=/home/test/spark/spark-0.8.1-incubating-bin-hadoop2
export PATH=$SPARK_HOME/bin:$PATH
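As before, the profile changes only apply once the file is re-read; a quick hedged check:

# re-read the profile and confirm the new variables are visible
source /etc/profile
echo $SPARK_HOME
echo $SPARK_EXAMPLES_JAR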
6. Modify the conf/slaves file and add the following:
spark1
spark2
7. Use scp to copy the Spark directory above to the same path on the spark2 node: scp -r spark-0.8.1-incubating-bin-hadoop2 test@spark2:/home/test/spark
8. Start the Spark cluster on spark1 and check whether the processes started successfully. As shown below, the Master and Worker have been launched successfully.
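The article does not spell out the start command; a minimal sketch, assuming the Spark 0.8.x layout in which the standalone cluster scripts live under bin/ of the extracted directory:

cd /home/test/spark/spark-0.8.1-incubating-bin-hadoop2
# start the Master on this node and a Worker on every host listed in conf/slaves
./bin/start-all.sh
# jps should now list a Master process on spark1 and Worker processes on both nodes
jps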
Open the Master web UI at http://<master-IP>:8080/ (the IP address is redacted here); it is displayed as follows:
You can see that two slave nodes in the cluster have been successfully started.
9. Run one of the bundled Spark examples: ./run-example org.apache.spark.examples.SparkPi spark://master:7077. The result is as follows:
You can see the job that just ran in the web interface, as follows:
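Beyond the packaged examples, you can also attach an interactive shell to the cluster. A sketch, assuming the Spark 0.8.x convention of passing the master URL through the MASTER environment variable (adjust the host name to your own master):

cd /home/test/spark/spark-0.8.1-incubating-bin-hadoop2
# open a Scala shell connected to the standalone master started above
MASTER=spark://spark1:7077 ./spark-shell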
After reading the above, do you have a better understanding of how to deploy a Spark cluster? If you want to learn more, please follow the industry information channel; thank you for your support.