I. Overview
1. The experimental environment is based on the previously built Hadoop HA cluster.
2. The ZooKeeper environment required for Spark HA was configured earlier and is not repeated here.
3. The required software packages are scala-2.12.3.tgz and spark-2.2.0-bin-hadoop2.7.tgz (see the download sketch after the host plan below).
4. Host planning
bd1, bd2, bd3: Worker
bd4, bd5: Master, Worker
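The two packages listed in item 3 of the overview can usually be fetched from the official archives; a minimal download sketch (the exact URLs are assumptions based on the usual Apache and Lightbend archive layout, so verify them before use):
cd /root
# Scala 2.12.3 (Lightbend archive; URL assumed)
wget https://downloads.lightbend.com/scala/2.12.3/scala-2.12.3.tgz
# Spark 2.2.0 built for Hadoop 2.7 (Apache archive; URL assumed)
wget https://archive.apache.org/dist/spark/spark-2.2.0/spark-2.2.0-bin-hadoop2.7.tgz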
II. Configure Scala
1. Decompress and copy
[root@bd1 ~]# tar -zxf scala-2.12.3.tgz
[root@bd1 ~]# cp -r scala-2.12.3 /usr/local/scala
2. Configure environment variables
[root@bd1 ~]# vim /etc/profile
export SCALA_HOME=/usr/local/scala
export PATH=$SCALA_HOME/bin:$PATH
[root@bd1 ~]# source /etc/profile
3. Verification
[root@bd1 ~]# scala -version
Scala code runner version 2.12.3 -- Copyright 2002-2017, LAMP/EPFL and Lightbend, Inc.
III. Configure Spark
1. Decompress and copy
[root@bd1 ~]# tar -zxf spark-2.2.0-bin-hadoop2.7.tgz
[root@bd1 ~]# cp -r spark-2.2.0-bin-hadoop2.7 /usr/local/spark
2. Configure environment variables
[root@bd1 ~]# vim /etc/profile
export SPARK_HOME=/usr/local/spark
export PATH=$SPARK_HOME/bin:$PATH
[root@bd1 ~]# source /etc/profile
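A quick way to confirm the Spark environment variables took effect (assuming /etc/profile has been sourced in the current shell):
# should print /usr/local/spark and a Spark 2.2.0 version banner respectively
echo $SPARK_HOME
spark-submit --version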
3. Modify spark-env.sh (the file does not exist by default; copy it from spark-env.sh.template first)
[root@bd1 conf]# vim spark-env.sh
export JAVA_HOME=/usr/local/jdk
export HADOOP_HOME=/usr/local/hadoop
export HADOOP_CONF_DIR=/usr/local/hadoop/etc/hadoop
export SCALA_HOME=/usr/local/scala
export SPARK_DAEMON_JAVA_OPTS="-Dspark.deploy.recoveryMode=ZOOKEEPER -Dspark.deploy.zookeeper.url=bd4:2181,bd5:2181 -Dspark.deploy.zookeeper.dir=/spark"
export SPARK_WORKER_MEMORY=1g
export SPARK_WORKER_CORES=2
export SPARK_WORKER_INSTANCES=1
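Once the masters are running (section V), the recovery state can be inspected directly in ZooKeeper under the znode named by spark.deploy.zookeeper.dir; a minimal check, assuming the ZooKeeper client scripts are on the PATH of a ZooKeeper node:
# connect to one of the ZooKeeper servers listed above
zkCli.sh -server bd4:2181
# inside the client shell: list the Spark recovery znode
ls /spark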
4. Modify spark-defaults.conf (the file does not exist by default; copy it from spark-defaults.conf.template first)
[root@bd1 conf]# vim spark-defaults.conf
spark.master              spark://master:7077
spark.eventLog.enabled    true
spark.eventLog.dir        hdfs://master:/user/spark/history
spark.serializer          org.apache.spark.serializer.KryoSerializer
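With event logging pointed at HDFS, the Spark history server can later be used to browse finished applications; a minimal sketch, assuming the same event-log path as above (run on any node with Spark installed):
# serve the event logs on the default history UI port (18080)
/usr/local/spark/sbin/start-history-server.sh
# the directory it reads can be set explicitly in spark-defaults.conf:
# spark.history.fs.logDirectory    hdfs://master:/user/spark/history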
5. Create the log directory in HDFS
hdfs dfs -mkdir -p /user/spark/history
hdfs dfs -chmod 777 /user/spark/history
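A quick check that the directory exists with the expected permissions (assuming the HDFS client is configured on this node):
hdfs dfs -ls /user/spark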
6. Modify slaves
[root@bd1 conf]# vim slaves
bd1
bd2
bd3
bd4
bd5
IV. Synchronize to other hosts
1. Use scp to synchronize Scala to bd2-bd5
scp -r /usr/local/scala root@bd2:/usr/local/
scp -r /usr/local/scala root@bd3:/usr/local/
scp -r /usr/local/scala root@bd4:/usr/local/
scp -r /usr/local/scala root@bd5:/usr/local/
2. Synchronize Spark to bd2-bd5
scp -r /usr/local/spark root@bd2:/usr/local/
scp -r /usr/local/spark root@bd3:/usr/local/
scp -r /usr/local/spark root@bd4:/usr/local/
scp -r /usr/local/spark root@bd5:/usr/local/
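The /etc/profile changes from sections II and III also have to reach the other hosts before Spark is started there; a minimal sketch, assuming passwordless SSH as root is already in place from the Hadoop HA setup and that /etc/profile is otherwise identical across the nodes:
# push the environment file to every other node; new login shells pick it up automatically
for h in bd2 bd3 bd4 bd5; do
    scp /etc/profile root@$h:/etc/profile
done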
V. Start the cluster and test HA
1. The startup sequence is: ZooKeeper --> Hadoop --> Spark
2. Start Spark
bd4:
[root@bd4 sbin]# cd /usr/local/spark/sbin/
[root@bd4 sbin]# ./start-all.sh
starting org.apache.spark.deploy.master.Master, logging to /usr/local/spark/logs/spark-root-org.apache.spark.deploy.master.Master-1-bd4.out
bd4: starting org.apache.spark.deploy.worker.Worker, logging to /usr/local/spark/logs/spark-root-org.apache.spark.deploy.worker.Worker-1-bd4.out
bd2: starting org.apache.spark.deploy.worker.Worker, logging to /usr/local/spark/logs/spark-root-org.apache.spark.deploy.worker.Worker-1-bd2.out
bd3: starting org.apache.spark.deploy.worker.Worker, logging to /usr/local/spark/logs/spark-root-org.apache.spark.deploy.worker.Worker-1-bd3.out
bd5: starting org.apache.spark.deploy.worker.Worker, logging to /usr/local/spark/logs/spark-root-org.apache.spark.deploy.worker.Worker-1-bd5.out
bd1: starting org.apache.spark.deploy.worker.Worker, logging to /usr/local/spark/logs/spark-root-org.apache.spark.deploy.worker.Worker-1-bd1.out
[root@bd4 sbin]# jps
3153 DataNode
7235 Jps
3046 JournalNode
7017 Master
3290 NodeManager
7116 Worker
2958 QuorumPeerMain
bd5:
[root@bd5 sbin]# ./start-master.sh
starting org.apache.spark.deploy.master.Master, logging to /usr/local/spark/logs/spark-root-org.apache.spark.deploy.master.Master-1-bd5.out
[root@bd5 sbin]# jps
3584 NodeManager
5602 RunJar
3251 QuorumPeerMain
8564 Master
3447 DataNode
8649 Jps
8474 Worker
3340 JournalNode
3. Kill the Master process on bd4 to verify failover (see the check after the output below)
[root@bd4 sbin]# kill -9 7017
[root@bd4 sbin]# jps
3153 DataNode
7282 Jps
3046 JournalNode
3290 NodeManager
7116 Worker
2958 QuorumPeerMain
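To confirm that the standby Master on bd5 has taken over, its web UI or log can be checked; a minimal sketch, assuming the default master UI port 8080 and the bd5 master log path shown in the start-up output above:
# the status shown on the bd5 master UI should switch from STANDBY to ALIVE
curl -s http://bd5:8080 | grep -i -E "status|alive"
# alternatively, watch the bd5 master log for the leader-election message
grep -i "elected leader" /usr/local/spark/logs/spark-root-org.apache.spark.deploy.master.Master-1-bd5.out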
VI. Summary
At first I planned to put the Masters on bd1 and bd2, but when Spark started, both nodes came up as Standby. After changing the configuration to put the Masters on bd4 and bd5, everything ran smoothly. In other words, for Spark HA to work properly, the Master must run on a node that belongs to the ZooKeeper cluster, that is, a node running a ZooKeeper (QuorumPeerMain) process.