The previous post (building a Spark distributed cluster: https://blog.51cto.com/14048416/2327802) set up an ordinary Spark distributed cluster, which has a single point of failure at the Master node. Starting with Hadoop 2.x, ZooKeeper has been used to eliminate single points of failure, and Spark follows the same strategy: it uses ZooKeeper to remove the single point of failure in a Spark cluster.
1. Cluster planning (three machines are used here for testing)
2. Specific setup steps:
① If a Spark distributed cluster is already running, stop it manually first:
$SPARK_HOME/sbin/stop-all.sh
② Configure and start the ZooKeeper cluster: https://blog.51cto.com/14048416/2336178
③ Modify the spark-env.sh configuration file in the $SPARK_HOME/conf directory
If a distributed cluster was configured previously, delete:
export SPARK_MASTER_HOST=xxx
and add:
export SPARK_DAEMON_JAVA_OPTS="-Dspark.deploy.recoveryMode=ZOOKEEPER -Dspark.deploy.zookeeper.url=hadoop01,hadoop02,hadoop03 -Dspark.deploy.zookeeper.dir=/spark"
Explanation of relevant parameters:
-Dspark.deploy.recoveryMode=ZOOKEEPER: indicates that the state of the whole cluster is maintained through ZooKeeper, and that recovery of the cluster state also goes through ZooKeeper. In other words, when ZooKeeper is used for Spark HA and the active Master dies, the standby Master must first read the complete cluster state from ZooKeeper before it can become active, and then restore the state of all Workers, Drivers, and Applications.
-Dspark.deploy.zookeeper.url: lists all machines that run ZooKeeper and may therefore act as the active Master.
-Dspark.deploy.zookeeper.dir: the znode in ZooKeeper under which Spark stores its metadata, i.e. the running state of Spark jobs. ZooKeeper keeps all the state information of the Spark cluster, including all Worker, Application, and Driver information, so that the cluster state can be recovered.
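To see that the Masters really do write their recovery data to ZooKeeper, the /spark znode can be inspected with the ZooKeeper command-line client once the cluster is running. A minimal sketch, assuming ZooKeeper's bin directory is on the PATH (the exact child znodes depend on the Spark version):
zkCli.sh -server hadoop01:2181
# inside the ZooKeeper CLI:
ls /spark        # Spark's recovery data lives here, typically child znodes such as leader_election and master_status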
④ If the Hadoop cluster is a high-availability cluster, be sure to place core-site.xml and hdfs-site.xml in the $SPARK_HOME/conf directory, and then synchronize them to all Spark nodes.
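A minimal sketch of step ④, assuming the Hadoop configuration files live under $HADOOP_HOME/etc/hadoop:
cp $HADOOP_HOME/etc/hadoop/core-site.xml $HADOOP_HOME/etc/hadoop/hdfs-site.xml $SPARK_HOME/conf/
scp $SPARK_HOME/conf/core-site.xml $SPARK_HOME/conf/hdfs-site.xml hadoop02:$SPARK_HOME/conf/
scp $SPARK_HOME/conf/core-site.xml $SPARK_HOME/conf/hdfs-site.xml hadoop03:$SPARK_HOME/conf/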
⑤ Synchronize the configuration file:
Since Spark has already been installed on every node, only spark-env.sh needs to be synchronized (installing the Spark distributed cluster: https://blog.51cto.com/14048416/2327802):
scp $SPARK_HOME/conf/spark-env.sh hadoop02:$SPARK_HOME/conf/
scp $SPARK_HOME/conf/spark-env.sh hadoop03:$SPARK_HOME/conf/
⑥ Start the cluster:
$SPARK_HOME/sbin/start-all.sh
Note:
At this point, by looking at the startup log or checking whether a Master process is running on hadoop02, you will see that the Master on hadoop02 does not start automatically; it has to be started manually. Run the following on hadoop02: $SPARK_HOME/sbin/start-master.sh
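A quick way to check and start it, assuming the jps tool from the JDK is available on hadoop02:
[hadoop@hadoop02 ~]$ jps                                  # no Master process listed yet
[hadoop@hadoop02 ~]$ $SPARK_HOME/sbin/start-master.sh
[hadoop@hadoop02 ~]$ jps                                  # a Master process should now appear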
⑦ Verify the high availability of the cluster:
Active Master node, hadoop01:
Standby Master node, hadoop02:
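One way to exercise the failover, sketched under the assumption that the Master web UIs run on the default port 8080:
# on hadoop01, stop (or kill) the active Master
$SPARK_HOME/sbin/stop-master.sh
# then refresh http://hadoop02:8080; once the ZooKeeper session times out,
# hadoop02's status should switch from STANDBY to ALIVE and the Workers re-register with it.
# Running start-master.sh on hadoop01 again brings it back as the new standby.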
2. Permanently configure the log level of the Spark cluster:
When a Spark program runs, a large amount of INFO-level log output floods the console and drowns out the output we actually care about. The log level can be changed permanently by modifying Spark's log configuration file.
Specific steps:
Enter: cd $SPARK_HOME/conf
Prepare log4j.properties: cp log4j.properties.template log4j.properties
Configure the log level:
Change INFO to the level you want; the main levels are ERROR, WARN, INFO, and DEBUG (see the sketch below).
Then restart the cluster.
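For example, in the Spark 2.x template the level is set by the log4j.rootCategory line; a minimal sketch of the change, assuming the default template:
# in $SPARK_HOME/conf/log4j.properties
log4j.rootCategory=WARN, console
After editing, synchronize log4j.properties to the other nodes (e.g. with scp, as was done for spark-env.sh) before restarting.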
3. Basic use of the Spark shell:
(1) Use the example program that ships with Spark to run a program that estimates Pi (Monte Carlo algorithm):
$SPARK_HOME/bin/spark-submit \
--class org.apache.spark.examples.SparkPi \
--master spark://hadoop01:7077,hadoop02:7077 \
--executor-memory 512m \
--total-executor-cores 3 \
$SPARK_HOME/examples/jars/spark-examples_2.11-2.3.0.jar \
100
(2) Start the Spark shell
Start local mode: [hadoop@hadoop01 ~]$ spark-shell
Start cluster mode:
$SPARK_HOME/bin/spark-shell \
--master spark://hadoop01:7077,hadoop02:7077 \
--executor-memory 512m \
--total-executor-cores 2
Here --master specifies the Master addresses, --executor-memory gives each executor 512 MB of memory, and --total-executor-cores limits the whole application to 2 CPU cores. A short usage example is sketched after the notes below.
Note:
--executor-memory must not exceed the memory available on the cluster's nodes.
--total-executor-cores should not exceed the total number of CPU cores the Spark cluster can provide; otherwise all available cores will be used. It is best not to take all of them, or other programs will fail to run because no CPU cores are left.
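Once the shell is up, a small computation is enough to confirm that the executors are doing the work. A minimal sketch (any small job will do; the sum of 1 to 1000 is 500500):
[hadoop@hadoop01 ~]$ spark-shell --master spark://hadoop01:7077,hadoop02:7077
scala> sc.parallelize(1 to 1000).sum()   // distributed over the executors, returns 500500.0
scala> :quit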
(3) Dealing with insufficient memory resources:
If YARN is used as the underlying resource manager:
Modify yarn-site.xml:
Add the following properties:
<property>
    <name>yarn.nodemanager.vmem-check-enabled</name>
    <value>false</value>
    <description>Whether virtual memory limits will be enforced for containers</description>
</property>
<property>
    <name>yarn.nodemanager.vmem-pmem-ratio</name>
    <value>4</value>
    <description>Ratio between virtual memory to physical memory when setting memory limits for containers</description>
</property>
Then restart yarn.
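A minimal sketch of that restart, assuming the standard Hadoop sbin scripts and that the modified yarn-site.xml also needs to be pushed to the other nodes:
scp $HADOOP_HOME/etc/hadoop/yarn-site.xml hadoop02:$HADOOP_HOME/etc/hadoop/
scp $HADOOP_HOME/etc/hadoop/yarn-site.xml hadoop03:$HADOOP_HOME/etc/hadoop/
$HADOOP_HOME/sbin/stop-yarn.sh
$HADOOP_HOME/sbin/start-yarn.sh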