Updated 2025-04-01. Source: Shulou (Shulou.com), SLTechnology News & Howtos.
This article explains how to make the Driver and Executor in Spark2 use ports within a specified range.
1. Purpose
When Spark jobs are submitted in a CDH cluster, the ports used for communication between the Spark Driver and Executors are random by default: Spark picks ports between 1024 and 65535 (inclusive). For that reason, enabling firewalls between cluster nodes is normally not recommended. In this article, Fayson describes how to make the Driver and Executors of a Spark2 job communicate over ports within a specified range.
Content Overview
1. Configure the Spark Driver and Executor port range
2. Verify port assignment
Test environment
1. CM and CDH version 5.15
2. Spark version 2.2.0
2. Configure the Spark Driver and Executor port range
1. Log in to the CM management interface and open the Spark service configuration page
2. In the Gateway category, search for "spark-defaults.conf" and add the following configuration:
spark.driver.port=10000
spark.blockManager.port=20000
spark.port.maxRetries=999
3. Save the configuration and redeploy the client configuration for Spark2
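The three settings above define port ranges rather than single ports: according to the Spark configuration documentation, when a port is set to a specific value, each retry increments it by 1, so Spark may try every port from the start port up to the start port plus spark.port.maxRetries. The following minimal Python sketch computes the ranges implied by the values in this article, for example to prepare firewall rules. The helper name port_range is hypothetical and not part of Spark.

```python
def port_range(start_port, max_retries):
    """Return (lowest, highest) port Spark may try for one service.

    Assumes the documented retry rule: start at start_port and add 1 per
    retry, for at most max_retries retries.
    """
    if not 1024 <= start_port <= 65535:
        raise ValueError("start_port must be between 1024 and 65535")
    return start_port, start_port + max_retries


# Values taken from the spark-defaults.conf snippet in this article.
settings = {
    "spark.driver.port": 10000,
    "spark.blockManager.port": 20000,
}
MAX_RETRIES = 999  # spark.port.maxRetries

ranges = {name: port_range(port, MAX_RETRIES) for name, port in settings.items()}
for name, (low, high) in ranges.items():
    print(f"{name}: {low}-{high}")
# spark.driver.port: 10000-10999
# spark.blockManager.port: 20000-20999
```

With these values, a firewall would need to allow 10000-10999 and 20000-20999 between the hosts that run drivers and executors.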
3. Verify port assignment
1. Submit a Spark2 job to the cluster
spark2-submit --class org.apache.spark.examples.SparkPi \
  --master yarn --num-executors 4 --driver-memory 1g \
  --driver-cores 1 --executor-memory 1g --executor-cores 1 \
  /opt/cloudera/parcels/SPARK2/lib/spark2/examples/jars/spark-examples_2.11-2.2.0.cloudera2.jar 10000
2. Open the web UI of the running Spark job and check the port numbers used by the Driver and Executors
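Once the port numbers have been read off the UI, they can be checked against the configured ranges. The sketch below is one hedged way to do that in Python. The function outside_range is hypothetical, and the observed port lists are made-up illustration values except for 10001 and 20000, which are the driver ports this article reports.

```python
def outside_range(observed_ports, start_port, max_retries):
    """Return the observed ports that fall outside
    start_port .. start_port + max_retries (inclusive)."""
    high = start_port + max_retries
    return [p for p in observed_ports if not start_port <= p <= high]


# Port numbers as they might be copied from the job's web UI.
# 10001 and 20000 come from the article; the others are invented examples.
driver_ports = [10001]
block_manager_ports = [20000, 20003, 41877]

print(outside_range(driver_ports, 10000, 999))         # []
print(outside_range(block_manager_ports, 20000, 999))  # [41877]
```

An empty list means every observed port lies inside the configured range. A non-empty list, as with the invented 41877, would suggest that the client configuration was not redeployed or that the job overrode the setting.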
4. Summary
In this article, Fayson uses Spark2 as an example to show how to restrict the Driver and Executors to port numbers within a specified range. Attentive readers will notice that the Driver listens on two ports when it starts. In the example they are 10001 and 20000. These ports, and the related behavior, are explained below:
1. spark.driver.port (10001 in the example) listens for requests from executors. When an executor starts, it has to contact the driver to obtain its task information. This is the driver's management and scheduling port.
2. spark.blockManager.port (20000 in the example) is used for direct data transfer between the driver and the executors (for example cached DataFrames and broadcast variables).
3. While Spark is running, the blockManager does not interact with YARN, whereas the driver does interact with the ApplicationMaster process running in YARN.
4. In Spark2, the Executor port is set through spark.blockManager.port, unlike Spark1, which used the spark.executor.port parameter.
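The driver port in the example is 10001 even though 10000 was configured. The article does not say why. One plausible cause is that port 10000 was already in use on the driver host, so Spark retried on the next port. The Python sketch below is a simplified model of that retry rule, not Spark's actual implementation (which also handles wrap-around past 65535). The function first_free_port and the occupied set are assumptions made for illustration.

```python
def first_free_port(start_port, max_retries, occupied):
    """Simplified model of Spark's port retry: try start_port first,
    then add 1 for each retry, up to max_retries retries."""
    for offset in range(max_retries + 1):
        candidate = start_port + offset
        if candidate not in occupied:
            return candidate
    raise RuntimeError(
        f"no free port in {start_port}-{start_port + max_retries}"
    )


# Assumption: another service on the driver host already holds port 10000.
occupied = {10000}

driver_port = first_free_port(10000, 999, occupied)
block_manager_port = first_free_port(20000, 999, occupied)
print(driver_port, block_manager_port)  # 10001 20000
```

Under this assumption the model reproduces the two ports reported above. It also shows why spark.port.maxRetries matters: with a small value and many concurrent jobs on one host, the range can be exhausted and a job fails to start.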