Dependency and jar management with spark-submit
When using spark-submit, the application jar and any jars listed with the --jars option are automatically shipped to the cluster.
spark-submit --class <main-class> --master <master-url> --jars <jar1,jar2,...> <application-jar> [application-arguments]
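A minimal concrete invocation might look like this (the class name, master URL, and paths are illustrative, not from the original article):

spark-submit --class com.example.MyApp --master spark://master-host:7077 --jars /opt/libs/extra-dep.jar /opt/apps/myapp.jar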
Spark supports the following URL schemes in these options, each with a different jar-distribution strategy (see the example after the list):
1. file: scheme
Absolute paths and file:/ URIs are served by the driver's HTTP file server, and every executor pulls the file from the driver's HTTP server.
2. hdfs:, http:, https:, ftp: schemes
Each executor pulls files and jars directly from the given URI.
3. local: scheme
A URI starting with local:/ is expected to exist as a local file on every worker node. This means no network IO is incurred, and it works well for large files or jars that have already been pushed to every worker or shared via NFS, GlusterFS, etc.
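The schemes can be mixed freely within a single --jars list; a sketch, assuming the paths below exist in your environment:

spark-submit --class com.example.MyApp --master spark://master-host:7077 --jars file:/opt/libs/a.jar,hdfs:///shared/libs/b.jar,local:/opt/preinstalled/c.jar /opt/apps/myapp.jar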
Note: the jars and files for each SparkContext are copied to the working directory of every executor node. Over time this consumes a lot of space and needs to be cleaned up. Under YARN, cleanup happens automatically. Under Spark Standalone, automatic cleanup can be enabled with the spark.worker.cleanup.appDataTtl property, which defaults to 7 * 24 * 3600 seconds (seven days).
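On a standalone worker, these properties are passed through SPARK_WORKER_OPTS in spark-env.sh; a sketch with illustrative values (note that cleanup is disabled by default and must be switched on explicitly):

export SPARK_WORKER_OPTS="-Dspark.worker.cleanup.enabled=true -Dspark.worker.cleanup.appDataTtl=604800"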
Users can pass the --packages option a comma-separated list of Maven coordinates to include any additional dependencies. Additional repositories (or resolvers, in SBT terms) can be added with the --repositories option, also comma-separated. Both options work with pyspark, spark-shell, and spark-submit to include Spark packages.
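For example, pulling in a Kafka connector by coordinate while adding an internal mirror (both the coordinate and the repository URL are illustrative):

spark-submit --packages org.apache.spark:spark-streaming-kafka_2.10:1.6.0 --repositories https://repo.example.com/maven /opt/apps/myapp.jar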
For Python, the equivalent --py-files option distributes .egg, .zip, and .py libraries to the executors.
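A sketch of a Python submission (the file names are illustrative):

spark-submit --master yarn --py-files deps.zip,extra_lib.egg,helper.py main.py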
Source code reading (the snippets below are reconstructed from Spark's source; details vary by version):
1、Everything starts in the SparkSubmit object:
object SparkSubmit
2、Its main method parses the command line into SparkSubmitArguments and dispatches on the requested action:
appArgs.action match {
  case SparkSubmitAction.SUBMIT => submit(appArgs)
  case SparkSubmitAction.KILL => kill(appArgs)
  case SparkSubmitAction.REQUEST_STATUS => requestStatus(appArgs)
}
3、submit prepares the launch environment and invokes the resolved main class, possibly as a proxy user:
private def submit(args: SparkSubmitArguments): Unit = {
  // Resolve the child arguments, classpath, system properties and main class
  val (childArgs, childClasspath, sysProps, childMainClass) = prepareSubmitEnvironment(args)

  def doRunMain(): Unit = {
    if (args.proxyUser != null) {
      val proxyUser = UserGroupInformation.createProxyUser(args.proxyUser,
        UserGroupInformation.getCurrentUser())
      proxyUser.doAs(new PrivilegedExceptionAction[Unit]() {
        override def run(): Unit = {
          runMain(childArgs, childClasspath, sysProps, childMainClass, args.verbose)
        }
      })
    } else {
      runMain(childArgs, childClasspath, sysProps, childMainClass, args.verbose)
    }
  }
  // ...
}
4、runMain adds each local jar to the classloader via addJarToClasspath; only file: and local: jars are added here, and remote jars are skipped at this stage:
uri.getScheme match {
  case "file" | "local" =>
    val file = new File(uri.getPath)
    if (file.exists()) {
      loader.addURL(file.toURI.toURL)
    } else {
      printWarning(s"Local jar $file does not exist, skipping.")
    }
  case _ =>
    printWarning(s"Skip remote jar $uri.")
}
From here the trail inside SparkSubmit ends: the main class is loaded through that classloader and its main method is invoked via reflection, so control passes into the application's own jar.
6、On the executor side, dependencies are picked up by Executor.updateDependencies, which fetches any new files and jars before running a task:
private def updateDependencies(newFiles: HashMap[String, Long], newJars: HashMap[String, Long]): Unit = {
  lazy val hadoopConf = SparkHadoopUtil.get.newConfiguration(conf)
  synchronized {
    // Fetch every file whose timestamp is newer than the copy we already have;
    // jars are fetched the same way and then added to the task classloader.
    for ((name, timestamp) <- newFiles if currentFiles.getOrElse(name, -1L) < timestamp) {
      // ... Utils.fetchFile(...) into the executor's working directory
    }
  }
}