How to Build a Spark Eclipse Development Environment


This article covers the basics of how to build a Spark Eclipse development environment. Many people run into difficulties with this in real-world work, so let the editor walk you through how to handle these situations. I hope you will read it carefully and get something out of it!

Building the Spark Eclipse development environment

1 Install the Spark environment

First, download the pre-built Spark package that matches the cluster's Hadoop version, extract it to the desired location, and pay attention to file ownership and permissions.

Enter the unzipped SPARK_HOME directory

Configure SPARK_HOME in /etc/profile or ~/.bashrc

cd $SPARK_HOME/conf
cp spark-env.sh.template spark-env.sh

vim spark-env.sh

export SCALA_HOME=/home/hadoop/cluster/scala-2.10.5
export JAVA_HOME=/home/hadoop/cluster/jdk1.7.0_79
export HADOOP_HOME=/home/hadoop/cluster/hadoop-2.6.0
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
# Note: this must be specified as an IP address, otherwise Eclipse will report
# "All masters are unresponsive! Giving up." when it connects.
SPARK_MASTER_IP=10.16.112.121
SPARK_LOCAL_DIRS=/home/hadoop/cluster/spark-1.4.0-bin-hadoop2.6
SPARK_DRIVER_MEMORY=1G

Start Spark:

sbin/start-master.sh
sbin/start-slave.sh

2 Standalone mode

At this point, you can enter: http://yourip:8080 in the browser to view the Spark cluster.

The default Spark-Master at this time is: spark://10.16.112.121:7077

3 Using the Scala IDE for Eclipse and Maven to build the Spark development environment

First, download the Scala IDE for Eclipse from the Scala official website.

Open the IDE, create a new Maven project, and fill in pom.xml as follows:

<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>
  <groupId>spark.test</groupId>
  <artifactId>FirstTrySpark</artifactId>
  <version>0.0.1-SNAPSHOT</version>

  <properties>
    <hadoop.version>2.6.0</hadoop.version>
    <spark.version>1.4.0</spark.version>
  </properties>

  <dependencies>
    <dependency>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-client</artifactId>
      <version>${hadoop.version}</version>
      <scope>provided</scope>
      <exclusions>
        <exclusion>
          <groupId>javax.servlet</groupId>
          <artifactId>*</artifactId>
        </exclusion>
      </exclusions>
    </dependency>
    <dependency>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-common</artifactId>
      <version>2.6.0</version>
    </dependency>
    <dependency>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-mapreduce-client-jobclient</artifactId>
      <version>2.6.0</version>
    </dependency>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-core_2.10</artifactId>
      <version>${spark.version}</version>
    </dependency>
  </dependencies>

  <build>
    <sourceDirectory>src/main/java</sourceDirectory>
    <plugins>
      <plugin>
        <groupId>net.alchim31.maven</groupId>
        <artifactId>scala-maven-plugin</artifactId>
        <version>3.2.0</version>
        <executions>
          <execution>
            <goals>
              <goal>compile</goal>
              <goal>testCompile</goal>
            </goals>
          </execution>
        </executions>
        <configuration>
          <scalaVersion>2.10</scalaVersion>
        </configuration>
      </plugin>
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-assembly-plugin</artifactId>
        <version>2.5.5</version>
        <configuration>
          <descriptorRefs>
            <descriptorRef>jar-with-dependencies</descriptorRef>
          </descriptorRefs>
        </configuration>
        <executions>
          <execution>
            <phase>package</phase>
            <goals>
              <goal>single</goal>
            </goals>
          </execution>
        </executions>
      </plugin>
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-compiler-plugin</artifactId>
        <configuration>
          <source>1.7</source>
          <target>1.7</target>
        </configuration>
      </plugin>
    </plugins>
    <resources>
      <resource>
        <directory>src/main/resources</directory>
      </resource>
    </resources>
  </build>
</project>

Create several new source folders:

src/main/java       # Java code
src/main/scala      # Scala code
src/main/resources  # resource files
src/test/java       # Java test code
src/test/scala      # Scala test code
src/test/resources  # test resource files

At this time, the environment is all set up!

4 Write test code to check that you can connect successfully

The test code is as follows:

import org.apache.spark.SparkConf
import org.apache.spark.SparkContext

/**
 * @author clebeg
 */
object FirstTry {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf
    conf.setMaster("spark://yourip:7077")
    conf.set("spark.app.name", "first-tryspark")

    val sc = new SparkContext(conf)
    val rawblocks = sc.textFile("hdfs://yourip:9000/user/hadoop/linkage")
    println(rawblocks.first)
  }
}

5 Summary of common errors

Most of the problems have already been mentioned above, so they are not repeated here; the following are the main remaining issues:

Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources

To analyze the problem, click on the log for the corresponding running application ID, where the following error appears:

15/10/10 08:49:01 INFO executor.CoarseGrainedExecutorBackend: Registered signal handlers for [TERM, HUP, INT]
15/10/10 08:49:01 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
15/10/10 08:49:02 INFO spark.SecurityManager: Changing view acls to: hadoop,Administrator
15/10/10 08:49:02 INFO spark.SecurityManager: Changing modify acls to: hadoop,Administrator
15/10/10 08:49:02 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(hadoop, Administrator); users with modify permissions: Set(hadoop, Administrator)
15/10/10 08:49:02 INFO slf4j.Slf4jLogger: Slf4jLogger started
15/10/10 08:49:02 INFO Remoting: Starting remoting
15/10/10 08:49:02 INFO Remoting: Remoting started; listening on addresses: [akka.tcp://driverPropsFetcher@10.16.112.121:58708]
15/10/10 08:49:02 INFO util.Utils: Successfully started service 'driverPropsFetcher' on port 58708.
Exception in thread "main" java.lang.reflect.UndeclaredThrowableException
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1643)
    at org.apache.spark.deploy.SparkHadoopUtil.runAsSparkUser(SparkHadoopUtil.scala:65)
    at org.apache.spark.executor.CoarseGrainedExecutorBackend$.run(CoarseGrainedExecutorBackend.scala:146)
    at org.apache.spark.executor.CoarseGrainedExecutorBackend$.main(CoarseGrainedExecutorBackend.scala:245)
    at org.apache.spark.executor.CoarseGrainedExecutorBackend.main(CoarseGrainedExecutorBackend.scala)
Caused by: java.util.concurrent.TimeoutException: Futures timed out after [120 seconds]
    at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219)
    at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223)
    at scala.concurrent.Await$$anonfun$result$1.apply(package.scala:107)
    at scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53)
    at scala.concurrent.Await$.result(package.scala:107)
    at org.apache.spark.rpc.RpcEnv.setupEndpointRefByURI(RpcEnv.scala:97)
    at org.apache.spark.executor.CoarseGrainedExecutorBackend$$anonfun$run$1.apply$mcV$sp(CoarseGrainedExecutorBackend.scala:159)
    at org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:66)
    at org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:65)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
    ... 4 more
15/10/10 08:51:02 INFO util.Utils: Shutdown hook called

A closer look shows that this turned out to be a permission problem. Stop Hadoop immediately and add the following property (in core-site.xml):

<property>
    <name>hadoop.security.authorization</name>
    <value>false</value>
</property>

With authorization disabled, any user can access the cluster, and the problem is solved immediately.
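As an aside that goes beyond the original write-up: the same "Initial job has not accepted any resources" message also appears when the application simply asks for more memory or cores than the standalone workers advertise. Below is a minimal sketch, using hypothetical limits for a small test cluster, of capping the request through SparkConf; spark.executor.memory and spark.cores.max are standard Spark configuration keys, but the values here are placeholders to adjust to your own workers.

import org.apache.spark.SparkConf
import org.apache.spark.SparkContext

object ResourceCappedTry {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setMaster("spark://yourip:7077")
      .setAppName("resource-capped-try")
      // Hypothetical caps: ask for no more than the workers actually offer
      .set("spark.executor.memory", "512m") // memory per executor
      .set("spark.cores.max", "2")          // total cores for this application
    val sc = new SparkContext(conf)
    println(sc.parallelize(1 to 100).count()) // quick smoke test
    sc.stop()
  }
}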

java.io.IOException: Could not locate executable null\bin\winutils.exe in the Hadoop binaries.

Go to http://www.barik.net/archive/2015/01/19/172716/ to download a recompiled hadoop 2.6 build that contains winutils.exe. Be sure to download the build that matches your own Hadoop version.

Extract it to the desired location, set the HADOOP_HOME environment variable, and be sure to restart Eclipse. Done!
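If restarting Eclipse with HADOOP_HOME set still does not help, a commonly used alternative (an assumption on my part, not one of the original steps) is to point the hadoop.home.dir system property at the extracted directory from inside the driver program, before the SparkContext is created. A minimal sketch; the Windows path is hypothetical:

import org.apache.spark.SparkConf
import org.apache.spark.SparkContext

object WinutilsWorkaround {
  def main(args: Array[String]): Unit = {
    // Hypothetical Windows path to the hadoop build that contains bin\winutils.exe
    System.setProperty("hadoop.home.dir", "D:\\hadoop-2.6.0")

    val conf = new SparkConf()
      .setMaster("spark://yourip:7077")
      .setAppName("winutils-workaround")
    val sc = new SparkContext(conf)
    println(sc.textFile("hdfs://yourip:9000/user/hadoop/linkage").first())
    sc.stop()
  }
}

This only affects the local driver process on Windows; the cluster-side configuration stays unchanged.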

Where can you get the data mentioned in this article? From http://bit.ly/1Aoywaq. The commands are as follows:

mkdir linkage
cd linkage/
curl -o donation.zip http://bit.ly/1Aoywaq
unzip donation.zip
unzip "block_*.zip"
hdfs dfs -mkdir /user/hadoop/linkage
hdfs dfs -put block_*.csv /user/hadoop/linkage

This is the end of "How to Build a Spark Eclipse Development Environment". Thank you for reading. If you want to learn more about the industry, you can follow the site, where the editor will keep publishing practical articles for you!
