2025-02-24 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/01 Report--
This article shows how to set up Spark on Windows. The steps are short and straightforward; I hope they help clear up any doubts you may have.
System: Windows x64
Memory: 4G
Spark version: spark-1.6.0-bin-hadoop2.6
JDK version: jdk1.7.0_031
Spark installation steps:
1. Download the Spark installation package from https://spark.apache.org/downloads.html.
2. Unpack the downloaded archive. The extraction path should preferably not contain spaces (I have not tested what happens if it does).
3. Configure SPARK_HOME in the environment variables, and add the bin directory under SPARK_HOME to PATH.
4. Since this runs in local mode, you do not need to install Hadoop, but on Windows you still need to configure HADOOP_HOME and place a winutils.exe file in HADOOP_HOME/bin. For details, see https://github.com/spring-projects/spring-hadoop/wiki/Using-a-Windows-client-together-with-a-Linux-cluster.
5. Open CMD and see if the spark-shell command runs successfully.
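As an alternative to step 4's HADOOP_HOME environment variable, a Spark driver program written in Java can point Hadoop at the winutils.exe directory through the hadoop.home.dir system property before the SparkContext is created. A minimal sketch; the C:\hadoop path is a placeholder assumption, replace it with the directory that actually contains bin\winutils.exe:

```java
// Sketch: tell the Hadoop libraries where winutils.exe lives.
// "C:\\hadoop" is a placeholder; use the directory containing bin\winutils.exe.
public class HadoopHomeSetup {
    public static void main(String[] args) {
        // Must run before the first JavaSparkContext is constructed,
        // because the Hadoop shell utilities read it at class-load time.
        System.setProperty("hadoop.home.dir", "C:\\hadoop");
        System.out.println("hadoop.home.dir = " + System.getProperty("hadoop.home.dir"));
    }
}
```

This only substitutes for the environment variable when Spark is launched from your own Java program; spark-shell started from CMD still relies on HADOOP_HOME.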
Possible problem 1: an error involving the xerces.jar package, most likely caused by a jar conflict. The most direct solution is to re-download and reinstall a JDK.
Possible problem 2: spark-shell throws java.lang.RuntimeException: The root scratch dir: /tmp/hive on HDFS should be writable. This error appears to be a Hive bug; for details, see https://issues.apache.org/jira/browse/SPARK-10528.
Configure the Java development environment for Eclipse:
Java Spark programs depend on a single jar, located at SPARK_HOME/lib/spark-assembly-1.6.0-hadoop2.6.0.jar, which can be imported directly into Eclipse. Spark only supports Java environments of version 1.6 and above.
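Since Spark requires Java 1.6 or later, it can be worth confirming which JVM Eclipse is actually running on before importing the jar. A small sketch that just reads the standard java.version system property:

```java
// Print the running JVM's version string, e.g. "1.7.0_31".
public class JavaVersionCheck {
    public static void main(String[] args) {
        String version = System.getProperty("java.version");
        System.out.println("java.version = " + version);
    }
}
```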
Finally, a WordCount program is attached; it reads a file from HDFS and writes the output back to HDFS.
SparkConf conf = new SparkConf().setAppName("WordCount").setMaster("local");
JavaSparkContext context = new JavaSparkContext(conf);
JavaRDD<String> textFile = context.textFile("hdfs://192.168.1.201:8020/data/test/sequence/sequence_in/file1.txt");
JavaRDD<String> words = textFile.flatMap(new FlatMapFunction<String, String>() {
    public Iterable<String> call(String s) {
        return Arrays.asList(s.split(" "));
    }
});
JavaPairRDD<String, Integer> pairs = words.mapToPair(new PairFunction<String, String, Integer>() {
    public Tuple2<String, Integer> call(String s) {
        return new Tuple2<String, Integer>(s, 1);
    }
});
JavaPairRDD<String, Integer> counts = pairs.reduceByKey(new Function2<Integer, Integer, Integer>() {
    public Integer call(Integer a, Integer b) {
        return a + b;
    }
});
counts.saveAsTextFile("hdfs://192.168.1.201:8020/data/test/sequence/sequence_out/");
That is all of the content of "how to build spark under windows". Thank you for reading! I hope the article has been helpful; if you want to learn more, you are welcome to follow the industry information channel.
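The three transformations in the WordCount program (flatMap to split lines into words, mapToPair to pair each word with 1, reduceByKey to sum per word) can be traced in plain Java with standard collections, without a Spark cluster, to see what each step produces. A sketch, not the Spark execution itself:

```java
import java.util.*;

public class WordCountTrace {
    // Mirrors the Spark pipeline: split each line into words (flatMap),
    // treat each word as a (word, 1) pair (mapToPair), and sum the
    // counts per word (reduceByKey).
    public static Map<String, Integer> count(List<String> lines) {
        Map<String, Integer> counts = new HashMap<String, Integer>();
        for (String line : lines) {
            for (String word : line.split(" ")) {
                Integer old = counts.get(word);
                counts.put(word, old == null ? 1 : old + 1);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("a b a", "b c");
        // Expect a=2, b=2, c=1 (map iteration order may vary).
        System.out.println(count(lines));
    }
}
```

In the real job, the same per-key summation runs in parallel across partitions, which is why reduceByKey requires the combining function to be associative.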