How to Perform a Pseudo-Distributed Installation of Spark 1.0.0


This article mainly introduces how to perform a pseudo-distributed installation of Spark 1.0.0. It should serve as a useful reference; interested readers can follow along, and I hope you learn something from it. Now let the editor take you through it.

I. Download instructions

Software preparation:

spark-1.0.0-bin-hadoop1.tgz download address: spark1.0.0

scala-2.10.4.tgz download address: Scala 2.10.4

hadoop-1.2.1-bin.tar.gz download address: hadoop-1.2.1-bin.tar.gz

jdk-7u60-linux-i586.tar.gz download address: any 1.7.x JDK from the official website is fine
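
As a rough sketch of how these packages might be fetched from the command line (the mirror URLs below are assumptions, not the article's original links; verify them against the official archives before use):

wget https://archive.apache.org/dist/spark/spark-1.0.0/spark-1.0.0-bin-hadoop1.tgz    (Spark 1.0.0 prebuilt for Hadoop 1)
wget https://archive.apache.org/dist/hadoop/common/hadoop-1.2.1/hadoop-1.2.1-bin.tar.gz
wget https://www.scala-lang.org/files/archive/scala-2.10.4.tgz
(The JDK itself has to be downloaded by hand from the vendor's site, since it requires accepting a license.)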

II. Installation steps

For hadoop-1.2.1 installation steps, please see: http://my.oschina.net/dataRunner/blog/292584

1. Decompress:

tar -zxvf scala-2.10.4.tgz
mv scala-2.10.4 scala
tar -zxvf spark-1.0.0-bin-hadoop1.tgz
mv spark-1.0.0-bin-hadoop1 spark

2. Configure environment variables:

vim /etc/profile    (just add the following on the last line)

export HADOOP_HOME_WARN_SUPPRESS=1
export JAVA_HOME=/home/big_data/jdk
export JRE_HOME=${JAVA_HOME}/jre
export CLASS_PATH=.:${JAVA_HOME}/lib:${JRE_HOME}/lib
export HADOOP_HOME=/home/big_data/hadoop
export HIVE_HOME=/home/big_data/hive
export SCALA_HOME=/home/big_data/scala
export SPARK_HOME=/home/big_data/spark
export PATH=.:$SPARK_HOME/bin:$SCALA_HOME/bin:$HIVE_HOME/bin:$HADOOP_HOME/bin:$JAVA_HOME/bin:$PATH
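
The new variables only take effect in a fresh login shell. A minimal sketch of applying and sanity-checking them in the current session (the checks are just illustrations):

source /etc/profile
echo $SPARK_HOME    (should print /home/big_data/spark)
java -version       (confirms the JDK on the PATH is 1.7.x)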

3. Modify Spark's spark-env.sh file

cd spark/conf
cp spark-env.sh.template spark-env.sh
vim spark-env.sh    (just add the following on the last line)

export JAVA_HOME=/home/big_data/jdk
export SCALA_HOME=/home/big_data/scala
export SPARK_MASTER_IP=192.168.80.100
export SPARK_WORKER_MEMORY=200m
export HADOOP_CONF_DIR=/home/big_data/hadoop/conf
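
The test section below uses sbin/start-slaves.sh, which reads the list of worker hosts from conf/slaves. For a single-machine pseudo-distributed setup that file normally just names the local host; a minimal sketch, assuming the hostname master resolves to 192.168.80.100 (if the file is absent, the scripts typically fall back to localhost):

cd spark/conf
echo master > slaves    (a single Worker on the same machine as the Master)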

And that's it, the configuration is finished. It really is that simple; plenty of people know how to do this, but far too few share it.

III. Testing steps

For hadoop-1.2.1 test steps, please see: http://my.oschina.net/dataRunner/blog/292584

1. Verify Scala

[root@master ~]# scala -version
Scala code runner version 2.10.4 -- Copyright 2002-2013, LAMP/EPFL
[root@master ~]#
[root@master big_data]# scala
Welcome to Scala version 2.10.4 (Java HotSpot(TM) Client VM, Java 1.7.0_60).
Type in expressions to have them evaluated.
Type :help for more information.

scala> 1+1
res0: Int = 2

scala> :q

2. Verify Spark (start HDFS first with Hadoop's start-dfs.sh)
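
The word-count example below reads from hdfs://master:9000/input, so HDFS must be running and an input file uploaded first. A minimal sketch, where README.md is only an illustration (any file with some English words in it works):

start-dfs.sh
hadoop fs -put /home/big_data/spark/README.md /input    (upload a local text file as /input)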

[root@master big_data]# cd spark
[root@master spark]# sbin/start-all.sh
(You can also start them separately:
 [root@master spark]$ sbin/start-master.sh                      -- master web UI at http://master:8080/
 [root@master spark]$ sbin/start-slaves.sh spark://master:7077  -- worker web UI at http://master:8081/ )

[root@master ~]# jps
4629 NameNode            (hadoop)
5007 Master              (spark)
6150 Jps
4832 SecondaryNameNode   (hadoop)
5107 Worker              (spark)
4734 DataNode            (hadoop)

You can see the corresponding interface through http://192.168.80.100:8080/

[root@master big_data]# spark-shell
Spark assembly has been built with Hive, including Datanucleus jars on classpath
14/07/20 21:41:04 INFO spark.SecurityManager: Changing view acls to: root
14/07/20 21:41:04 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root)
14/07/20 21:41:04 INFO spark.HttpServer: Starting HTTP Server
14/07/20 21:41:05 INFO server.Server: jetty-8.y.z-SNAPSHOT
14/07/20 21:41:05 INFO server.AbstractConnector: Started SocketConnector@0.0.0.0:43343
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 1.0.0
      /_/

Using Scala version 2.10.4 (Java HotSpot(TM) Client VM, Java 1.7.0_60)

You can see the corresponding interface through http://192.168.80.100:4040/
(Upload any file with some English words in it to HDFS first.)

scala> val file = sc.textFile("hdfs://master:9000/input")
14/07/20 21:51:05 INFO storage.MemoryStore: ensureFreeSpace(608) called with curMem=31527, maxMem=311387750
14/07/20 21:51:05 INFO storage.MemoryStore: Block broadcast_1 stored as values to memory (estimated size 608.0 B, free 296.9 MB)
file: org.apache.spark.rdd.RDD[String] = MappedRDD[5] at textFile at <console>:12

scala> val count = file.flatMap(line => line.split(" ")).map(word => (word, 1)).reduceByKey(_ + _)
14/07/20 21:51:14 INFO mapred.FileInputFormat: Total input paths to process: 1
count: org.apache.spark.rdd.RDD[(String, Int)] = MapPartitionsRDD[10] at reduceByKey at <console>:14

scala> count.collect()
14/07/20 21:51:48 INFO spark.SparkContext: Job finished: collect at <console>:17, took 2.482381535 s
res0: Array[(String, Int)] = Array((previously-registered,1), (this,3), (Spark,1), (it,3), (original,1), (than,1), (its,1), (previously,1), (have,2), (upon,1), (order,2), (whenever,1), (it's,1), (could,3), (Configuration,1), (Master's,1), (SPARK_DAEMON_JAVA_OPTS,1), (This,2), (which,2), (applications,2), (register,1), (doing,1), (for,3), (just,2), (used,1), (any,1), (go,1), ((equivalent,1), (Master,4), (killing,1), (time,1), (availability,1), (stop-master.sh,1), (process.,1), (Future,1), (node,1), (the,9), (Workers,1), (however,1), (up,2), (Details,1), (not,3), (recovered,1), (process,1), (enable,3), (spark-env,1), (enough,1), (can,4), (if,3), (While,2), (provided,1), (be,5), (mode.,1), (minute,1), (When,1), (all,2), (written,1), (store,1), (enter,1), (then,1), (as,1), (officially,1), ...

scala> count.saveAsTextFile("hdfs://master:9000/output")
(The results are saved to the /output folder on HDFS.)

scala> :q
Stopping spark context.

[root@master ~]# hadoop fs -ls /
Found 3 items
drwxr-xr-x   - root supergroup          0 2014-07-18 21:10 /home
-rw-r--r--   1 root supergroup       1722 2014-07-18 06:18 /input
drwxr-xr-x   - root supergroup          0 2014-07-20 21:53 /output
[root@master ~]#
[root@master ~]# hadoop fs -cat /output/p*
...
(mount,1)
(production-level,1)
(recovery),1)
(Workers/applications,1)
(perspective.,1)
(so,2)
(and,1)
(ZooKeeper,2)
(System,1)
(needs,1)
(property Meaning,1)
(solution,1)
(seems,1)
...

Thank you for reading this article carefully. I hope this walkthrough of "How to Perform a Pseudo-Distributed Installation of Spark 1.0.0" has been helpful to you.
