

Sample Analysis of a Fully Distributed Installation of Spark 1.6.1 and Hadoop 2.6.4


This article walks through a fully distributed installation of Spark 1.6.1 and Hadoop 2.6.4, step by step. It is quite detailed and should be a useful reference; if you are interested, read on!

Preparation: the following installation packages can all be downloaded from the respective official websites:

hadoop-2.6.4.tar.gz
jdk-7u71-linux-x64.tar.gz
scala-2.10.4.tgz
spark-1.6.1-bin-hadoop2.6.tgz
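If you prefer the command line, older Hadoop and Spark releases live on the Apache archive server and Scala on scala-lang.org. The URLs below follow the usual layout of those sites but are an assumption here; verify them before use. The JDK has to be fetched from Oracle's site by hand, since it sits behind a license agreement.

# assumed archive URLs; check them against the official download pages
wget https://archive.apache.org/dist/hadoop/common/hadoop-2.6.4/hadoop-2.6.4.tar.gz
wget https://archive.apache.org/dist/spark/spark-1.6.1/spark-1.6.1-bin-hadoop2.6.tgz
wget https://www.scala-lang.org/files/archive/scala-2.10.4.tgz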

My hardware environment is:

master: 8 virtual cores, 16.0 GB RAM
slave1: 4 virtual cores, 10.0 GB RAM
slave2: 4 virtual cores, 10.0 GB RAM
slave3: 4 virtual cores, 10.0 GB RAM
slave4: 4 virtual cores, 10.0 GB RAM

Name the 5 machines master, slave1, slave2, slave3, slave4:

On the master machine, edit the hostname file and set its contents to master (repeat on each slave with its own name):

sudo vim /etc/hostname

Then give all five machines the same /etc/hosts:

sudo vim /etc/hosts

127.0.0.1 localhost
127.0.1.1 master    (use the machine's own hostname here: slave1, ..., slave4 on the slaves)
192.168.80.70 master
192.168.80.71 slave1
192.168.80.72 slave2
192.168.80.73 slave3
192.168.80.74 slave4

After configuring, reboot, then check connectivity, for example by running ping slave1 on master.
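To save pinging each machine by hand, a small loop (a sketch, assuming the five hostnames above) can confirm that every name resolves and answers:

# run on any node; one ping per host, 2-second timeout
for h in master slave1 slave2 slave3 slave4; do
  ping -c 1 -W 2 "$h" > /dev/null && echo "$h ok" || echo "$h UNREACHABLE"
done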

Configure ssh:

On all nodes, run ssh-keygen -t rsa and press Enter through every prompt. Then:

① On master, append the public key to authorized_keys. Command: cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
② Copy master's authorized_keys into the ~/.ssh directory of every other machine. Command: scp ~/.ssh/authorized_keys root@slave1:~/.ssh
③ Fix the file's permissions. Command: chmod 644 ~/.ssh/authorized_keys
④ Test with ssh localhost and ssh master, then ssh slave1; exit and ssh again. If you land in a shell without typing a password, the setup succeeded.

Finally, turn off the firewall on all nodes: ufw disable
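The per-slave commands above can be collapsed into one loop on master. This is only a sketch, assuming the root account and the four slave names used in this article; the scp step will still ask for each slave's password once, but the ssh that follows should not:

# distribute master's authorized_keys and verify passwordless login
for h in slave1 slave2 slave3 slave4; do
  scp ~/.ssh/authorized_keys root@"$h":~/.ssh/
  ssh root@"$h" 'chmod 644 ~/.ssh/authorized_keys && hostname'
done

If each iteration ends by printing the slave's hostname without prompting for a password, ssh is ready.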

Edit the environment configuration files:

vim /etc/profile

export JAVA_HOME=/usr/lib/jvm/jdk1.7.0_71
export PATH=$JAVA_HOME/bin:$JAVA_HOME/jre/bin:$PATH
export CLASSPATH=$CLASSPATH:.:$JAVA_HOME/lib:$JAVA_HOME/jre/lib
export SCALA_HOME=/opt/scala/scala-2.10.4
export PATH=$SCALA_HOME/bin:$PATH
export HADOOP_HOME=/root/hadoop-2.6.4
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_YARN_HOME=$HADOOP_HOME
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib"
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
export SPARK_HOME=/root/spark-1.6.1-bin-hadoop2.6
export PATH=$PATH:$SPARK_HOME/bin:$SPARK_HOME/sbin

Reload it with source /etc/profile. Next, in /root/hadoop-2.6.4/etc/hadoop:

vim hadoop-env.sh

export JAVA_HOME=/usr/lib/jvm/jdk1.7.0_71
export HADOOP_CONF_DIR=/root/hadoop-2.6.4/etc/hadoop/

source hadoop-env.sh

vim yarn-env.sh

export JAVA_HOME=/usr/lib/jvm/jdk1.7.0_71

source yarn-env.sh

Then, in the Spark conf directory:

vim spark-env.sh

export SPARK_MASTER_IP=master
export SPARK_MASTER_PORT=7077
export SPARK_WORKER_CORES=4
export SPARK_WORKER_MEMORY=4g
export SPARK_WORKER_INSTANCES=2
export JAVA_HOME=/usr/lib/jvm/jdk1.7.0_71
export SCALA_HOME=/opt/scala/scala-2.10.4
export HADOOP_HOME=/root/hadoop-2.6.4

source spark-env.sh
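A quick way to confirm the environment took effect (a sketch; run in a fresh shell or after source /etc/profile):

java -version        # should report 1.7.0_71
scala -version       # should report 2.10.4
hadoop version       # should report 2.6.4
echo $SPARK_HOME     # should print /root/spark-1.6.1-bin-hadoop2.6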

Both Spark and Hadoop need their slaves file edited (Hadoop's in etc/hadoop, Spark's in conf):

vim slaves

slave1
slave2
slave3
slave4
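Since the same four names go into both files, one command can write them and a copy keeps the two in sync. A sketch, assuming the installation paths used above:

# Hadoop reads etc/hadoop/slaves, Spark reads conf/slaves
printf '%s\n' slave1 slave2 slave3 slave4 > /root/hadoop-2.6.4/etc/hadoop/slaves
cp /root/hadoop-2.6.4/etc/hadoop/slaves /root/spark-1.6.1-bin-hadoop2.6/conf/slaves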

Hadoop related configuration:

All of the following properties go inside the <configuration> element of each file.

vim core-site.xml

  <property><name>hadoop.tmp.dir</name><value>/root/hadoop-2.6.4/tmp</value></property>
  <property><name>fs.default.name</name><value>hdfs://master:9000</value></property>

vim hdfs-site.xml

  <property><name>dfs.http.address</name><value>master:50070</value></property>
  <property><name>dfs.namenode.secondary.http-address</name><value>master:50090</value></property>
  <property><name>dfs.replication</name><value>1</value></property>

vim mapred-site.xml

  <property><name>mapred.job.tracker</name><value>master:9001</value></property>
  <property><name>mapred.map.tasks</name><value>20</value></property>
  <property><name>mapred.reduce.tasks</name><value>4</value></property>
  <property><name>mapreduce.framework.name</name><value>yarn</value></property>
  <property><name>mapreduce.jobhistory.address</name><value>master:10020</value></property>
  <property><name>mapreduce.jobhistory.webapp.address</name><value>master:19888</value></property>

vim yarn-site.xml

  <property><name>yarn.resourcemanager.address</name><value>master:8032</value></property>
  <property><name>yarn.resourcemanager.scheduler.address</name><value>master:8030</value></property>
  <property><name>yarn.resourcemanager.webapp.address</name><value>master:8088</value></property>
  <property><name>yarn.resourcemanager.resource-tracker.address</name><value>master:8031</value></property>
  <property><name>yarn.resourcemanager.admin.address</name><value>master:8033</value></property>
  <property><name>yarn.nodemanager.aux-services</name><value>mapreduce_shuffle</value></property>
  <property><name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name><value>org.apache.hadoop.mapred.ShuffleHandler</value></property>
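Hadoop can echo back what it parsed from these files, which catches XML typos early. A quick sketch using the stock hdfs getconf tool:

hdfs getconf -confKey fs.default.name    # expect hdfs://master:9000
hdfs getconf -confKey dfs.replication    # expect 1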

After the configuration above is done, distribute the two unpacked directories from the master node to slave1 through slave4:

scp -r spark-1.6.1-bin-hadoop2.6 root@slave1:~/
scp -r hadoop-2.6.4 root@slave1:~/
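The same two copies are needed on every slave, so a loop saves repetition (a sketch, assuming the paths and hostnames above):

for h in slave1 slave2 slave3 slave4; do
  scp -r /root/spark-1.6.1-bin-hadoop2.6 root@"$h":~/
  scp -r /root/hadoop-2.6.4 root@"$h":~/
done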

Note that passwordless ssh must be configured in advance (see above). Running Hadoop's own tests is not covered here; use the jps command to check which daemons are running on each node.
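To check every node in one pass, a loop over ssh works (a sketch; the daemon names are the standard ones for this stack: NameNode, SecondaryNameNode, and ResourceManager on master, DataNode and NodeManager on the slaves):

for h in master slave1 slave2 slave3 slave4; do
  echo "== $h =="
  ssh root@"$h" jps
done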

Start Spark for a test (run from the Spark installation directory, /root/spark-1.6.1-bin-hadoop2.6):

./sbin/start-all.sh
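If the standalone cluster came up, jps should now also show a Master process on master and Worker processes on the slaves (two per slave, given SPARK_WORKER_INSTANCES=2 above), and the master's web UI is served on port 8080 by default. A quick check sketch:

jps | grep Master                  # on master
ssh root@slave1 jps | grep Worker  # on each slave
# or browse to http://master:8080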

Run the SparkPi example that ships with Spark:

./bin/spark-submit --master spark://master:7077 --class org.apache.spark.examples.SparkPi /root/spark-1.6.1-bin-hadoop2.6/lib/spark-examples-1.6.1-hadoop2.6.0.jar
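SparkPi prints its estimate on standard output among the driver logs, so filtering for it makes a quick pass/fail check (a sketch):

./bin/spark-submit --master spark://master:7077 --class org.apache.spark.examples.SparkPi \
  /root/spark-1.6.1-bin-hadoop2.6/lib/spark-examples-1.6.1-hadoop2.6.0.jar 2>/dev/null | grep "Pi is roughly"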

Test the Spark shell:

./bin/spark-shell --master spark://master:7077
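For a scripted check, a single expression can be piped into the shell instead of typed interactively (a sketch; the job should report a sum of 500500):

echo 'sc.parallelize(1 to 1000).sum()' | ./bin/spark-shell --master spark://master:7077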

That is all the content of "Sample Analysis of a Fully Distributed Installation of Spark 1.6.1 and Hadoop 2.6.4". Thanks for reading, and I hope it helps.


