

How to use Spark 1 and Hadoop 1


How do you use Spark 1 with Hadoop 1? This article walks through the setup in detail, in the hope of helping readers who want to solve this problem find a simple, workable approach.

Environmental preparation

ubuntu-12.04.1-desktop-i386.iso

jdk-7u7-linux-i586.tar.gz

paohaijiao@ubuntu:~$ sudo -s

[sudo] password for paohaijiao:

root@ubuntu:~# vi /etc/lightdm/lightdm.conf

[SeatDefaults]

user-session=ubuntu

greeter-session=unity-greeter

greeter-show-manual-login=true

allow-guest=false

root@ubuntu:~# sudo passwd root

Enter new UNIX password:

Retype new UNIX password:

passwd: password updated successfully

root@ubuntu:~# reboot

Log in as the root user.

root@ubuntu:~# mkdir /usr/lib/java

root@ubuntu:~# getconf LONG_BIT

32

Copy the JDK archive into /usr/lib/java and extract it:

root@ubuntu:~# cd /usr/lib/java

root@ubuntu:/usr/lib/java# tar -zxf jdk-7u7-linux-i586.tar.gz

root@ubuntu:/usr/lib/java# ls

jdk1.7.0_07  jdk-7u7-linux-i586.tar.gz

Append the following to ~/.bashrc:

export JAVA_HOME=/usr/lib/java/jdk1.7.0_07

export JRE_HOME=${JAVA_HOME}/jre

export CLASSPATH=.:${JAVA_HOME}/lib:${JRE_HOME}/lib

export PATH=${JAVA_HOME}/bin:$PATH

root@ubuntu:/usr/lib/java# source ~/.bashrc

root@ubuntu:/usr/lib/java# apt-get install ssh

root@ubuntu:~# /etc/init.d/ssh start

Rather than invoking init scripts through /etc/init.d, use the service(8)

utility, e.g. service ssh start

Since the script you are attempting to invoke has been converted to an

Upstart job, you may also use the start(8) utility, e.g. start ssh

root@ubuntu:~# ps -e | grep ssh

2174 ?        00:00:00 ssh-agent

3579 ?        00:00:00 sshd

root@ubuntu:~# ssh-keygen -t rsa -P ""

Generating public/private rsa key pair.

Enter file in which to save the key (/root/.ssh/id_rsa):

Created directory '/root/.ssh'.

Your identification has been saved in /root/.ssh/id_rsa.

Your public key has been saved in /root/.ssh/id_rsa.pub.

The key fingerprint is:

bf:d2:3c:20:7b:b0:6d:4f:7f:9d:98:cb:b7:26:c1:67 root@ubuntu

(The RSA 2048 randomart image is printed here; omitted.)

root@ubuntu:~# cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys

root@ubuntu:~# ssh localhost

The authenticity of host 'localhost (127.0.0.1)' can't be established.

ECDSA key fingerprint is 0d:1a:18:04:c8:6b:0b:d7:98:e8:f4:a4:f6:e3:2a:8c.

Are you sure you want to continue connecting (yes/no)? yes

Warning: Permanently added 'localhost' (ECDSA) to the list of known hosts.

Welcome to Ubuntu 12.04.1 LTS (GNU/Linux 3.2.0-29-generic-pae i686)

* Documentation: https://help.ubuntu.com/

The programs included with the Ubuntu system are free software;

the exact distribution terms for each program are described in the

individual files in /usr/share/doc/*/copyright.

Ubuntu comes with ABSOLUTELY NO WARRANTY, to the extent permitted by

applicable law.

root@ubuntu:~# apt-get install rsync

Reading package lists... Done

Building dependency tree

Reading state information... Done

The following packages will be upgraded:

rsync

1 upgraded, 0 newly installed, 0 to remove and 612 not upgraded.

Need to get 299 kB of archives.

After this operation, 5120 B of additional disk space will be used.

Get:1 http://us.archive.ubuntu.com/ubuntu/ precise-updates/main rsync i386 3.0.9-1ubuntu1.1 [299 kB]

root@ubuntu:~# mkdir /usr/local/hadoop

root@ubuntu:~# cd /root/Downloads/

root@ubuntu:~/Downloads# ls

hadoop-1.2.1-bin.tar.gz

root@ubuntu:~/Downloads# tar -xzf hadoop-1.2.1-bin.tar.gz

root@ubuntu:~/Downloads# ls

hadoop-1.2.1  hadoop-1.2.1-bin.tar.gz

root@ubuntu:~/Downloads# mv hadoop-1.2.1 /usr/local/hadoop

root@ubuntu:~/Downloads# cd /usr/local/hadoop

root@ubuntu:/usr/local/hadoop# ls

hadoop-1.2.1

root@ubuntu:/usr/local/hadoop# cd hadoop-1.2.1

root@ubuntu:/usr/local/hadoop/hadoop-1.2.1# ls

bin          hadoop-ant-1.2.1.jar          hadoop-tools-1.2.1.jar  NOTICE.txt

build.xml    hadoop-client-1.2.1.jar       ivy                     README.txt

c++          hadoop-core-1.2.1.jar         ivy.xml                 sbin

CHANGES.txt  hadoop-examples-1.2.1.jar     lib                     share

conf         hadoop-minicluster-1.2.1.jar  libexec                 src

contrib      hadoop-test-1.2.1.jar         LICENSE.txt             webapps

root@ubuntu:/usr/local/hadoop/hadoop-1.2.1# cd conf/

root@ubuntu:/usr/local/hadoop/hadoop-1.2.1/conf# ls

capacity-scheduler.xml      hadoop-policy.xml      slaves

configuration.xsl           hdfs-site.xml          ssl-client.xml.example

core-site.xml               log4j.properties       ssl-server.xml.example

fair-scheduler.xml          mapred-queue-acls.xml  taskcontroller.cfg

hadoop-env.sh               mapred-site.xml        task-log4j.properties

hadoop-metrics2.properties  masters

root@ubuntu:/usr/local/hadoop/hadoop-1.2.1/conf# vi hadoop-env.sh

export JAVA_HOME=/usr/lib/java/jdk1.7.0_07

root@ubuntu:/usr/local/hadoop/hadoop-1.2.1/conf# source hadoop-env.sh

root@ubuntu:/usr/local/hadoop/hadoop-1.2.1/bin# vi ~/.bashrc

export PATH=${JAVA_HOME}/bin:/usr/local/hadoop/hadoop-1.2.1/bin:$PATH

root@ubuntu:/usr/local/hadoop/hadoop-1.2.1/bin# source ~/.bashrc

root@ubuntu:/usr/local/hadoop/hadoop-1.2.1# mkdir input

root@ubuntu:/usr/local/hadoop/hadoop-1.2.1# ls

bin          hadoop-client-1.2.1.jar       ivy          sbin

build.xml    hadoop-core-1.2.1.jar         ivy.xml      share

c++          hadoop-examples-1.2.1.jar     lib          src

CHANGES.txt  hadoop-minicluster-1.2.1.jar  libexec      webapps

conf         hadoop-test-1.2.1.jar         LICENSE.txt

contrib      hadoop-tools-1.2.1.jar        NOTICE.txt

hadoop-ant-1.2.1.jar  input  README.txt

root@ubuntu:/usr/local/hadoop/hadoop-1.2.1# cp conf/* input

root@ubuntu:/usr/local/hadoop/hadoop-1.2.1# ls input

capacity-scheduler.xml      hadoop-policy.xml      slaves

configuration.xsl           hdfs-site.xml          ssl-client.xml.example

core-site.xml               log4j.properties       ssl-server.xml.example

fair-scheduler.xml           mapred-queue-acls.xml  taskcontroller.cfg

hadoop-env.sh               mapred-site.xml        task-log4j.properties

hadoop-metrics2.properties  masters

root@ubuntu:/usr/local/hadoop/hadoop-1.2.1/conf# vi core-site.xml

<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/usr/local/hadoop/hadoop-1.2.1/tmp</value>
  </property>
</configuration>

root@ubuntu:/usr/local/hadoop/hadoop-1.2.1/conf# vi hdfs-site.xml

<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.name.dir</name>
    <value>/usr/local/hadoop/hadoop-1.2.1/hdfs/name</value>
  </property>
  <property>
    <name>dfs.data.dir</name>
    <value>/usr/local/hadoop/hadoop-1.2.1/hdfs/data</value>
  </property>
</configuration>

root@ubuntu:/usr/local/hadoop/hadoop-1.2.1/conf# vi mapred-site.xml

<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:9001</value>
  </property>
</configuration>

root@ubuntu:/usr/local/hadoop/hadoop-1.2.1/conf# hadoop namenode -format

root@ubuntu:/usr/local/hadoop/hadoop-1.2.1/bin# ./start-all.sh

root@ubuntu:/usr/local/hadoop/hadoop-1.2.1# hadoop jar hadoop-examples-1.2.1.jar wordcount input output

root@ubuntu:/usr/local/hadoop/hadoop-1.2.1# cat output/*
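For comparison, the same word count can be expressed in a few lines of Spark's Scala API once Spark is running (set up below). A minimal sketch, assuming spark-shell is open (so the SparkContext sc already exists) and the input files have been copied to an illustrative HDFS path:

scala> val text = sc.textFile("hdfs://localhost:9000/user/root/input")   // illustrative input path

scala> val counts = text.flatMap(_.split("\\s+")).map(word => (word, 1)).reduceByKey(_ + _)   // classic word count: split, pair with 1, sum per word

scala> counts.saveAsTextFile("hdfs://localhost:9000/user/root/spark-output")   // illustrative output path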

Download Spark and Scala from:

http://d3kbcqa49mib13.cloudfront.net/spark-1.0.0-bin-hadoop1.tgz

http://www.scala-lang.org/files/archive/scala-2.10.4.tgz (Spark 1.0.0 requires Scala 2.10.x)

root@ubuntu:~# mkdir/usr/lib/scala

bash: mkdir/usr/lib/scala: No such file or directory

root@ubuntu:~# mkdir /usr/lib/scala

root@ubuntu:~# cd ~/Do

Documents/ Downloads/

root@ubuntu:~# cd ~/Downloads/

root@ubuntu:~/Downloads# ls

hadoop-1.2.1-bin.tar.gz  scala-2.10.4.tgz  spark-1.0.0-bin-hadoop1.tgz

root@ubuntu:~/Downloads# tar -zxf scala-2.10.4.tgz

root@ubuntu:~/Downloads# ls

hadoop-1.2.1-bin.tar.gz  scala-2.10.4.tgz

scala-2.10.4  spark-1.0.0-bin-hadoop1.tgz

root@ubuntu:~/Downloads# mv scala-2.10.4 /usr/lib/scala

root@ubuntu:~/Downloads# cd /usr/lib/scala

root@ubuntu:/usr/lib/scala# ls

scala-2.10.4

root@ubuntu:/usr/lib/scala# vi ~/.bashrc

export JAVA_HOME=/usr/lib/java/jdk1.7.0_07

export JRE_HOME=${JAVA_HOME}/jre

export SCALA_HOME=/usr/lib/scala/scala-2.10.4

export CLASSPATH=.:${JAVA_HOME}/lib:${JRE_HOME}/lib

export PATH=${SCALA_HOME}/bin:${JAVA_HOME}/bin:/usr/local/hadoop/hadoop-1.2.1/bin:$PATH

root@ubuntu:/usr/lib/scala# source ~/.bashrc

root@ubuntu:~/Downloads# tar -zxf spark-1.0.0-bin-hadoop1.tgz


root@ubuntu:~/Downloads# cd /root/soft

root@ubuntu:~/soft# cd spark-1.0.0-bin-hadoop1/

root@ubuntu:~/soft/spark-1.0.0-bin-hadoop1# ls

bin          conf  examples  LICENSE  python     RELEASE

CHANGES.txt  ec2   lib       NOTICE   README.md  sbin

root@ubuntu:~/soft/spark-1.0.0-bin-hadoop1# cd conf/

root@ubuntu:~/soft/spark-1.0.0-bin-hadoop1/conf# ls

fairscheduler.xml.template   slaves

log4j.properties.template    spark-defaults.conf.template

metrics.properties.template  spark-env.sh.template

root@ubuntu:~/soft/spark-1.0.0-bin-hadoop1/conf# vi spark-env.sh

export JAVA_HOME=/usr/lib/java/jdk1.7.0_07

export SCALA_HOME=/usr/lib/scala/scala-2.10.4

export SPARK_MASTER_IP=192.168.141.138

export SPARK_WORKER_MEMORY=2g

export HADOOP_CONF_DIR=/usr/local/hadoop/hadoop-1.2.1/conf

root@ubuntu:~/soft/spark-1.0.0-bin-hadoop1/conf# vi slaves

# A Spark Worker will be started on each of the machines listed below.

localhost

root@ubuntu:/usr/local/hadoop/hadoop-1.2.1# ./bin/start-all.sh

starting namenode, logging to /usr/local/hadoop/hadoop-1.2.1/libexec/../logs/hadoop-root-namenode-ubuntu.out

root@ubuntu:/usr/local/hadoop/hadoop-1.2.1/bin# jps

3459 Jps

3207 JobTracker

2714 NameNode

3122 SecondaryNameNode

2923 DataNode

3411 TaskTracker

root@ubuntu:~/soft/spark-1.0.0-bin-hadoop1/sbin# cd /root/soft/spark-1.0.0-bin-hadoop1/sbin

Start the Spark master and worker:

root@ubuntu:~/soft/spark-1.0.0-bin-hadoop1/sbin# ./start-all.sh

root@ubuntu:~/soft/spark-1.0.0-bin-hadoop1/sbin# jps

3918 Jps

3207 JobTracker

3625 Master

3844 Worker

3411 TaskTracker

root@ubuntu:~/soft/spark-1.0.0-bin-hadoop1# jps

3207 JobTracker

3625 Master

4483 SecondaryNameNode

3844 Worker

4734 Jps

3411 TaskTracker

Visit http://192.168.141.138:8080/ (the Spark master web UI).

root@ubuntu:~/soft/spark-1.0.0-bin-hadoop1/bin# ./spark-shell

Visit http://192.168.141.138:4040/environment/ (the Spark application web UI).

root@ubuntu:~/soft/spark-1.0.0-bin-hadoop1# hadoop dfs -copyFromLocal README.md ./

Visit http://192.168.141.138:50070/dfshealth.jsp (the HDFS web UI).

scala> val file = sc.textFile("hdfs://127.0.0.1:9000/user/root/README.md")

scala> val sparks = file.filter(line => line.contains("Spark"))
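A minimal continuation of this example, acting on the filtered RDD (these are standard RDD operations; the sample size of five is illustrative):

scala> sparks.cache()   // keep the filtered lines in memory for reuse

scala> sparks.count()   // count the lines in README.md that mention "Spark"

scala> sparks.take(5).foreach(println)   // print the first five matching lines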

This is the answer to the question of how to use Spark 1 with Hadoop 1. I hope the walkthrough above helps readers who want a simple, workable path to a running setup.
