2025-04-04 Update From: SLTechnology News & Howtos
Shulou (Shulou.com) 06/01 Report
This article explains how to set up a hadoop 0.20.2 cluster on Ubuntu 14.04. The method is simple, fast, and practical, so interested readers may want to follow along.
Preparation
1. Installation environment: Ubuntu 14.04, three machines in total:
one namenode + jobtracker (master) and two datanode + tasktracker nodes (hadoop, salve1).
The following table shows the details of each machine.
IP            username/password   hostname
10.60.38.165  hadoop/123456       hadoop
10.60.38.166  hadoop/123456       master
10.60.38.155  hadoop/123456       salve1   (yes, "slave1" is misspelled here; the misspelled hostname is used consistently below)
Add the following to /etc/hosts on each machine:
10.60.38.165 hadoop
10.60.38.166 master
10.60.38.155 salve1
so that the machines can ping each other by hostname.
(PS: everything else in /etc/hosts is cleared out, leaving only the three entries above plus the first entry, 127.0.0.1 localhost)
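The hosts setup above boils down to a four-line file. Here is a minimal sketch that writes it to a scratch file for illustration only; on a real node you would edit /etc/hosts itself, as root:

```shell
# Build the minimal hosts file described above in a scratch location.
hosts=$(mktemp)
cat > "$hosts" <<'EOF'
127.0.0.1	localhost
10.60.38.165	hadoop
10.60.38.166	master
10.60.38.155	salve1
EOF
cat "$hosts"
```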
2. Passwordless SSH configuration
1. Most Linux distributions install the ssh client by default, but not the ssh service, so install it with sudo apt-get install openssh-server. (Make sure port 22 is listening; check with netstat -nat.)
2. On each machine, execute: ssh-keygen -t rsa -P ""
Press Enter once to accept the default key location.
A .ssh directory appears under the user's home directory (default permission 700), containing two files: id_rsa and id_rsa.pub.
3. Then, on the master node, append id_rsa.pub to the authorized keys:
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
4. Next, on each of the other two machines in turn, append that machine's id_rsa.pub to authorized_keys on master:
cat ~/.ssh/id_rsa.pub | ssh hadoop@master "cat >> ~/.ssh/authorized_keys" (at this point, ssh-ing to master still prompts for its password)
5. Finally, copy the merged authorized_keys from master into /home/hadoop/.ssh on each of the other machines:
scp ~/.ssh/authorized_keys hadoop@hadoop:~/.ssh
scp ~/.ssh/authorized_keys hadoop@salve1:~/.ssh
6. Once the configuration is complete, each machine can log in to the others without password authentication, and this step is done.
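Steps 2 and 3 above can be sketched locally. This demonstration runs against a scratch directory so it does not touch your real ~/.ssh; on the cluster you would use ~/.ssh and the hostnames from the table:

```shell
# Step 2: generate an RSA key pair with an empty passphrase (-P "").
tmpdir=$(mktemp -d)
ssh-keygen -q -t rsa -P "" -f "$tmpdir/id_rsa"

# Step 3: seed authorized_keys with the node's own public key.
cat "$tmpdir/id_rsa.pub" >> "$tmpdir/authorized_keys"

# Steps 4-5 on the real cluster (not run here): each datanode appends its key,
#   cat ~/.ssh/id_rsa.pub | ssh hadoop@master "cat >> ~/.ssh/authorized_keys"
# then master pushes the merged file back out with scp.

chmod 600 "$tmpdir/authorized_keys"   # sshd rejects overly permissive key files
ls "$tmpdir"
```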
Install the JDK and Hadoop on each machine
Configure Hadoop (my versions here are hadoop 0.20.2 and jdk 1.7)
-- When setting a value in a configuration file, there must be no spaces around the value, otherwise it is ignored.
Next, edit the configuration files: hadoop-env.sh, core-site.xml, hdfs-site.xml, mapred-site.xml
PS:
The hadoop-env.sh configuration is the same on every node.
On the master node, you additionally configure a slaves file listing the ip address of each datanode.
Below is the configuration for the master node and the hadoop node (salve1 is similar to hadoop; only the ip needs to change. I have mostly kept the system defaults, so little needs changing.)
hadoop-env.sh (just append at the end):
export JAVA_HOME=/opt/jdk1.7
core-site.xml:
<property>
  <name>fs.default.name</name>
  <value>hdfs://master:9000</value>
</property>
PS: when I added the hadoop.tmp.dir property here to change its default path, the cluster threw an error on startup. I have not solved this yet, so I am using the default path for now; advice from more experienced readers is welcome.
hdfs-site.xml:
<property>
  <name>dfs.webhdfs.enabled</name>
  <value>true</value>
</property>
<property>
  <name>dfs.replication</name>
  <value>2</value>
</property>
mapred-site.xml:
<property>
  <name>mapred.job.tracker</name>
  <value>10.60.38.166:9001</value>
</property>
slaves:
10.60.38.165 hadoop
10.60.38.155 salve1
masters:
master
The master node is now configured; next, copy the configuration to each datanode node.
Note: the masters and slaves files can be copied over unchanged, since the datanodes simply ignore them. The rest of the configuration also needs no changes, because the defaults have been kept wherever possible.
-
Test run
Go through the following steps:
1. Create some input files (file01.txt, file02.txt).
2. hadoop fs -mkdir input
3. hadoop fs -put file*.txt input
4. hadoop jar /opt/hadoop/hadoop-0.20.2-examples.jar wordcount input output
5. hadoop fs -ls output
6. hadoop fs -cat output/part-r-00000
With my sample input, the output looked like this:
; slkdfj 1
aer 1
apple 1
are 1
asfjjjf 1
c++ 1
fj 1
hello 2
java 3
tantairs 1
world 4
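As a local sanity check on what the wordcount job computes, the same counts can be reproduced with a coreutils pipeline. The two sample files below are invented stand-ins for whatever you created in step 1:

```shell
# Stand-in input files (the real file01.txt/file02.txt can hold anything).
printf 'hello world\nhello java\n'     > /tmp/file01.txt
printf 'java java world world world\n' > /tmp/file02.txt

# Equivalent of the wordcount job: split on spaces, count, sort by word.
cat /tmp/file01.txt /tmp/file02.txt \
  | tr -s ' ' '\n' | sort | uniq -c \
  | awk '{print $2"\t"$1}'
# prints, tab-separated like part-r-00000: hello 2, java 3, world 4
```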
At this point, I believe you have a deeper understanding of how to build a hadoop 0.20.2 cluster on Ubuntu 14.04. Why not try it out in practice yourself?