
Installing Hadoop in Its Three Modes


I. The Three Modes of Hadoop

1. Standalone (single-node) mode

2. Pseudo-distributed mode, generally used for testing

3. Fully distributed mode, a cluster environment, the most commonly used in practice

Hadoop decomposes into master and slave roles: the master node runs the NameNode and JobTracker, while the slave nodes run the DataNodes and TaskTrackers.

II. Installation Environment

1. Standalone mode: one host, 172.25.22.10

2. Pseudo-distributed mode: one host, 172.25.22.11

3. Fully distributed mode: three hosts, 172.25.22.10, 172.25.22.11, and 172.25.22.12

Software packages:

hadoop-1.2.1.tar.gz

jdk-7u79-linux-x64.tar.gz

III. Standalone Mode Configuration Steps

Note: it is best to do the configuration as an ordinary user. All three modes here run as the ordinary user hadoop.

1. In standalone mode there is only one node, so you only need to extract the Hadoop package to a suitable location. Here it is extracted under /home/hadoop; soft links are then created so that directories are easier to switch to later, and the directory paths referenced in the configuration files are updated to match (a combined sketch follows the two commands below).

ln -s hadoop-1.2.1/ hadoop

ln -s jdk1.7.0_79/ java
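Put together, the unpack-and-link step might look like this sketch (the tarball locations are an assumption, not given in the original):

# as the hadoop user, in /home/hadoop; tarball paths are illustrative
tar zxf hadoop-1.2.1.tar.gz
tar zxf jdk-7u79-linux-x64.tar.gz
ln -s hadoop-1.2.1/ hadoop
ln -s jdk1.7.0_79/ java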

2. Modify the configuration file

In conf/hadoop-env.sh, set JAVA_HOME to point at the java directory.
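A minimal sketch of that edit, assuming the java soft link created above:

# conf/hadoop-env.sh
export JAVA_HOME=/home/hadoop/java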

3. The next step is to configure ssh to log in to this machine without a password.

ssh-keygen

ssh-copy-id 172.25.22.10

After the configuration is completed, you can ssh localhost directly into this machine without a password.

4. After the configuration is complete, test it

First create an input directory:

mkdir input

cp conf/*.xml input/

Then run one of the bundled examples:

bin/hadoop jar hadoop-examples-1.2.1.jar grep input output 'dfs[a-z.]+'

The output directory is generated automatically; you can list it and view the results inside.
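In standalone mode the results live on the local file system, so a plain cat is enough to inspect them (the exact matches depend on your conf files):

# each result line pairs a count with a matched term, e.g. "1  dfsadmin"
cat output/*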

To check through the web interface:

http://172.25.22.10:50070/dfshealth.jsp shows the files saved in HDFS.

http://172.25.22.10:50030/jobtracker.jsp shows the jobs being processed.

IV. Pseudo-Distributed Mode Configuration

Since pseudo-distributed mode also runs on a single node, it builds on the single-node setup; only a few configuration files need to be edited.

1. Edit the configuration files under conf (each <property> block below goes inside the file's <configuration> element)

# vim core-site.xml, add:

<property>
  <name>fs.default.name</name>
  <value>hdfs://172.25.22.10:9000</value>
</property>

# vim mapred-site.xml, add:

<property>
  <name>mapred.job.tracker</name>
  <value>172.25.22.10:9001</value>
</property>

# vim hdfs-site.xml, add:

<property>
  <name>dfs.replication</name>
  <value>1</value>
</property>

With that, the configuration file changes are complete.

2. Format Hadoop

bin/hadoop namenode -format

3. Start all processes

bin/start-all.sh

4. View the processes

/home/hadoop/java/bin/jps

All of the processes run on this one node:

JobTracker # is responsible for task scheduling

TaskTracker # is responsible for data processing

SecondaryNameNode

NameNode # contains metadata information

DataNode # data node, storing data
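For reference, a jps listing at this point looks roughly like the following (the PIDs are illustrative):

8821 NameNode
8935 DataNode
9051 SecondaryNameNode
9143 JobTracker
9268 TaskTracker
9301 Jps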

5. Test

bin/hadoop fs -put conf/ input

bin/hadoop jar hadoop-examples-1.2.1.jar grep input output 'dfs[a-z.]+'

The input here is completely different from the earlier one: it is stored in the distributed file system rather than on the local disk.

You can verify this through the web interface as well.
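Because the data now lives in HDFS, inspection goes through hadoop fs commands rather than the local shell:

# list and read the job output stored in HDFS
bin/hadoop fs -ls output
bin/hadoop fs -cat output/*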


V. Fully Distributed Mode Configuration

Three hosts:

172.25.22.10 master

172.25.22.11 slave

172.25.22.12 slave

The configuration on all three hosts must be identical. So how do you keep the slave hosts' configuration files the same as the master's? Copying with scp would work but is tedious, so NFS sharing is used instead.

1. Starting from the three configuration files already set up for pseudo-distributed mode, configure the masters and slaves files (their contents are sketched below).
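Given the host table above, the two files in conf/ would plausibly contain the following (an assumption consistent with that layout, not quoted from the original):

# conf/masters
172.25.22.10

# conf/slaves
172.25.22.11
172.25.22.12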

2. Configure NFS file sharing

On all three hosts, first install the required services:

yum install -y rpcbind

yum install -y nfs-utils

Then enable the rpcbind service.
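On a RHEL 6-era system, which the yum package names suggest, starting the services might look like this (SysV init assumed):

# rpcbind on every host; the nfs service on the master, which exports the share
/etc/init.d/rpcbind start
/etc/init.d/nfs start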

3. Configure the /etc/exports file on the master:

/home/hadoop 172.25.22.0/255.255.255.0(rw,anonuid=900,anongid=900)

Then re-export the shares and verify:

exportfs -rv

exportfs -v

4. The slave nodes need to stay in sync with the master node's files, so on each slave node:

showmount -e 172.25.22.10

mount 172.25.22.10:/home/hadoop/ /home/hadoop/

After mounting, you can see that the files are all synchronized.
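To make the mount survive reboots, one option (not in the original steps) is an /etc/fstab entry on each slave:

# /etc/fstab entry on each slave (illustrative)
172.25.22.10:/home/hadoop  /home/hadoop  nfs  defaults  0 0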

5. Set a password for the ordinary user hadoop on the master node, and then:

ssh-keygen

ssh-copy-id 172.25.22.10

Because /home/hadoop is shared over NFS, the key pair and authorized_keys file land in every host's home directory at once, so this single step gives the ordinary user password-free logins among all three hosts.
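A quick way to confirm the trust works (run from any of the three hosts; each command should print a hostname without prompting for a password):

ssh 172.25.22.10 hostname
ssh 172.25.22.11 hostname
ssh 172.25.22.12 hostname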

The configuration is complete and needs to be tested:

Note: if Hadoop was originally installed under the root user and you now want to run it as an ordinary user, first create a user with the same name on all three hosts, keeping the uid and gid consistent, and then migrate the directories created under root:

mv hadoop-1.2.1/ /home/hadoop/

Then change the owner and group of the directory:

chown -R hadoop.hadoop /home/hadoop/*

Finally, recreate the soft links and redo the Java environment settings as before.

6. Test it in the distributed environment

Format first, and then start all the processes:

bin/hadoop namenode -format

bin/start-all.sh

Check the processes on each node.

Then run a program to test; execute the wordcount example:

bin/hadoop fs -put conf/ input

bin/hadoop jar hadoop-examples-1.2.1.jar wordcount input output
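When the job finishes, the word counts can be read back out of HDFS; in Hadoop 1.x the reducer output conventionally lands in files named part-r-00000 and so on:

# read the wordcount results from HDFS
bin/hadoop fs -ls output
bin/hadoop fs -cat output/part-r-00000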
