This article explains in detail how to build a Hadoop distributed cluster. The editor finds it very practical and shares it here as a reference; I hope you get something out of reading it.
The steps to build the Hadoop distributed cluster environment are as follows.
Experimental environment:
System: Windows 7 (host)
Memory: 8 GB (no less than 8 GB is recommended, because you will be running three virtual machines)
Hard drive: an SSD is recommended
Virtual machine: VMware Workstation 12
Linux: CentOS 7
JDK: jdk1.7.0_67
Hadoop: hadoop-2.5.0.tar.gz
1. Install the VMware virtual machine environment
2. Install the CentOS operating system
3. Modify the hostnames and configure the network
4. Configure SSH password-less login
5. Upload the JDK and configure environment variables
6. Upload Hadoop and configure environment variables
7. Modify the Hadoop configuration files
8. Format the namenode
9. Start Hadoop and test
1 Install the VMware virtual machine environment
This step is simple: download the installation package, install it, then open VMware and enter a registration code after the installation succeeds.
5A02H-AU243-TZJ49-GTC7K-3C61N
GA1T2-4JF1P-4819Y-GDWEZ-XYAY8
FY1M8-6LG0H-080KP-YDPXT-NVRV2
ZA3R8-0QD0M-489GP-Y5PNX-PL2A6
FZ5XR-A3X16-H819Q-RFNNX-XG2EA
ZU5NU-2XWD2-0806Z-WMP5G-NUHV6
VC58A-42Z8H-488ZP-8FXZX-YGRW8
2 Install the CentOS operating system
I installed three Linux machines here, one as the namenode and two as datanodes, all running 64-bit CentOS 7. CentOS is recommended for a simple reason: it is free and open source, a heavyweight Linux distribution, and close to the production environment. Other distributions will of course also work.
Download address: http://isoredirect.centos.org/centos/7/isos/x86_64/CentOS-7-x86_64-DVD-1611.iso
The installation process is simple, so I won't go into details here.
3 Modify the hostnames and configure the network
Namenode: master
Datanodes: slave1, slave2
Execute the following command:
vi /etc/hostname
Change localhost to master. On the other two machines, change it to slave1 and slave2 respectively.
Then execute the following command:
vi /etc/hosts
Add the IP addresses and hostnames of all three Linux machines:
192.168.149.138 master
192.168.149.139 slave1
192.168.149.140 slave2
Adjust the addresses above to match your own hosts.
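On CentOS 7 there is also a one-command alternative (my addition, not part of the original steps): hostnamectl writes /etc/hostname for you and applies the change immediately.
hostnamectl set-hostname master    # run on the namenode
hostnamectl set-hostname slave1    # run on the first datanode
hostnamectl set-hostname slave2    # run on the second datanode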
4 Configure SSH password-less login
Execute the following command on master:
ssh-keygen
The previous step generates the public key and private key.
cd ~/.ssh
Executing the ll command in the .ssh directory shows two files, id_rsa and id_rsa.pub; the first is the private key and the second is the public key.
Then execute:
ssh-copy-id -i /root/.ssh/id_rsa.pub root@192.168.149.139
ssh-copy-id -i /root/.ssh/id_rsa.pub root@192.168.149.140
This copies the public key to the other two Linux machines.
Then test whether it was successful:
ssh 192.168.149.139
If no password is required, the configuration is successful.
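One extra step worth doing (my addition, based on how the Hadoop start scripts work): start-dfs.sh also connects to master itself over SSH, so copy the key to master as well and verify it:
ssh-copy-id -i /root/.ssh/id_rsa.pub root@192.168.149.138    # master itself
ssh 192.168.149.138    # should log in without a password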
5 Upload the JDK and configure environment variables
Upload the JDK archive to CentOS.
Execute the following commands:
tar -zxvf jdk1.7.0_67.tar.gz
vi /etc/profile
After configuring the Java environment variables, run
source /etc/profile to let the configuration take effect.
Check whether it worked:
java -version
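The original does not show the actual profile lines; a minimal sketch, assuming the JDK was extracted to /usr/local/jdk1.7.0_67 (adjust the path to your layout), looks like this:
# Append to /etc/profile
export JAVA_HOME=/usr/local/jdk1.7.0_67
export PATH=$PATH:$JAVA_HOME/bin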
6 Upload Hadoop and configure environment variables
Upload the Hadoop installation package to CentOS.
Execute the following commands:
tar -zxvf hadoop-2.5.0.tar.gz
mv hadoop-2.5.0 hadoop (rename the extracted directory; note that it is the directory, not the tarball, that gets renamed)
Configure the Hadoop environment variables the same way as for Java; the PATH entries are bin and sbin.
Check whether it was successful:
hadoop version
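Again, a minimal sketch of the profile lines, assuming the renamed directory was moved to /usr/local/hadoop:
# Append to /etc/profile
export HADOOP_HOME=/usr/local/hadoop
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
Run source /etc/profile afterwards, just as for the JDK.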
7 Modify the Hadoop configuration files
The files that need to be modified are hadoop-env.sh, core-site.xml, hdfs-site.xml, mapred-site.xml and yarn-site.xml.
hadoop-env.sh
Add the JAVA_HOME path to it.
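A minimal sketch of that change, again assuming the JDK lives at /usr/local/jdk1.7.0_67:
# In etc/hadoop/hadoop-env.sh, set JAVA_HOME explicitly:
export JAVA_HOME=/usr/local/jdk1.7.0_67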
core-site.xml
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://master:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/usr/local/hadoop/tmp</value>
  </property>
</configuration>
hdfs-site.xml
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>
mapred-site.xml
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
yarn-site.xml
<configuration>
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>master</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
</configuration>
Note: don't forget to copy the Hadoop directory on master to slave1 and slave2 after the configuration is done, using the commands:
scp -r /usr/local/hadoop slave1:/usr/local/
scp -r /usr/local/hadoop slave2:/usr/local/
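One thing the original skips (my addition, based on standard Hadoop 2.x behaviour): start-dfs.sh decides where to launch datanodes from the etc/hadoop/slaves file, so on master list the two datanode hostnames in it before starting:
# etc/hadoop/slaves on master, one datanode hostname per line
slave1
slave2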
8 Format the namenode
Execute the following commands:
hadoop namenode -format
start-dfs.sh
start-yarn.sh
9 Start Hadoop and test
Execute the following command on master to test:
jps
If the output looks like the following, the cluster was built successfully:
ResourceManager
Jps
NameNode
NodeManager
SecondaryNameNode
Execute the same test on slave1 and slave2:
Jps
NodeManager
DataNode
Jps
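As a further sanity check (my addition, not part of the original), you can run a small HDFS smoke test from master; the directory name /test here is arbitrary:
hadoop fs -mkdir /test    # create a directory in HDFS
hadoop fs -ls /           # the new directory should appear in the listing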
This is the end of the article on how to build a Hadoop distributed cluster. I hope the content above is helpful to you; if you think the article is good, please share it so more people can see it.