Example Analysis of Linux stand-alone pseudo-distributed installation

2025-02-28 Update From: SLTechnology News&Howtos


This article explains in detail, with a worked example, a stand-alone (pseudo-distributed) Hadoop installation on Linux. The editor finds it very practical and shares it for your reference; I hope you get something out of it.

1. Server information

One Linux server (Linux 5.5, x86_64) for installing Hadoop

2. Software required

JDK 1.6.0_31

hadoop-2.2.0.tar.gz

3. Set up passwordless ssh login

The main reason is that otherwise you would have to type the password several times every time you start Hadoop, since it logs in to the node over ssh.

(1) Log in to this machine without a password

$ ssh-keygen -t rsa

Press Enter at each prompt to accept the default file name for the key (id_rsa); this generates two files in ~/.ssh/: id_rsa and id_rsa.pub.

$ ssh-keygen -t dsa

Press Enter at each prompt to accept the default file name for the key (id_dsa); this generates two files in ~/.ssh/: id_dsa and id_dsa.pub.

$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys    (append the generated public key to authorized_keys)

$ cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys    (append the generated public key to authorized_keys)

$ chmod 600 ~/.ssh/authorized_keys

Then run ssh localhost to verify that it works. You are asked to type yes the first time, but not on later logins.
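The permission step matters: sshd ignores an authorized_keys file that is group- or world-readable, and the mode argument is octal (600, not 60000). A minimal sketch that reproduces the step on a throwaway directory instead of the real ~/.ssh:

```shell
# Sketch: mimic the authorized_keys permission setup on a temporary directory.
# On a real host the target would be ~/.ssh/authorized_keys.
tmp=$(mktemp -d)
mkdir -p "$tmp/.ssh"
touch "$tmp/.ssh/authorized_keys"
chmod 600 "$tmp/.ssh/authorized_keys"          # owner read/write only
stat -c '%a' "$tmp/.ssh/authorized_keys"       # prints 600 (GNU stat)
rm -rf "$tmp"
```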

4. Install hadoop

Make sure the JDK is installed on the server where Hadoop is to be installed.

Assume Hadoop is installed in the /home/username/hadoop directory (username is the operating system login user); this is referred to below as the Hadoop installation directory.

(1) decompress hadoop-2.2.0.tar.gz into the Hadoop installation directory.

(2) configure system environment variables

You can modify the /etc/profile file by adding the following at the end; this article instead modifies /home/username/.bash_profile. Log in again over ssh after the change so the variables take effect.

export HADOOP_PREFIX="/home/username/hadoop/hadoop-2.2.0"

PATH="$PATH:$HADOOP_PREFIX/bin:$HADOOP_PREFIX/sbin"

export HADOOP_MAPRED_HOME=${HADOOP_PREFIX}

export HADOOP_COMMON_HOME=${HADOOP_PREFIX}

export HADOOP_HDFS_HOME=${HADOOP_PREFIX}

export HADOOP_YARN_HOME=${HADOOP_PREFIX}

export HADOOP_CONF_DIR="${HADOOP_PREFIX}/etc/hadoop"

export YARN_CONF_DIR="${HADOOP_PREFIX}/etc/hadoop"

export HADOOP_COMMON_LIB_NATIVE_DIR=${HADOOP_PREFIX}/lib/native

export HADOOP_OPTS="-Djava.library.path=$HADOOP_PREFIX/lib"
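To check that the variables took effect after logging in again, a minimal sketch (using the same install prefix assumed above) writes the exports to a temporary profile fragment, sources it, and echoes one of the derived variables:

```shell
# Sketch: verify the profile fragment expands HADOOP_CONF_DIR as expected.
# The path is the article's assumed install location, not a fixed requirement.
profile=$(mktemp)
cat > "$profile" <<'EOF'
export HADOOP_PREFIX="/home/username/hadoop/hadoop-2.2.0"
export HADOOP_CONF_DIR="${HADOOP_PREFIX}/etc/hadoop"
EOF
. "$profile"
echo "$HADOOP_CONF_DIR"   # prints /home/username/hadoop/hadoop-2.2.0/etc/hadoop
rm -f "$profile"
```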

(3) configure hadoop

The following configuration files need to be modified:

hadoop-env.sh

Modify JAVA_HOME. The path must be specified as the real path and cannot reference ${JAVA_HOME}, otherwise you get the error "JAVA_HOME is not set" at run time. The configuration is as follows:

export JAVA_HOME=/usr/java/jdk1.6.0_26

core-site.xml
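The article does not show the contents of core-site.xml. A minimal pseudo-distributed sketch, assuming HDFS on localhost port 9000 (a common choice, not confirmed by the source), might look like:

```xml
<configuration>
  <!-- Point the default file system at the local NameNode. -->
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>
```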

hdfs-site.xml
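A minimal hdfs-site.xml sketch for a single node, assuming a replication factor of 1 and the name/data directories the article mentions below:

```xml
<configuration>
  <!-- Single node: one copy of each block is enough. -->
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:/home/username/hadoop/dfs/name</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:/home/username/hadoop/dfs/data</value>
  </property>
</configuration>
```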

Here /home/username/hadoop/dfs/name and /home/username/hadoop/dfs/data are directories in the local file system and need to be created first.

mapred-site.xml
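A minimal mapred-site.xml sketch (in Hadoop 2.2.0 this file is typically created by copying mapred-site.xml.template), assuming MapReduce runs on YARN:

```xml
<configuration>
  <!-- Run MapReduce jobs on the YARN framework. -->
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
```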

yarn-site.xml

Note that the value of the yarn.nodemanager.aux-services property must be mapreduce_shuffle, not mapreduce.shuffle (note the "_" versus "."), otherwise an error will occur.
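A minimal yarn-site.xml sketch with the property spelled as required:

```xml
<configuration>
  <!-- mapreduce_shuffle (underscore), as required in Hadoop 2.2.0. -->
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
</configuration>
```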

5. Start hadoop

After completing the above configuration, you can check whether it was successful.

(1) format Hadoop's file system, HDFS

Before starting Hadoop, you need to format its file system, HDFS. Go to the /home/username/hadoop/hadoop-2.2.0/bin folder and execute the following command:

$ hdfs namenode -format

(2) start hadoop

After the file system is formatted successfully, go to the /home/username/hadoop/hadoop-2.2.0/sbin directory and start Hadoop by executing the following command:

$ start-all.sh    (this command is deprecated in Hadoop 2.2.0)

Hadoop 2.2.0 recommends starting with the following commands.

Start HDFS first:

$ start-dfs.sh

Or

$ hadoop-daemon.sh start namenode

$ hadoop-daemon.sh start datanode

Then start the YARN daemons:

$ start-yarn.sh

Or

$ yarn-daemon.sh start resourcemanager

$ yarn-daemon.sh start nodemanager

After startup completes, visit the following addresses to check the status. If startup succeeded, the corresponding page is displayed:

HDFS web interface: http://x.x.x.x:50070/dfshealth.jsp

Datanode: http://x.x.x.x:50075/

ResourceManager (JobTracker replacement): http://x.x.x.x:8088/cluster

NodeManager (TaskTracker replacement): http://x.x.x.x:8042/node

This is the end of this article on "Example Analysis of Linux stand-alone pseudo-distributed installation". I hope the above content is helpful and helps you learn more. If you think the article is good, please share it for more people to see.
