
What is the operation method of stand-alone and pseudo-distributed mode in Hadoop 0.20.0 deployment and testing


This article explains in detail the operation methods of stand-alone and pseudo-distributed mode in Hadoop 0.20.0 deployment and testing. It is shared here for your reference; I hope you come away with a better understanding of the relevant knowledge after reading it.

1. Stand-alone mode: Local (Standalone) Mode

By default, Hadoop is configured to run in non-distributed mode, as a single Java process. This is very helpful for debugging.

With this default configuration, you can run stand-alone mode directly; please refer to the manual for the specific operation.
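For reference, a minimal sketch of a stand-alone run, following the grep example that ships with the distribution (the file names below are the stock ones; adjust as needed). It copies the unpacked conf directory in as input, runs the example jar, and reads the local output:

$ mkdir input
$ cp conf/*.xml input
$ bin/hadoop jar hadoop-*-examples.jar grep input output 'dfs[a-z.]+'
$ cat output/*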

2. The operation method of pseudo-distributed mode

In a Hadoop 0.20.0 deployment, Hadoop can run on a single node in so-called pseudo-distributed mode, where each Hadoop daemon runs as a separate Java process.

Version 0.20 changes the configuration layout considerably compared with earlier versions: what used to be configured in a single hadoop-site.xml is now split across the following three files:

conf/core-site.xml, conf/hdfs-site.xml, conf/mapred-site.xml

For the specific configuration, see the manual.

Take conf/core-site.xml as an example:

<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>

If a connection error occurs, you can try replacing localhost with the machine's IP address or 127.0.0.1.
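For completeness, minimal sketches of the other two files for a pseudo-distributed setup. These are the values used in the standard 0.20 single-node guide; treat them as a starting point rather than the only valid settings.

conf/hdfs-site.xml:

<configuration>
  <property>
    <!-- one copy of each block is enough on a single node -->
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>

conf/mapred-site.xml:

<configuration>
  <property>
    <!-- address of the local JobTracker daemon -->
    <name>mapred.job.tracker</name>
    <value>localhost:9001</value>
  </property>
</configuration>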

Password-free SSH setup in the Hadoop 0.20.0 deployment

Now confirm whether you can log in to localhost with ssh without entering a password:

$ ssh localhost

If you cannot log in to localhost via ssh without entering a password, execute the following commands:

$ ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa

$ cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
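If key-based login still fails after this, the permissions on ~/.ssh are often to blame: sshd ignores an authorized_keys file that is too widely readable. This is a general OpenSSH requirement, not something specific to Hadoop:

$ chmod 700 ~/.ssh
$ chmod 600 ~/.ssh/authorized_keys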

Execution

Format a new distributed file system:

$ bin/hadoop namenode -format
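A word of caution (an operational note, not part of the original steps): format a new filesystem only once. Reformatting over an existing one gives the NameNode a new namespace ID, and DataNodes still holding the old ID will refuse to start. If you really do want to start over, stop the daemons and clear the data directories first; assuming the default hadoop.tmp.dir of /tmp/hadoop-${user.name}, that looks like:

$ bin/stop-all.sh
$ rm -rf /tmp/hadoop-$USER   # assumes the default hadoop.tmp.dir; adjust if you changed it
$ bin/hadoop namenode -format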

Start the Hadoop daemons:

$ bin/start-all.sh

The logs of the Hadoop daemons are written to the ${HADOOP_LOG_DIR} directory (defaults to ${HADOOP_HOME}/logs).
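A quick sanity check at this point (assuming your JDK ships the jps tool, as Sun/Oracle JDKs do) is to list the running Java processes; in pseudo-distributed mode all five daemons should appear:

$ jps
# expect NameNode, DataNode, SecondaryNameNode, JobTracker and TaskTracker, plus Jps itself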

Browse the web interfaces of the NameNode and the JobTracker; by default they are at:

* NameNode - http://localhost:50070/

* JobTracker - http://localhost:50030/

At this point, first visit the NameNode web interface above; only when the page displays the status of HDFS properly should you proceed with the following steps.

1. Clicking "Browse the filesystem" for the first time may return a 404 error page.

2. Go back to the previous page and refresh it. When DFS is working properly, values such as "DFS Used" are no longer displayed as 0.

3. If not, repeat steps 1 and 2 until you succeed (a command-line alternative is sketched below).
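If the web interface stays unreliable, the same health check can be done from the shell (both commands exist in 0.20; see also the command summary at the end of this article):

$ bin/hadoop dfsadmin -safemode get   # should report "Safe mode is OFF" once startup completes
$ bin/hadoop dfsadmin -report         # should list one live DataNode in pseudo-distributed mode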

Copy the input files to the distributed file system: $ bin/hadoop fs -put conf input

Run the sample program provided with the distribution: $ bin/hadoop jar hadoop-*-examples.jar grep input output 'dfs[a-z.]+'

View the output file:

Copy the output file from the distributed file system to the local file system to view:

$ bin/hadoop fs -get output output

$ cat output/*


Or

View the output file on the distributed file system:

$ bin/hadoop fs -cat output/*

The grep example should produce output along these lines:

3 dfs.class
2 dfs.period
1 dfs.file
1 dfs.replication
1 dfs.servers
1 dfsadmin
1 dfsmetrics.log

When all operations are complete, stop the daemons: $ bin/stop-all.sh

Summary of commands in Hadoop 0.20.0 deployment and testing

Most of this can be learned from the built-in command help, so here I mainly introduce the few commands I use most. hadoop dfs followed by a parameter operates on HDFS, much like the corresponding Linux shell commands. For example:

hadoop dfs -ls /usr/root lists the contents of the /usr/root directory; if the path is omitted, it defaults to the current user's home directory.

hadoop dfs -rmr xxx deletes a directory recursively. When running a job repeatedly, use this command to delete the existing output folder before each run.

The command hadoop dfsadmin -report shows the global status of the DataNodes.

hadoop job followed by a parameter operates on the currently running jobs, e.g. -list, -kill, etc.

hadoop balancer is the command, mentioned earlier, for balancing the disk load across DataNodes.
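As a short worked sketch of the commands above (the paths and the job id are illustrative assumptions, not values from this article):

$ bin/hadoop dfs -ls /usr/root                 # list a directory; with no path, the current user's home is listed
$ bin/hadoop dfs -rmr output                   # recursively remove the old output directory before re-running
$ bin/hadoop dfsadmin -report                  # capacity and usage for each DataNode
$ bin/hadoop job -list                         # jobs currently running
$ bin/hadoop job -kill job_200906021030_0001   # kill a job by the id shown in -list (illustrative id)
$ bin/hadoop balancer                          # rebalance blocks across DataNodes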

That is all on the operation methods of stand-alone and pseudo-distributed mode in Hadoop 0.20.0 deployment and testing. I hope the above content is of some help to you and that you can learn more from it. If you think the article is good, you can share it for more people to see.
