This article mainly explains how to build a Hadoop operating environment. The content is simple and clear and easy to follow; please read along to learn how to set up the environment step by step.
Hadoop is a distributed system infrastructure that is widely used in the big data field, and it places the data processing engine as close to the stored data as possible. The core of Hadoop consists of HDFS and MapReduce: HDFS provides storage for massive data, and MapReduce provides computation over that data.
We build the environment on a Linux operating system. The following is the system information of the machine used for this Hadoop setup.
hadoop@ubuntu:~$ cat /etc/os-release
NAME="Ubuntu"
VERSION="14.04.5 LTS, Trusty Tahr"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 14.04.5 LTS"
VERSION_ID="14.04"
HOME_URL="http://www.ubuntu.com/"
SUPPORT_URL="http://help.ubuntu.com/"
BUG_REPORT_URL="http://bugs.launchpad.net/ubuntu/"
Next, create a new user. This step is optional and can be decided according to your actual situation; here we create a user named hadoop.
# create the new user
sudo useradd -m hadoop -s /bin/bash
# set its password
sudo passwd hadoop
# grant the hadoop user administrator privileges
sudo adduser hadoop sudo
# switch to the hadoop user
su hadoop
First, set up passwordless SSH login. This step is recommended: since a distributed environment is made up of multiple servers, passwordless login makes working across them much easier.
# first check whether you can ssh to localhost without a password
ssh localhost
# if you cannot ssh to localhost without a password, execute the following commands
ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
cat ~/.ssh/id_rsa.pub > ~/.ssh/authorized_keys
chmod 0600 ~/.ssh/authorized_keys
With these preparations complete, it is time to officially deploy the Hadoop environment. First, download the latest stable Hadoop release from the Apache website (http://hadoop.apache.org), extract it to a directory of your choice, and change into that directory. Running ./bin/hadoop and ./bin/hadoop version displays the usage document and the version information of the hadoop script, respectively. Then modify the ./etc/hadoop/core-site.xml and ./etc/hadoop/hdfs-site.xml configuration files as shown below.
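For concreteness, the download and extraction could look something like the sketch below. It assumes Hadoop 2.9.1 (the version of the example jar used later in this article), the /opt/bigdata/hadoop directory used in the following sections, and the Apache archive as the download source; adjust these to your own situation.

# download and unpack a Hadoop release (assumed version and mirror)
wget https://archive.apache.org/dist/hadoop/common/hadoop-2.9.1/hadoop-2.9.1.tar.gz
sudo mkdir -p /opt/bigdata
sudo tar -zxf hadoop-2.9.1.tar.gz -C /opt/bigdata
sudo mv /opt/bigdata/hadoop-2.9.1 /opt/bigdata/hadoop
# give the hadoop user ownership of the installation
sudo chown -R hadoop:hadoop /opt/bigdata/hadoop
cd /opt/bigdata/hadoop
./bin/hadoop           # prints the usage document
./bin/hadoop version   # prints the version information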
Modify the ./etc/hadoop/core-site.xml configuration file to add the following configuration:
<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://localhost:9090</value>
    </property>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>file:/opt/bigdata/hadoop/tmp</value>
        <description>A base for other temporary directories.</description>
    </property>
</configuration>
Modify the ./etc/hadoop/hdfs-site.xml configuration file to add the following configuration:
<configuration>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>file:/opt/bigdata/hadoop/tmp/dfs/name</value>
    </property>
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>file:/opt/bigdata/hadoop/tmp/dfs/data</value>
    </property>
</configuration>
After the configuration is modified, format the file system as follows.
hadoop@ubuntu:/opt/bigdata/hadoop$ ./bin/hdfs namenode -format
# the command produces a lot of output; the following message indicates success
INFO common.Storage: Storage directory /opt/bigdata/hadoop/tmp/dfs/name has been successfully formatted.
You may encounter the following two problems when formatting the NameNode.
If you see the message "Error: JAVA_HOME is not set and could not be found.", the JAVA_HOME environment variable is not configured properly. Either reconfigure it, or edit the ./etc/hadoop/hadoop-env.sh file and change export JAVA_HOME=${JAVA_HOME} to an absolute path, for example export JAVA_HOME=/usr/lib/jvm/java-8.
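If you are not sure what the absolute JDK path is on your machine, the following commands are one way to find it on Ubuntu (a sketch; the exact path depends on how your JDK was installed):

# resolve the real location of the java binary and strip the /bin/java suffix
readlink -f /usr/bin/java | sed 's:/bin/java::'
# or list the Java installations registered with the alternatives system
update-alternatives --list java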
If you see the error ERROR namenode.NameNode: java.io.IOException: Cannot create directory /opt/bigdata/hadoop/tmp/dfs/name/current, the configured /opt/bigdata/hadoop/tmp directory is not writable. This can be solved by granting write permission, for example with sudo chmod -R a+w /opt/bigdata/hadoop/tmp.
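An alternative to opening up the permissions, assuming the daemons run as the hadoop user created earlier, is to create the directory and give that user ownership of it (a sketch):

# create the configured temporary directory and hand it to the hadoop user
sudo mkdir -p /opt/bigdata/hadoop/tmp
sudo chown -R hadoop:hadoop /opt/bigdata/hadoop/tmp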
Next, execute ./sbin/start-dfs.sh to start the NameNode and DataNode daemons, then use jps to check that NameNode, DataNode, and SecondaryNameNode have all started successfully:
hadoop@ubuntu:/opt/bigdata/hadoop$ jps
4950 Jps
3622 SecondaryNameNode
3295 DataNode
2910 NameNode
After a successful startup, you can open http://localhost:50070/ in a browser to view the NameNode web interface.
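If you prefer the command line, a quick way to confirm the web interface is up is to request it with curl and check for an HTTP 200 status (a sketch):

# prints 200 if the NameNode web UI is reachable
curl -s -o /dev/null -w '%{http_code}\n' http://localhost:50070/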
So far, the Hadoop single-node (pseudo-distributed) environment has been built successfully, so let us run a pseudo-distributed Hadoop example. In stand-alone mode Hadoop works on the local file system, while in (pseudo-)distributed mode it works on data stored in HDFS. First create a user directory in HDFS by executing ./bin/hdfs dfs -mkdir -p /user/hadoop, and then run the following commands to copy the input files into the distributed file system.
# this step can be skipped, because the directory will be created automatically
# ./bin/hdfs dfs -mkdir input
./bin/hdfs dfs -put etc/hadoop input
# view the list of files copied to HDFS
./bin/hdfs dfs -ls input
Next, run one of the MapReduce examples that ships with Hadoop to see the effect. The grep example below searches the files in input for strings matching the regular expression dfs[a-z.]+ and counts the matches:
./bin/hadoop jar ./share/hadoop/mapreduce/hadoop-mapreduce-examples-2.9.1.jar grep input output 'dfs[a-z.]+'
After the job finishes, use the ./bin/hdfs dfs -cat output/* command to view the output files on HDFS, or use the following commands to copy them to the local file system and view them there.
./bin/hdfs dfs -get output ./output
cat ./output/*
To shut down Hadoop, simply run the ./sbin/stop-dfs.sh command.
Thank you for reading. The above is the content of "how to build Hadoop operating environment". After studying this article, you should have a deeper understanding of how to build a Hadoop operating environment, though the specific steps still need to be verified in practice. More articles on related topics will follow; you are welcome to keep reading!