

How to Build a Hadoop Cluster on a Linux System


This article explains how to build a Hadoop cluster on a Linux system. The content is detailed and easy to understand; readers interested in the topic can follow along step by step. I hope you will find it helpful.

About Hadoop: Hadoop is a distributed system infrastructure developed by the Apache Foundation. Users can develop distributed programs without knowing the underlying details of the distribution, making full use of the power of a cluster for high-speed computing and storage.

In a nutshell, Hadoop is a software platform that makes it easier to develop and run software that handles large-scale data. The platform is implemented in the object-oriented programming language Java and is highly portable.

Steps to install Hadoop on a Linux system: 1. Create a new user named hadoop and set a password (also granting it administrator privileges):
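
A typical command sequence for this step (a sketch, assuming Ubuntu; the user name hadoop follows the text above):

    sudo useradd -m hadoop -s /bin/bash   # create the user with a home directory and bash as its login shell
    sudo passwd hadoop                    # set the password
    sudo adduser hadoop sudo              # add the user to the sudo group for administrator privileges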

However, we ran into a problem at the very beginning: the Hadoop and JDK archives had been downloaded under Windows, and when we shared the files into Ubuntu they could only be shared to the first user that was created (I do not know why). Since the network inside Ubuntu was very slow, I did not create a new user and installed everything under the original user instead (a successful run later proved that this works).

2. Update apt and install vim:
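
On Ubuntu this is typically:

    sudo apt-get update        # refresh the package index
    sudo apt-get install vim   # install the vim editor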

3. Install SSH and configure passwordless SSH login. (At the first SSH login you will see a prompt; enter yes, then enter the hadoop password as prompted, and you can log in to the local machine.)
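
A common sequence for this step (a sketch, assuming Ubuntu's openssh-server package):

    sudo apt-get install openssh-server    # install the SSH server
    ssh localhost                          # first login: type yes at the prompt, then the password
    exit                                   # leave the SSH session just opened
    cd ~/.ssh/
    ssh-keygen -t rsa                      # press Enter at every prompt to accept the defaults
    cat ./id_rsa.pub >> ./authorized_keys  # authorize the key so future logins need no password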

4. Install the JDK (from the Linux command line, execute the following shell commands):
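
For example, first create a directory to hold the JDK (the path /usr/lib/jvm is a common convention and an assumption here):

    cd /usr/lib
    sudo mkdir jvm   # directory that will hold the unpacked JDK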

Decompression process:
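
A sketch of the extraction, assuming the archive jdk-8u162-linux-x64.tar.gz was shared into ~/Downloads (adjust the file name to the version you actually downloaded):

    cd ~/Downloads
    sudo tar -zxvf ./jdk-8u162-linux-x64.tar.gz -C /usr/lib/jvm   # unpack into /usr/lib/jvm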

After the JDK file is unzipped, you can execute the following command to check the /usr/lib/jvm directory:
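
For example:

    cd /usr/lib/jvm
    ls   # a directory such as jdk1.8.0_162 should now be listed (the name depends on your JDK version)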

Continue with the following command to set the environment variables:
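
That command is typically:

    vim ~/.bashrc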

This opens the hadoop user's environment-variable configuration file ~/.bashrc in the vim editor; add the following lines at the beginning of the file (note your own JDK version number):
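
A sketch of the lines to add, assuming the JDK was unpacked to /usr/lib/jvm/jdk1.8.0_162:

    export JAVA_HOME=/usr/lib/jvm/jdk1.8.0_162
    export JRE_HOME=${JAVA_HOME}/jre
    export CLASSPATH=.:${JAVA_HOME}/lib:${JRE_HOME}/lib
    export PATH=${PATH}:${JAVA_HOME}/bin

Then reload the file and verify the installation:

    source ~/.bashrc   # make the new variables take effect in the current shell
    java -version      # should print the installed JDK version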

5. Install Hadoop 3.2.1 (Hadoop is usable as soon as it is unpacked). Enter the following command to check whether Hadoop is available; if it succeeds, the Hadoop version information is displayed:
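
Assuming Hadoop ends up in /usr/local/hadoop (as in the full sequence below), the check is:

    cd /usr/local/hadoop
    ./bin/hadoop version   # prints the Hadoop version information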

The full command sequence:
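
A sketch of the whole installation, assuming hadoop-3.2.1.tar.gz was shared into ~/Downloads:

    sudo tar -zxf ~/Downloads/hadoop-3.2.1.tar.gz -C /usr/local   # unpack to /usr/local
    cd /usr/local/
    sudo mv ./hadoop-3.2.1/ ./hadoop   # rename to a version-independent path
    sudo chown -R hadoop ./hadoop      # give the hadoop user ownership of the files
    cd /usr/local/hadoop
    ./bin/hadoop version               # verify: prints the version if everything worked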

6. Hadoop stand-alone configuration (non-distributed). Hadoop's default mode is non-distributed (local) mode, which runs without any further configuration. Running an example:

Here we choose to run the grep example. We take all the files in the input folder as input, filter for words that match the regular expression dfs[a-z.]+ and count the number of occurrences, and finally output the results to the output folder, as sketched below.
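
A sketch of the commands for this example, following the standard single-node layout under /usr/local/hadoop:

    cd /usr/local/hadoop
    mkdir ./input
    cp ./etc/hadoop/*.xml ./input   # use the bundled configuration files as sample input
    ./bin/hadoop jar ./share/hadoop/mapreduce/hadoop-mapreduce-examples-*.jar grep ./input ./output 'dfs[a-z.]+'
    cat ./output/*                  # show the matched words and their counts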

The final results are consistent with the tutorial:

Delete ./output (Hadoop does not overwrite an existing output directory, so it must be removed before the example is run again):
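
That is:

    rm -r ./output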

7. Hadoop pseudo-distributed configuration. Modify the configuration file core-site.xml (it is more convenient to edit it with gedit: gedit ./etc/hadoop/core-site.xml):

Modified to:
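
A minimal sketch of the modified <configuration> section, using the values from the standard pseudo-distributed tutorial (the tmp path and port are the usual tutorial choices; adapt them to your setup):

    <configuration>
        <property>
            <name>hadoop.tmp.dir</name>
            <value>file:/usr/local/hadoop/tmp</value>
            <description>A base for other temporary directories.</description>
        </property>
        <property>
            <name>fs.defaultFS</name>
            <value>hdfs://localhost:9000</value>
        </property>
    </configuration>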

Similarly, modify the configuration file hdfs-site.xml:
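
Its <configuration> section typically becomes the following (replication is 1 because there is only a single node; the paths mirror hadoop.tmp.dir above):

    <configuration>
        <property>
            <name>dfs.replication</name>
            <value>1</value>
        </property>
        <property>
            <name>dfs.namenode.name.dir</name>
            <value>file:/usr/local/hadoop/tmp/dfs/name</value>
        </property>
        <property>
            <name>dfs.datanode.data.dir</name>
            <value>file:/usr/local/hadoop/tmp/dfs/data</value>
        </property>
    </configuration>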

After the configuration is complete, format the NameNode:
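
From the Hadoop installation directory:

    cd /usr/local/hadoop
    ./bin/hdfs namenode -format   # the log should report that the storage directory has been successfully formatted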

Start the NameNode and DataNode daemons:
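
Typically:

    cd /usr/local/hadoop
    ./sbin/start-dfs.sh   # starts the NameNode, DataNode and SecondaryNameNode daemons
    jps                   # list the running Java processes to confirm the daemons are up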

Fortunately, everything started without errors, just as in the tutorial:

Open the NameNode web interface on port 9870 (http://localhost:9870) in a browser:

Getting this far shows that the setup has been successful!

Fourth, the problems encountered in the experiment and their solutions. Problem 1: file transfer between Ubuntu and Windows (drag and drop, shared clipboard) did not work. Solution: install the VirtualBox Guest Additions and configure the relevant settings (the details are easy to find online):

Problem 2: under the newly created user, files could not be dragged in (not even as administrator), so Hadoop was not installed under the new hadoop user; the installation was still successful in the end.

That concludes this share on how to build a Hadoop cluster on a Linux system. I hope the content above has helped you improve. If you want to learn more, please keep an eye on further updates. Thank you for reading!
