Example Analysis of Hadoop Architecture and Pseudo-Distributed Installation

2025-04-02 Update From: SLTechnology News&Howtos

This article explains the Hadoop architecture and a pseudo-distributed installation through a worked example. It is practical material, shared here for reference.

1. A brief introduction to Hadoop

Hadoop: a distributed system infrastructure that serves as a platform for distributed storage and computation of big data. It has two core projects: HDFS and MapReduce.

HDFS: a distributed file system, which mainly solves the distributed storage problem.

MapReduce: a parallel computing framework, which mainly solves the distributed computing problem.

Characteristics of Hadoop: high reliability, high scalability, high performance, high fault tolerance, and low cost.

Hadoop architecture:

In MapReduce, an application that is ready to be submitted is called a job, and each unit of work that a job is divided into and that runs on a compute node is called a task.

The distributed file system (HDFS) provided by Hadoop is responsible for storing data on each node and provides high-throughput reads and writes.

Hadoop uses a master/slave architecture.

From the HDFS perspective (a file is split into blocks, 64 MB each by default):

Master node (only one): NameNode. Accepts user requests, maintains the directory structure of the file system, and manages the mapping between files and blocks and between blocks and DataNodes.

Slave nodes (several): DataNode. Store the blocks; each block is replicated to keep the data safe.
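The block arithmetic can be sketched in a few lines of shell. This is only an illustration under assumptions: the function names `block_count` and `raw_storage_mb` are made up for this example, the 64 MB default comes from the text above, and 3 is the stock HDFS replication factor (the pseudo-distributed setup later in this article sets it to 1).

```shell
#!/bin/sh
# Number of HDFS blocks needed for a file, rounding up.
# Usage: block_count FILE_SIZE_MB [BLOCK_SIZE_MB]   (64 MB default, as in the text)
block_count() {
    size_mb=$1
    block_mb=${2:-64}
    echo $(( (size_mb + block_mb - 1) / block_mb ))
}

# Raw storage consumed across the cluster: file size times replication factor.
# Usage: raw_storage_mb FILE_SIZE_MB [REPLICATION]
raw_storage_mb() {
    size_mb=$1
    replication=${2:-3}
    echo $(( size_mb * replication ))
}

block_count 200        # prints 4: three full 64 MB blocks plus one 8 MB block
raw_storage_mb 200 1   # prints 200: replication 1, one copy of the data
```

The last block of a file only occupies as much space as the data it holds, so the 200 MB file above does not waste the unused part of its fourth block.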

From the MapReduce perspective:

Master node (only one): JobTracker. Accepts computing jobs submitted by clients, assigns tasks to TaskTrackers for execution, and monitors their execution.

Slave nodes (several): TaskTracker. Execute the computing tasks assigned by the JobTracker.
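As a rough analogy only, the map, shuffle, and reduce phases of a word-count job can be imitated with a shell pipeline on one machine. The function names `map_phase`, `shuffle_phase`, and `reduce_phase` are invented for this sketch and are not part of any Hadoop API; in a real cluster the JobTracker would split this work into tasks and hand them to TaskTrackers.

```shell
#!/bin/sh
# Map: emit one "word<TAB>1" pair per word of the input.
map_phase() {
    tr -s ' ' '\n' | awk 'NF { print $1 "\t" 1 }'
}

# Shuffle: bring identical keys next to each other.
shuffle_phase() {
    sort
}

# Reduce: sum the counts for each key.
reduce_phase() {
    awk -F '\t' '{ count[$1] += $2 } END { for (k in count) print k "\t" count[k] }' | sort
}

# Two input lines stand in for two input splits.
printf 'a b a\nb a\n' | map_phase | shuffle_phase | reduce_phase
```

For the sample input the pipeline prints `a` with a count of 3 and `b` with a count of 2, tab-separated.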

2. Pseudo-distributed deployment of Hadoop

Install the virtual machine (network is set to host-only)

Set static IP (so that the host is on the same network segment as the virtual machine)

Modify the hostname and bind the hostname to the IP

Modify the hostname: the configuration file is /etc/sysconfig/network

Bind the hostname to the IP: the configuration file is /etc/hosts

Restart
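The binding step amounts to adding one "IP hostname" line to the hosts file. A minimal sketch follows, assuming a hypothetical static IP of 192.168.56.100 and the hostname hadoop used in the URLs later in this article; it works on a scratch file so it can be tried without root, and the function name `bind_host` is made up for this example.

```shell
#!/bin/sh
# Hypothetical values: replace with your VM's static IP and hostname.
VM_IP="192.168.56.100"
VM_HOSTNAME="hadoop"

# Append "IP hostname" to a hosts file unless the hostname is already bound.
bind_host() {
    hosts_file=$1
    if grep -qw "$VM_HOSTNAME" "$hosts_file"; then
        echo "already bound: $VM_HOSTNAME"
    else
        printf '%s %s\n' "$VM_IP" "$VM_HOSTNAME" >> "$hosts_file"
    fi
}

# Demonstrate on a scratch file; point it at /etc/hosts (as root) for real use.
scratch=$(mktemp)
printf '127.0.0.1 localhost\n' > "$scratch"
bind_host "$scratch"
cat "$scratch"
```

Checking before appending keeps the file clean if the step is repeated.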

Turn off the firewall and disable its automatic startup

View firewall status: service iptables status

Turn off the firewall: service iptables stop

View the firewall's runlevel settings: chkconfig --list | grep iptables

Disable automatic startup of the firewall: chkconfig iptables off

Configure SSH password-free login

Generate the key pair with the RSA algorithm: ssh-keygen -t rsa (the generated keys are stored in ~/.ssh)

Copy id_rsa.pub (run inside ~/.ssh): cp id_rsa.pub authorized_keys

Verify (log in to the local machine without a password): ssh localhost
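The SSH steps can be collected into one small function. This is a sketch under assumptions: the function name `setup_passwordless_ssh` is invented here, the key is created with an empty passphrase, and the demo writes into a scratch directory instead of ~/.ssh so nothing existing is overwritten. The permission bits are included because sshd commonly refuses keys kept in group-writable or world-writable locations.

```shell
#!/bin/sh
# Create an RSA key pair (if absent) and authorize it for login to this machine.
# Usage: setup_passwordless_ssh SSH_DIR   (SSH_DIR would be ~/.ssh for real use)
setup_passwordless_ssh() {
    ssh_dir=$1
    mkdir -p "$ssh_dir" && chmod 700 "$ssh_dir"
    # -N "" sets an empty passphrase, -f names the key file, -q keeps it quiet.
    [ -f "$ssh_dir/id_rsa" ] || ssh-keygen -q -t rsa -N "" -f "$ssh_dir/id_rsa"
    cat "$ssh_dir/id_rsa.pub" >> "$ssh_dir/authorized_keys"
    chmod 600 "$ssh_dir/authorized_keys"
}

# Demo in a scratch directory; skipped when ssh-keygen is not installed.
demo_dir=$(mktemp -d)
if command -v ssh-keygen >/dev/null 2>&1; then
    setup_passwordless_ssh "$demo_dir"
    ls "$demo_dir"
fi
```

After running it against ~/.ssh, `ssh localhost` should log in without asking for a password.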

Install JDK

Copy the JDK to the installation directory (I chose /usr/local/jdk; it must match the JDK environment variables and the Hadoop configuration)

Add execute permission to the JDK installation file: chmod u+x jdk.bin

Extract: ./jdk.bin

Rename the installation directory: mv jdk... jdk (jdk... is the versioned directory created by the extraction)

Add environment variables: the configuration file is /etc/profile

export JAVA_HOME=/usr/local/jdk

export PATH=.:$JAVA_HOME/bin:$PATH

Make the change take effect immediately: source /etc/profile

Verify: java -version
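The environment variable step can be tried safely against a scratch copy of the profile first. A minimal sketch, assuming the /usr/local/jdk path chosen above; it leaves out the leading `.` entry in PATH, which is optional, and uses a temporary file where a real installation would append to /etc/profile.

```shell
#!/bin/sh
# Append the JDK variables to a profile file (scratch copy here; /etc/profile for real use).
profile=$(mktemp)
cat >> "$profile" <<'EOF'
export JAVA_HOME=/usr/local/jdk
export PATH=$JAVA_HOME/bin:$PATH
EOF

# Load the settings into the current shell, as `source /etc/profile` does.
. "$profile"
echo "JAVA_HOME=$JAVA_HOME"
```

The quoted `'EOF'` keeps `$JAVA_HOME` and `$PATH` unexpanded in the file, so they are resolved each time the profile is loaded.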

Install Hadoop

Copy the Hadoop installation package to the installation directory

Extract the Hadoop installation package: tar -zxvf hadoop.tar.gz

Rename the installation directory: mv hadoop... hadoop (hadoop... is the versioned directory created by the extraction)

Add environment variables: the configuration file is /etc/profile

export HADOOP_HOME=/usr/local/hadoop

export PATH=.:$HADOOP_HOME/bin:$... (followed by the JDK environment variable entries)

Modify the Hadoop configuration files

The configuration files are located in the $HADOOP_HOME/conf directory

hadoop-env.sh (uncomment line 9 and change it to): export JAVA_HOME=/usr/local/jdk/

core-site.xml (see the end of the article for the configuration)

hdfs-site.xml (see the end of the article for the configuration)

mapred-site.xml (see the end of the article for the configuration)
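The three site files named above are small enough to generate from a script, which helps keep the hostname consistent across them. The following is a sketch, not the article's own method: the helper `write_site` and the variables `CONF_DIR` and `HADOOP_HOST` are invented for this example, the property values are the ones given at the end of the article, and the files go into a scratch directory unless `CONF_DIR` is set.

```shell
#!/bin/sh
# CONF_DIR and HADOOP_HOST are assumptions: adjust to your install path and hostname.
CONF_DIR=${CONF_DIR:-$(mktemp -d)}
HADOOP_HOST=${HADOOP_HOST:-hadoop}
mkdir -p "$CONF_DIR"

# Write one Hadoop site file.
# Usage: write_site FILE NAME VALUE [NAME VALUE ...]
write_site() {
    file=$1
    shift
    {
        echo '<?xml version="1.0"?>'
        echo '<configuration>'
        while [ "$#" -ge 2 ]; do
            printf '  <property>\n    <name>%s</name>\n    <value>%s</value>\n  </property>\n' "$1" "$2"
            shift 2
        done
        echo '</configuration>'
    } > "$file"
}

write_site "$CONF_DIR/core-site.xml" \
    fs.default.name "hdfs://$HADOOP_HOST:9000" \
    hadoop.tmp.dir /usr/local/hadoop/tmp
write_site "$CONF_DIR/hdfs-site.xml" \
    dfs.replication 1 \
    dfs.permissions false
write_site "$CONF_DIR/mapred-site.xml" \
    mapred.job.tracker "$HADOOP_HOST:9001"

cat "$CONF_DIR/core-site.xml"
```

For a real installation, run it with `CONF_DIR` pointing at the Hadoop configuration directory and `HADOOP_HOST` set to the machine's hostname.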

Format the NameNode and start Hadoop

Format: hadoop namenode -format

Start Hadoop: start-all.sh

Verify by listing the Java processes: jps (6 processes should be displayed)

Visit: http://hadoop:50070

Visit: http://hadoop:50030
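The jps check can be automated. In a Hadoop 1.x pseudo-distributed setup the six processes are NameNode, DataNode, SecondaryNameNode, JobTracker, TaskTracker, and Jps itself. The sketch below reads jps output on standard input and prints any daemon that is missing; the function name `missing_daemons` and the sample PIDs are made up for illustration.

```shell
#!/bin/sh
# Daemons expected next to Jps itself in a Hadoop 1.x pseudo-distributed setup.
EXPECTED="NameNode DataNode SecondaryNameNode JobTracker TaskTracker"

# Read `jps` output on stdin and print each expected daemon that is absent.
missing_daemons() {
    output=$(cat)
    for daemon in $EXPECTED; do
        # jps prints "<pid> <class name>"; compare the second field exactly.
        if ! printf '%s\n' "$output" |
            awk -v d="$daemon" '$2 == d { found = 1 } END { exit !found }'; then
            echo "$daemon"
        fi
    done
}

# Sample output with made-up PIDs; on a real node use: jps | missing_daemons
sample='2101 NameNode
2215 DataNode
2330 SecondaryNameNode
2412 JobTracker
2633 Jps'
printf '%s\n' "$sample" | missing_daemons   # prints TaskTracker
```

An empty result means all five daemons are running; the log files of any daemon that is listed are the place to look next.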

Installation notes:

Turn off the firewall on the Windows host to avoid network access errors

Log in to Linux as root to avoid permission problems

Verify each step as soon as it is completed so that problems are caught early

The JDK and Hadoop environment variables must match your own installation paths

The hostname in the configuration files must match your own hostname

Do not format the NameNode more than once. If it has been formatted repeatedly, empty the $HADOOP_HOME/tmp folder

The Hadoop configuration files are as follows:

core-site.xml (make sure it matches your hostname)

<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://hadoop:9000</value>
    <description>change to your own hostname</description>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/usr/local/hadoop/tmp</value>
  </property>
</configuration>

hdfs-site.xml

<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.permissions</name>
    <value>false</value>
  </property>
</configuration>

mapred-site.xml (make sure it matches your hostname)

<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>hadoop:9001</value>
    <description>change to your own hostname</description>
  </property>
</configuration>

This concludes the example analysis of Hadoop architecture and pseudo-distributed installation. I hope it is of some help to you.
