Shulou (Shulou.com) 06/01 Report
2025-04-02 Update From: SLTechnology News & Howtos
This article explains the Hadoop architecture in detail and walks through a pseudo-distributed installation. The editor finds it very practical and shares it here for reference; I hope you get something out of it.
1. A brief introduction to Hadoop
Hadoop: a distributed system infrastructure, suitable as a platform for the distributed storage and computation of big data. Its two core projects are HDFS and MapReduce.
HDFS: a distributed file system, which mainly solves the distributed storage problem.
MapReduce: a parallel computing framework, which mainly solves the distributed computation problem.
Characteristics of Hadoop: high reliability, high scalability, high performance, high fault tolerance, and low cost.
Hadoop architecture:
In MapReduce, an application submitted for execution is called a job, and each unit of work that a job is divided into and that runs on a compute node is called a task.
The distributed file system (HDFS) provided by Hadoop is mainly responsible for storing data across the nodes and achieving high-throughput reads and writes.
Hadoop uses the Master/Slave architecture.
From the HDFS perspective (a file is split into blocks, 64 MB each by default):
Master node (only one): NameNode. Accepts user requests, maintains the directory structure of the file system, and manages the mapping between files and blocks and between blocks and DataNodes.
Slave nodes (several): DataNodes. Store the blocks; blocks are replicated to ensure data safety.
From the MapReduce perspective:
Master node (only one): JobTracker. Accepts jobs submitted by clients, assigns tasks to TaskTrackers for execution, and monitors their execution.
Slave nodes (several): TaskTrackers. Execute the compute tasks assigned by the JobTracker.
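The file-to-block mapping that the NameNode maintains can be inspected directly once the cluster is running. As a sketch (the path /user/root/test.txt is a hypothetical example file, not from the original article), fsck lists a file's blocks and the DataNodes holding each replica:

```shell
# Show the blocks of a file and which DataNodes hold them
# (replace /user/root/test.txt with a file that exists in your HDFS).
hadoop fsck /user/root/test.txt -files -blocks -locations
```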
2. Pseudo-distributed deployment of Hadoop
Install the virtual machine (set the network to host-only)
Set a static IP (so that the host and the virtual machine are on the same network segment)
Modify the hostname, and bind the hostname to the IP
Modify the hostname: the configuration file is /etc/sysconfig/network
Bind the hostname to the IP: the configuration file is /etc/hosts
Restart
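As a sketch of the two configuration steps above (the hostname "hadoop" and the address 192.168.56.100 are assumptions; substitute your own values):

```shell
# Set the hostname for the current session (assumed name: "hadoop").
hostname hadoop
# Persist the hostname across reboots.
echo "HOSTNAME=hadoop" >> /etc/sysconfig/network
# Bind the hostname to the VM's static IP (address is an assumption).
echo "192.168.56.100 hadoop" >> /etc/hosts
```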
Turn off the firewall and disable its automatic start
View the firewall status: service iptables status
Turn off the firewall: service iptables stop
View the firewall's runlevel settings: chkconfig | grep iptables
Disable automatic start of the firewall: chkconfig iptables off
Configure SSH password-free login
Generate a key pair with the RSA algorithm: ssh-keygen -t rsa (the resulting keys are placed in ~/.ssh)
Copy the public key: cp id_rsa.pub authorized_keys
Verify (log in to the machine without a password): ssh localhost
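The three steps above can be sketched as one sequence, run as the same user that will start Hadoop (the empty passphrase -P '' is an assumption that matches the goal of password-free login):

```shell
# Generate an RSA key pair without a passphrase.
ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
# Authorize the public key for logins to this machine.
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
# sshd ignores authorized_keys unless its permissions are restrictive.
chmod 600 ~/.ssh/authorized_keys
# Should now log in without prompting for a password.
ssh localhost
```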
Install JDK
Copy the JDK to the installation directory (I chose /usr/local/jdk; note that this must stay consistent with the JDK environment variable and the Hadoop configuration)
Add execute permission to the JDK installer: chmod u+x jdk.bin
Extract: ./jdk.bin
Rename the extracted directory: mv <extracted JDK directory> jdk
Add environment variables: the configuration file is /etc/profile
export JAVA_HOME=/usr/local/jdk
export PATH=.:$JAVA_HOME/bin:$PATH
Make the change take effect immediately: source /etc/profile
Verify: java -version
Install Hadoop
Copy the Hadoop installation package to the installation directory
Extract the Hadoop installation package: tar -zxvf hadoop.tar.gz
Rename the extracted directory: mv <extracted Hadoop directory> hadoop
Add environment variables: the configuration file is /etc/profile
export HADOOP_HOME=/usr/local/hadoop
export PATH=.:$HADOOP_HOME/bin:$JAVA_HOME/bin:$PATH (combined with the JDK environment variable)
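Taken together, the additions to /etc/profile look like this (paths assume the /usr/local install locations chosen above):

```shell
export JAVA_HOME=/usr/local/jdk
export HADOOP_HOME=/usr/local/hadoop
export PATH=.:$HADOOP_HOME/bin:$JAVA_HOME/bin:$PATH
```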
Modify the Hadoop configuration files
The configuration files are in the $HADOOP_HOME/conf directory
hadoop-env.sh (uncomment line 9 and change it to): export JAVA_HOME=/usr/local/jdk/
core-site.xml (see the end of the article for the configuration)
hdfs-site.xml (see the end of the article for the configuration)
mapred-site.xml (see the end of the article for the configuration)
Format the NameNode and start Hadoop
Format: hadoop namenode -format
Start Hadoop: start-all.sh
Verify by viewing the Java processes: jps (6 processes should be shown: NameNode, DataNode, SecondaryNameNode, JobTracker, TaskTracker, and Jps itself)
Visit: http://hadoop:50070 (NameNode web interface)
Visit: http://hadoop:50030 (JobTracker web interface)
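The two web interfaces can also be checked from the shell; as a quick sketch (the hostname "hadoop" follows the /etc/hosts binding made earlier):

```shell
# Fetch the first lines of each web UI's front page as a liveness check.
curl -s http://hadoop:50070 | head   # NameNode web UI
curl -s http://hadoop:50030 | head   # JobTracker web UI
```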
A few installation notes:
Turn off the firewall on Windows to avoid network access errors
Log in to Linux as root to avoid permission problems
Verify each step right after completing it, so problems are caught early
Keep the JDK and HADOOP environment variable configuration consistent with your own installation paths
Keep the hostname in the configuration files consistent with your own hostname
Do not format the NameNode more than once. If it has been formatted repeatedly, empty the $HADOOP_HOME/tmp folder first
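If a repeated format has left HDFS in a bad state, the recovery can be sketched as follows. Note that this destroys all data in HDFS; the tmp path matches the hadoop.tmp.dir value configured at the end of the article:

```shell
stop-all.sh                       # stop all Hadoop daemons
rm -rf /usr/local/hadoop/tmp/*    # clear the hadoop.tmp.dir contents
hadoop namenode -format           # re-format the NameNode
start-all.sh                      # restart the daemons
```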
The HADOOP configuration files are as follows:

core-site.xml (be careful to keep this consistent with your hostname):

<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://hadoop:9000</value>  <!-- change "hadoop" to your own hostname -->
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/usr/local/hadoop/tmp</value>
  </property>
</configuration>

hdfs-site.xml:

<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.permissions</name>
    <value>false</value>
  </property>
</configuration>

mapred-site.xml (be careful to keep this consistent with your hostname):

<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>hadoop:9001</value>  <!-- change "hadoop" to your own hostname -->
  </property>
</configuration>

This is the end of the article on "example analysis of Hadoop architecture and pseudo-distributed installation". I hope the above content is of some help to you and lets you learn more. If you think the article is good, please share it for more people to see.