Updated: 2025-02-21. From: SLTechnology News&Howtos
Shulou(Shulou.com)06/03 Report--
1. Hadoop:
Creator: Doug Cutting
Inspired by three Google papers (GFS, MapReduce, and Bigtable)
2. Versions:
Apache: the official release (1.1.2), recommended for learning
Cloudera: adds features to the Apache release for commercial use
Yahoo: now focuses on the Apache release
3. Core projects of Hadoop
HDFS (Hadoop Distributed File System): a distributed file system
MapReduce: a parallel computing framework
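To make the "parallel computing framework" idea concrete, here is a minimal single-process sketch of the map, shuffle, and reduce phases, using the classic word-count example. It is not Hadoop's API: the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are made up for illustration, and everything runs in one Python process rather than across a cluster.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single result."""
    return key, sum(values)


def word_count(lines):
    # On a real cluster each line (or input split) could be mapped on a
    # different node; here the "nodes" are just loop iterations.
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


print(word_count(["hello hadoop", "hello hdfs"]))
```

Because each call to `map_phase` depends only on its own line, and each call to `reduce_phase` depends only on its own key, both phases could in principle be spread over many machines.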
4. HDFS architecture (a master-slave structure: the master node handles management, and the slave nodes do the actual work)
Master-slave structure (there is exactly one master node, the NameNode, and there can be many slave nodes, the DataNodes)
The NameNode is responsible for:
Receiving users' operation requests
Maintaining the directory structure of the file system
Managing the mapping between files and blocks, and between blocks and DataNodes
The DataNodes are responsible for:
Storing files
Files are split into blocks and stored on disk
To keep data safe, each file is stored as multiple replicas
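The division of labour above can be sketched as a toy model: the NameNode holds only metadata (file to blocks, block to DataNodes), while the DataNodes would hold the bytes. This is an illustrative sketch, not HDFS code. The class name `ToyNameNode`, the round-robin placement, and the tiny `BLOCK_SIZE` are all assumptions; real HDFS uses much larger blocks and a rack-aware placement policy.

```python
import itertools

BLOCK_SIZE = 64   # assumption: bytes, for illustration only (real blocks are tens of MB)
REPLICATION = 3   # assumption: number of copies kept of each block


class ToyNameNode:
    """Keeps metadata only: file -> blocks, and block -> DataNodes."""

    def __init__(self, datanodes):
        self.datanodes = list(datanodes)
        self.file_to_blocks = {}
        self.block_to_datanodes = {}
        self._next_node = itertools.cycle(range(len(self.datanodes)))

    def add_file(self, path, size):
        """Split a file of `size` bytes into blocks and place the replicas."""
        n_blocks = max(1, -(-size // BLOCK_SIZE))  # ceiling division
        block_ids = [f"{path}#blk{i}" for i in range(n_blocks)]
        self.file_to_blocks[path] = block_ids
        copies = min(REPLICATION, len(self.datanodes))
        for block_id in block_ids:
            start = next(self._next_node)
            # Hypothetical policy: consecutive nodes, wrapping around the list.
            self.block_to_datanodes[block_id] = [
                self.datanodes[(start + k) % len(self.datanodes)]
                for k in range(copies)
            ]
        return block_ids

    def locate(self, path):
        """Answer a client request: which DataNodes hold each block of a file?"""
        return {b: self.block_to_datanodes[b] for b in self.file_to_blocks[path]}


namenode = ToyNameNode(["dn1", "dn2", "dn3", "dn4"])
namenode.add_file("/data/log.txt", 150)
print(namenode.locate("/data/log.txt"))
```

A 150-byte file with 64-byte blocks needs three blocks, and each block ends up on three different DataNodes, so losing any single node leaves every block readable.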
5. MapReduce architecture
Master-slave structure (there is exactly one master node, the JobTracker, and there can be many slave nodes, the TaskTrackers)
The JobTracker is responsible for:
Receiving computing tasks submitted by clients
Assigning computing tasks to TaskTrackers for execution
Monitoring the execution status of the TaskTrackers
The TaskTrackers are responsible for:
Executing the computing tasks assigned by the JobTracker
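The three JobTracker duties (receive, assign, monitor) can be sketched in a few lines. This is a toy model under stated assumptions, not Hadoop's scheduler: `ToyJobTracker` and `ToyTaskTracker` are hypothetical names, tasks are plain Python callables, and assignment is simple round-robin.

```python
class ToyTaskTracker:
    """Slave node: executes whatever task it is handed."""

    def __init__(self, name):
        self.name = name

    def run(self, task):
        # A task here is just a callable that takes no arguments.
        return task()


class ToyJobTracker:
    """Master node: receives tasks, assigns them, and records their state."""

    def __init__(self, trackers):
        self.trackers = list(trackers)
        self.status = {}  # task id -> (tracker name, state)

    def submit(self, tasks):
        """Assign tasks round-robin and track the outcome of each one."""
        results = {}
        for i, (task_id, task) in enumerate(tasks.items()):
            tracker = self.trackers[i % len(self.trackers)]
            self.status[task_id] = (tracker.name, "running")
            try:
                results[task_id] = tracker.run(task)
                self.status[task_id] = (tracker.name, "succeeded")
            except Exception:
                self.status[task_id] = (tracker.name, "failed")
        return results


job_tracker = ToyJobTracker([ToyTaskTracker("tt1"), ToyTaskTracker("tt2")])
results = job_tracker.submit({"t1": lambda: 1 + 1, "t2": lambda: 1 / 0})
print(results, job_tracker.status)
```

The `status` dictionary plays the role of the monitoring duty: after `submit` returns, the master knows which tracker ran each task and whether it succeeded.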
6. Characteristics of Hadoop:
Scalable: reliably stores and processes petabyte-scale (PB) data
Economical: data can be distributed and processed across a server farm of commodity machines
Efficient: by distributing the data, Hadoop can process it in parallel on the nodes where the data resides
Reliable: Hadoop automatically maintains multiple copies of data and automatically redeploys computing tasks after a task fails
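The "Efficient" and "Reliable" points combine naturally: because several nodes hold a copy of each block, a task can run where its data lives and be redeployed to another replica holder if that node fails. The sketch below illustrates the idea only. The function `run_with_retry` and its parameters are hypothetical, and Hadoop's real scheduling and failure detection are considerably more involved.

```python
def run_with_retry(block_replicas, task, is_healthy):
    """Run `task` on the first healthy node that holds a replica of its input.

    block_replicas: names of the nodes storing a copy of the input block.
    task: callable taking a node name and returning a result.
    is_healthy: callable taking a node name and returning True or False.
    Returns (result, list of nodes tried).
    """
    attempts = []
    for node in block_replicas:
        attempts.append(node)
        if not is_healthy(node):
            continue  # node is down: redeploy on the next replica holder
        return task(node), attempts
    raise RuntimeError(f"no healthy replica holder among {attempts}")


result, tried = run_with_retry(
    ["dn1", "dn2", "dn3"],
    task=lambda node: f"processed on {node}",
    is_healthy=lambda node: node != "dn1",  # pretend dn1 has failed
)
print(result, tried)
```

Only nodes that already store the block are candidates, so the computation moves to the data rather than the data moving to the computation.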
7. Physical layout of a Hadoop cluster
Description:
a. The figure shows two racks, each holding multiple servers. Each rack is connected to its own switch, and the two rack switches are connected to a core switch, so servers in either rack can reach each other.
b. The two master nodes (NameNode and JobTracker) each run on their own server, while the two kinds of slave node (DataNode and TaskTracker) run together on the same server.
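The two-level switch layout in (a) implies that traffic inside a rack crosses fewer links than traffic between racks. A small sketch makes that explicit. The rack and server names are hypothetical, and the hop counts simply count links in the layout described above.

```python
# Hypothetical layout: two racks, each with its own switch,
# with both rack switches connected to one core switch.
RACKS = {
    "rack1": ["server1", "server2"],
    "rack2": ["server3", "server4"],
}


def rack_of(server):
    """Return the name of the rack that holds the given server."""
    for rack, servers in RACKS.items():
        if server in servers:
            return rack
    raise KeyError(server)


def network_hops(a, b):
    """Count the links crossed between two servers in this layout."""
    if a == b:
        return 0
    if rack_of(a) == rack_of(b):
        return 2  # server -> rack switch -> server
    return 4      # server -> rack switch -> core switch -> rack switch -> server


print(network_hops("server1", "server2"), network_hops("server1", "server3"))
```

This is why keeping a computation in the same rack as its data, when it cannot be on the same server, is cheaper than crossing the core switch.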
8. Physical structure of a single node
Description: the left and right pictures show a master node and a slave node, respectively. Both run on Linux servers inside a Java virtual machine, because Hadoop is written in Java.
9. Hadoop deployment modes
Local mode (rarely used)
Pseudo-distributed mode (for learning)
Cluster mode (used by companies)
10. Software to prepare before installation
VirtualBox
CentOS
jdk-6u24-linux-xxx.bin
hadoop-1.1.2.tar.gz
11. Installation steps for pseudo-distributed mode (6 steps):
Turn off the firewall
Change the IP address
Change the hostname
Set up passwordless SSH login
Install the JDK
Install Hadoop
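The article does not show what the final "Install Hadoop" step involves. For a Hadoop 1.x pseudo-distributed setup it usually includes editing three files under `conf/`. The sketch below renders minimal versions of them. The hostname `hadoop0` is hypothetical (use whatever was chosen in the hostname step), and ports 9000 and 9001 are common conventions assumed here, not values from the article. The property names `fs.default.name`, `dfs.replication`, and `mapred.job.tracker` are the Hadoop 1.x names.

```python
import xml.etree.ElementTree as ET

HOSTNAME = "hadoop0"  # hypothetical: the hostname set in the "change the hostname" step

# Assumed minimal settings for a single-machine (pseudo-distributed) setup.
CONFIGS = {
    "core-site.xml": {"fs.default.name": f"hdfs://{HOSTNAME}:9000"},
    "hdfs-site.xml": {"dfs.replication": "1"},  # one machine, so one copy
    "mapred-site.xml": {"mapred.job.tracker": f"{HOSTNAME}:9001"},
}


def render(properties):
    """Render a dict of settings as a Hadoop <configuration> XML string."""
    root = ET.Element("configuration")
    for name, value in properties.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


def render_all():
    """Return {filename: xml_text} for every config file defined above."""
    return {filename: render(props) for filename, props in CONFIGS.items()}


for filename, xml_text in render_all().items():
    print(filename, xml_text)
```

The script only prints the XML. Writing it into the Hadoop `conf/` directory is left out on purpose, since the install path depends on where the tarball was unpacked.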
© 2024 shulou.com SLNews company. All rights reserved.