What are the five nodes of Hadoop?

2025-01-17 Update From: SLTechnology News&Howtos

This article explains the five nodes (daemons) of Hadoop: NameNode, DataNode, Secondary NameNode, ResourceManager and NodeManager. Each is described in turn below.

1.NameNode (Management Node)

NameNode manages the namespace of the file system. It maintains the file system tree and the metadata of all files and directories in that tree. This information is kept in two files: the namespace image file (fsimage), which records the file tree structure of HDFS, and the edit log file (edits), which records the changes made to HDFS. The metadata is cached in RAM, and both files are also persisted on the local disk. NameNode also knows which DataNodes hold each block of each file, but it does not persist this information, because it is rebuilt from the DataNodes when the system starts.
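The relationship between the image file and the edit log can be illustrated with a minimal sketch. It assumes a toy model in which the image is a dictionary from path to block IDs and each edit is a tuple; the function name `replay` and the operation names are hypothetical and do not reflect the real HDFS on-disk format.

```python
# Toy model of NameNode metadata: an fsimage snapshot plus an edit log.
# All names are illustrative assumptions, not the HDFS file format or API.

def replay(fsimage, edits):
    """Rebuild the in-memory namespace from a snapshot and the logged changes."""
    namespace = dict(fsimage)  # path -> list of block IDs
    for op, path, blocks in edits:
        if op == "create":
            namespace[path] = list(blocks)
        elif op == "delete":
            namespace.pop(path, None)
    return namespace


# Snapshot taken at some earlier point in time.
fsimage = {"/logs/a.txt": ["blk_1", "blk_2"]}

# Changes recorded since that snapshot.
edits = [
    ("create", "/logs/b.txt", ["blk_3"]),
    ("delete", "/logs/a.txt", []),
]

namespace = replay(fsimage, edits)
print(namespace)
```

In this model the snapshot alone is stale and the log alone is incomplete; the current namespace only exists once the two are combined, which is roughly what happens when a NameNode starts.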

2.DataNode (Worker Node)

DataNodes are the worker nodes of the file system. They store and retrieve blocks as directed by clients or the NameNode, and they periodically send the NameNode a list of the blocks they store. Without the NameNode, the file system cannot be used. If the NameNode's metadata is lost, all files on the file system are effectively lost, because there is no way to reconstruct the files from the blocks on the DataNodes. For this reason, fault-tolerance and redundancy mechanisms for the NameNode are very important.

Every slave server in the cluster runs a DataNode daemon that reads and writes HDFS blocks on the local file system. When a client needs to read or write data, the NameNode first tells the client which DataNodes to use for the operation. The client then communicates directly with the DataNode daemons on those servers to read or write the relevant blocks.
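The read path described above can be sketched as follows. This is a simplified in-process model, not the HDFS client API: the classes `NameNode` and `DataNode`, the methods `block_report`, `get_block_locations` and `read_block`, and the helper `read_file` are all hypothetical names chosen for illustration.

```python
# Toy model of an HDFS read: metadata comes from the NameNode,
# block data comes directly from DataNodes. Names are illustrative only.

class NameNode:
    def __init__(self):
        self.files = {}      # path -> ordered block IDs (persisted metadata)
        self.locations = {}  # block ID -> set of DataNode names (memory only)

    def block_report(self, datanode, block_ids):
        """Record which blocks a DataNode says it stores."""
        for block_id in block_ids:
            self.locations.setdefault(block_id, set()).add(datanode)

    def get_block_locations(self, path):
        """Return (block ID, DataNodes holding it) for each block of a file."""
        return [(b, sorted(self.locations.get(b, ()))) for b in self.files[path]]


class DataNode:
    def __init__(self, name, blocks):
        self.name = name
        self.blocks = blocks  # block ID -> bytes stored locally

    def read_block(self, block_id):
        return self.blocks[block_id]


def read_file(namenode, datanodes, path):
    """Client side: ask the NameNode where blocks live, then read them directly."""
    data = b""
    for block_id, holders in namenode.get_block_locations(path):
        data += datanodes[holders[0]].read_block(block_id)
    return data


nn = NameNode()
nn.files["/data/hello.txt"] = ["blk_1", "blk_2"]
dns = {
    "dn1": DataNode("dn1", {"blk_1": b"hello "}),
    "dn2": DataNode("dn2", {"blk_1": b"hello ", "blk_2": b"world"}),
}
# Block locations are rebuilt from the DataNodes' reports, not loaded from disk.
for dn in dns.values():
    nn.block_report(dn.name, dn.blocks.keys())

content = read_file(nn, dns, "/data/hello.txt")
print(content)
```

Note that the NameNode in this sketch never touches file contents; it only answers the question of where the blocks are.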

3.Secondary NameNode (loosely comparable to a slave node in MySQL master-slave replication)

Secondary NameNode is an auxiliary daemon that periodically checkpoints the HDFS metadata. Each cluster has one Secondary NameNode, and it is usually deployed on a separate server. Unlike NameNode, Secondary NameNode does not accept or record any real-time data changes. Instead, it communicates with NameNode at regular intervals and merges the edit log into the image file to produce an up-to-date snapshot of the HDFS metadata. Because NameNode is a single point of failure, these snapshots help minimize NameNode downtime and data loss. Secondary NameNode is not a hot standby, however: it does not take over automatically. If NameNode fails, the latest checkpoint can be used to restore the metadata manually, and changes made after that checkpoint may be lost.
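The periodic snapshot can be pictured as a merge of the image file and the edit log. The sketch below assumes a toy representation (a dictionary for the image, a list of tuples for the log); the function name `checkpoint` and the operation names are hypothetical and stand in for the real checkpoint process.

```python
# Toy model of a checkpoint: fold the edit log into the image file
# and start a fresh, empty log. Names are illustrative assumptions.

def checkpoint(fsimage, edits):
    """Merge logged edits into a new snapshot; return it with an empty log."""
    merged = dict(fsimage)  # path -> "dir" or "file"
    for op, path in edits:
        if op == "mkdir":
            merged[path] = "dir"
        elif op == "create":
            merged[path] = "file"
        elif op == "delete":
            merged.pop(path, None)
    return merged, []


fsimage = {"/": "dir"}
edits = [("mkdir", "/tmp"), ("create", "/tmp/x"), ("delete", "/tmp/x")]

new_fsimage, new_edits = checkpoint(fsimage, edits)
print(new_fsimage, new_edits)
```

Any change logged after the checkpoint exists only in the new edit log, which is why a recovery from the last checkpoint alone can lose recent changes.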

4.ResourceManager

ResourceManager (RM) is the resource manager of YARN. It is responsible for the unified management and allocation of all resources in the cluster. It receives resource reports from each node (NodeManager) and allocates those resources to applications (in practice, to each application's ApplicationMaster) according to a configured policy.

The RM has two main components: the Scheduler and the ApplicationsManager. The Scheduler is responsible for allocating resources to applications. It does not monitor or track application status, and it offers no guarantee of restarting tasks that fail because of application errors or hardware errors. The ApplicationsManager accepts new job submissions, negotiates the first container for each application's ApplicationMaster (AM), and restarts the AM container if it fails. The AM of each application is responsible for requesting resources from the Scheduler, and for monitoring the usage and scheduling of those resources.
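How a scheduler might turn node reports and application requests into allocations is sketched below. This is a deliberately simple strict-FIFO policy over memory only, chosen for illustration; it is not one of YARN's actual schedulers, and the class `Scheduler` and its methods `node_report`, `request` and `allocate` are hypothetical names.

```python
# Toy FIFO scheduler: nodes report free memory, applications queue requests,
# and allocate() grants containers in arrival order. Illustrative only.
from collections import deque


class Scheduler:
    def __init__(self):
        self.free_mb = {}       # node name -> free memory reported by its NodeManager
        self.pending = deque()  # FIFO queue of (app ID, requested MB)

    def node_report(self, node, free_mb):
        self.free_mb[node] = free_mb

    def request(self, app_id, mb):
        self.pending.append((app_id, mb))

    def allocate(self):
        """Grant requests in order; stop at the first one that fits nowhere."""
        granted = []
        while self.pending:
            app_id, mb = self.pending[0]
            node = next(
                (n for n, free in sorted(self.free_mb.items()) if free >= mb),
                None,
            )
            if node is None:
                break  # strict FIFO: the head of the queue must wait
            self.free_mb[node] -= mb
            granted.append((app_id, node, mb))
            self.pending.popleft()
        return granted


s = Scheduler()
s.node_report("nm1", 2048)
s.node_report("nm2", 4096)
s.request("app_1", 3072)
s.request("app_2", 2048)
s.request("app_3", 2048)

grants = s.allocate()
print(grants)
print(list(s.pending))
```

The sketch only hands out resources. Consistent with the description above, it keeps no record of whether the granted containers succeed or fail; that tracking is left to each application's AM.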

5.NodeManager

NodeManager (NM) is the ResourceManager's agent on each slave machine. It is responsible for managing containers, monitoring their resource usage, and reporting that usage to the ResourceManager/Scheduler.
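The three duties listed above (managing containers, monitoring usage, reporting) can be sketched in a few lines. The model tracks memory only, and the class `NodeManager` with its methods `launch`, `stop` and `heartbeat`, as well as the report fields, are hypothetical names rather than YARN's real interfaces.

```python
# Toy NodeManager: tracks containers on one machine and builds a usage report.
# Names and report fields are illustrative assumptions.

class NodeManager:
    def __init__(self, name, total_mb):
        self.name = name
        self.total_mb = total_mb
        self.containers = {}  # container ID -> memory in MB

    def launch(self, container_id, mb):
        """Start a container if enough memory is still free on this node."""
        if mb > self.total_mb - sum(self.containers.values()):
            raise ValueError("not enough free memory")
        self.containers[container_id] = mb

    def stop(self, container_id):
        self.containers.pop(container_id, None)

    def heartbeat(self):
        """Build the usage report that would be sent to the ResourceManager."""
        used = sum(self.containers.values())
        return {
            "node": self.name,
            "used_mb": used,
            "free_mb": self.total_mb - used,
            "containers": sorted(self.containers),
        }


nm = NodeManager("nm1", 4096)
nm.launch("c1", 1024)
nm.launch("c2", 2048)
nm.stop("c1")

report = nm.heartbeat()
print(report)
```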

That covers the five nodes of Hadoop: NameNode, DataNode, Secondary NameNode, ResourceManager and NodeManager. How they behave in a specific deployment still needs to be verified in practice.
