What are the components of the HDFS architecture and their respective roles?

Updated 2025-01-17. From: SLTechnology News&Howtos

Shulou(Shulou.com)05/31 Report--

This article walks through the HDFS architecture and the role of each of its components.

HDFS stores data using a Master/Slave architecture made up of four components: HDFS Client, NameNode, DataNode, and Secondary NameNode.

Client: the client, the entry point through which users and applications access HDFS.

1. File splitting. When a file is uploaded to HDFS, the Client divides it into blocks, which are then stored.

2. Interacts with the NameNode to obtain file location information.

3. Interacts with DataNodes to read or write data.

4. Provides commands to manage HDFS, such as starting or stopping HDFS.

5. Provides commands to access HDFS.
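
The file splitting in step 1 can be sketched as follows. This is a toy model, not the real HDFS client code: the function name `split_into_blocks` is hypothetical, and the 128 MiB block size is an assumption (it matches the usual `dfs.blocksize` default, but clusters can configure it differently).

```python
# Toy model of client-side block splitting; not the real HDFS client.
# Assumed block size: 128 MiB (configurable per cluster or per file).
BLOCK_SIZE = 128 * 1024 * 1024


def split_into_blocks(file_size, block_size=BLOCK_SIZE):
    """Return (offset, length) pairs that cover a file of file_size bytes."""
    if file_size < 0 or block_size <= 0:
        raise ValueError("file_size must be >= 0 and block_size must be > 0")
    blocks = []
    offset = 0
    while offset < file_size:
        # The last block only holds the bytes that remain.
        length = min(block_size, file_size - offset)
        blocks.append((offset, length))
        offset += length
    return blocks
```

For example, under these assumptions a 300 MiB file yields two full 128 MiB blocks followed by one 44 MiB block; the final block is not padded.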

NameNode: the master. It acts as the supervisor and manager of the cluster.

1. Manages the HDFS namespace.

2. Manages block mapping information.

3. Configures the replica policy.

4. Handles client read and write requests.
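
The NameNode's bookkeeping can be pictured as two lookup tables: the namespace (file path to block IDs) and the block map (block ID to the DataNodes that hold a replica). The sketch below is a toy model under stated assumptions: the class `ToyNameNode` and its methods are hypothetical, the replication factor of 3 is assumed, and round-robin placement is a simple stand-in for the real replica policy.

```python
import itertools


class ToyNameNode:
    """Toy metadata store; it holds no file data, only mappings."""

    def __init__(self, datanodes, replication=3):
        if not datanodes:
            raise ValueError("at least one DataNode is required")
        self.datanodes = list(datanodes)
        self.replication = replication  # assumed replica count
        self.namespace = {}             # file path -> ordered block IDs
        self.block_locations = {}       # block ID -> DataNode names
        self._block_ids = itertools.count(1)
        self._next_node = 0

    def add_block(self, path):
        """Allocate a new block for path and choose DataNodes for its replicas."""
        block_id = next(self._block_ids)
        count = min(self.replication, len(self.datanodes))
        # Round-robin placement: an illustrative stand-in, not the real policy.
        targets = [
            self.datanodes[(self._next_node + i) % len(self.datanodes)]
            for i in range(count)
        ]
        self._next_node = (self._next_node + 1) % len(self.datanodes)
        self.namespace.setdefault(path, []).append(block_id)
        self.block_locations[block_id] = targets
        return block_id, targets

    def get_block_locations(self, path):
        """Answer a client read request: the blocks of path and where they live."""
        return [(b, self.block_locations[b]) for b in self.namespace[path]]
```

In this model a client that wants to read a file calls `get_block_locations` and then contacts the listed DataNodes directly, so file data never passes through the NameNode.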

DataNode: the slave. The NameNode issues commands and the DataNode performs the actual operations.

1. Stores the actual data blocks.

2. Performs read/write operations on data blocks.
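
As a rough illustration of the two DataNode duties above, the following toy model stores blocks in memory and reassembles a file from them. The names `ToyDataNode` and `read_file` are hypothetical, and the block locations passed to `read_file` are assumed to have been obtained from the NameNode beforehand.

```python
class ToyDataNode:
    """Toy block store; a real DataNode keeps blocks on local disks."""

    def __init__(self, name):
        self.name = name
        self.blocks = {}  # block ID -> bytes

    def write_block(self, block_id, data):
        self.blocks[block_id] = bytes(data)

    def read_block(self, block_id):
        return self.blocks[block_id]


def read_file(block_locations, datanodes):
    """Reassemble a file from its blocks.

    block_locations: ordered list of (block_id, [DataNode names]).
    datanodes: dict mapping DataNode name -> ToyDataNode.
    """
    chunks = []
    for block_id, holders in block_locations:
        for name in holders:
            node = datanodes.get(name)
            # Use the first listed replica that actually has the block.
            if node is not None and block_id in node.blocks:
                chunks.append(node.read_block(block_id))
                break
        else:
            raise IOError(f"no replica available for block {block_id}")
    return b"".join(chunks)
```

Because each block is listed with several holders, the read in this sketch still succeeds when one of the listed DataNodes is missing the block, which is the point of keeping replicas.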

Secondary NameNode: not a hot standby for the NameNode. When the NameNode dies, the Secondary NameNode cannot immediately replace it and provide services.

1. Assists the NameNode and shares part of its workload.

2. Periodically merges fsimage and edits and pushes the result to the NameNode.

The NameNode appends changes to the filesystem to a log file (edits) on the local filesystem. When a NameNode starts, it first reads the HDFS state from an image file (fsimage) and then applies the changes recorded in the edits log file. It then writes the new HDFS state to fsimage and starts normal operation with an empty edits file. Because the NameNode merges fsimage and edits only at startup, the edits log can become very large over time, especially on large clusters. A side effect of an oversized edits log is that the next NameNode startup can take a long time.

The Secondary NameNode periodically merges fsimage and the edits log, keeping the edits log size within a limit. Because its memory requirements are on the same order of magnitude as the NameNode's, the Secondary NameNode usually runs on a separate physical machine from the NameNode. It is started by bin/start-dfs.sh on the node specified in conf/masters.

The Secondary NameNode stores its latest checkpoint in a directory with the same structure as the NameNode's directory, so the NameNode can read the checkpoint image from the Secondary NameNode when needed.

3. In an emergency, it can assist in recovering the NameNode.
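
The checkpoint merge described above can be sketched as replaying a log onto a snapshot. This is a toy model: the real fsimage and edits files use their own on-disk formats, and the operation names (`create`, `add_block`, `delete`) and the function `checkpoint` are hypothetical, chosen only for illustration.

```python
def apply_edit(state, edit):
    """Apply one logged change to the in-memory namespace state."""
    op = edit[0]
    if op == "create":
        _, path = edit
        state[path] = []
    elif op == "add_block":
        _, path, block_id = edit
        state[path].append(block_id)
    elif op == "delete":
        _, path = edit
        state.pop(path, None)
    else:
        raise ValueError(f"unknown operation: {op}")


def checkpoint(fsimage, edits):
    """Merge edits into a copy of fsimage; return (new_fsimage, empty_edits)."""
    # Copy first so the previous image stays intact if the merge fails.
    state = {path: list(blocks) for path, blocks in fsimage.items()}
    for edit in edits:
        apply_edit(state, edit)
    # After a checkpoint the log starts over empty.
    return state, []
```

The cost of the merge grows with the number of logged edits, which is why checkpointing periodically, rather than only at NameNode startup, keeps restart time bounded in this model.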

After reading this article, you should have a basic understanding of the HDFS architecture and the role of each component. Thank you for reading!
