What is the basic structure of Hadoop HDFS

2025-01-19 Update. From: SLTechnology News&Howtos

Shulou (Shulou.com), 05/31

This article explains the basic structure of Hadoop HDFS. The material is short and practical, so readers who are interested may want to follow along.

1. Basic structure of HDFS

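HDFS is a distributed file system that is designed for high throughput and accepts higher latency in exchange. Its main components are the NameNode, which holds the file system metadata, and the DataNodes, which store the data blocks.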
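The split between NameNode and DataNodes can be illustrated with a small in-memory model. This is only a sketch under simplifying assumptions: the class names, the tiny block size, the replication factor and the round-robin placement are all made up for illustration, and in real HDFS the client streams data to DataNodes directly rather than through the NameNode.

```python
from dataclasses import dataclass, field

BLOCK_SIZE = 4    # bytes per block in this toy; real HDFS blocks are far larger
REPLICATION = 3   # copies kept of each block (illustrative value)


@dataclass
class DataNode:
    """Stores block contents keyed by block id."""
    name: str
    blocks: dict = field(default_factory=dict)


@dataclass
class NameNode:
    """Holds only metadata: file -> block ids, block id -> DataNode names."""
    datanodes: list
    files: dict = field(default_factory=dict)
    locations: dict = field(default_factory=dict)
    next_block_id: int = 0

    def write(self, path: str, data: bytes) -> None:
        # Simplification: the toy NameNode pushes the bytes itself.
        block_ids = []
        for offset in range(0, len(data), BLOCK_SIZE):
            block_id = self.next_block_id
            self.next_block_id += 1
            # Round-robin placement (hypothetical policy for this sketch).
            count = min(REPLICATION, len(self.datanodes))
            targets = [self.datanodes[(block_id + i) % len(self.datanodes)]
                       for i in range(count)]
            for dn in targets:
                dn.blocks[block_id] = data[offset:offset + BLOCK_SIZE]
            self.locations[block_id] = [dn.name for dn in targets]
            block_ids.append(block_id)
        self.files[path] = block_ids

    def read(self, path: str) -> bytes:
        by_name = {dn.name: dn for dn in self.datanodes}
        out = b""
        for block_id in self.files[path]:
            # Any replica will do; take the first one listed.
            out += by_name[self.locations[block_id][0]].blocks[block_id]
        return out
```

The point of the model is that the NameNode never stores file contents, only the mapping from paths to blocks and from blocks to the DataNodes holding them, which is why losing the NameNode makes the whole file system unusable.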

Question:

1. The NameNode is a single point of failure.

2. DataNodes synchronize replicas through a protocol.

3. To remove the NameNode single point of failure, a standby node can be added, but how is it kept in sync? The Secondary NameNode cannot itself act as a NameNode. Its main job is to periodically merge the namespace image with the operation log (edit log) so that the edit log does not grow too large. It usually runs on a separate physical machine, because the merge takes a lot of CPU time and about as much memory as the NameNode itself. The Secondary NameNode keeps a copy of the merged namespace image, which can be used if the NameNode goes down. It is therefore not a true backup of the NameNode but a helper node that periodically merges the metadata node's namespace image file with its edit log.

4. How is the switchover done? It relies on ZooKeeper, which is responsible both for electing the active node and for detecting failures. ZooKeeper can also be used as a distributed lock, as covered in the ZooKeeper section later.
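The checkpoint described in question 3 amounts to replaying the edit log on top of the last namespace image. The following is a minimal sketch, assuming a namespace modeled as a plain dictionary and a made-up edit format of `(operation, path, ...)` tuples; the real fsimage and edit log formats are different.

```python
def apply_edit(namespace: dict, edit: tuple) -> None:
    """Apply one logged operation to the in-memory namespace."""
    op, path, *rest = edit
    if op == "create":
        namespace[path] = rest[0]           # rest[0] is the file's metadata
    elif op == "delete":
        namespace.pop(path, None)
    elif op == "rename":
        namespace[rest[0]] = namespace.pop(path)   # rest[0] is the new path
    else:
        raise ValueError(f"unknown operation: {op}")


def checkpoint(fsimage: dict, edit_log: list) -> tuple:
    """Merge the edit log into a copy of the image.

    Returns the new image and an empty edit log, which is why the
    edit log stays small when checkpoints run regularly.
    """
    merged = dict(fsimage)
    for edit in edit_log:
        apply_edit(merged, edit)
    return merged, []
```

Any edits made after the last checkpoint exist only in the NameNode's own log, which is why a copy held by the Secondary NameNode always lags behind.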

Answer:

There are two ways to protect the NameNode against a single point of failure. The first is to use the Secondary NameNode that ships with Hadoop, but its checkpoint lags behind the live NameNode, so it is only a backup and the most recent changes can be lost. The second is to have the NameNode write its metadata synchronously and atomically to the local disk and also to an NFS server. (The possibility that the NFS server itself fails is ignored for now.)
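The second approach can be sketched as follows. This is a simplified illustration, not Hadoop's implementation: the function name `write_metadata` is hypothetical, and the directory list stands in for a local disk path plus an NFS mount point. In Hadoop itself the list of metadata directories is set with the `dfs.namenode.name.dir` property.

```python
import os
import tempfile


def write_metadata(directories, filename, payload: bytes) -> None:
    """Write payload to every directory before returning (synchronous).

    Each copy is written to a temporary file, flushed to disk, and then
    renamed into place, so a reader never sees a half-written file (atomic).
    """
    for directory in directories:
        fd, tmp_path = tempfile.mkstemp(dir=directory)
        try:
            with os.fdopen(fd, "wb") as tmp:
                tmp.write(payload)
                tmp.flush()
                os.fsync(tmp.fileno())      # force the bytes to stable storage
            # Atomic replacement of the previous version.
            os.replace(tmp_path, os.path.join(directory, filename))
        except BaseException:
            if os.path.exists(tmp_path):
                os.remove(tmp_path)
            raise
```

Because the call returns only after every directory has been updated, the remote copy is never behind the local one, at the cost of making each metadata write as slow as the slowest target.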

2. HBase

3. ZooKeeper

At its core, ZooKeeper is a streamlined file system that exposes a few simple primitive operations. These form a rich set of building blocks that can be used to implement many coordination data structures and protocols, including distributed queues, distributed locks, and leader election among a set of peer nodes.
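A common way to build leader election (and locks) on such a file system is for each participant to create a sequentially numbered node under a shared parent, with the lowest number winning. The sketch below imitates that idea entirely in memory; `ToyCoordinator` and its methods are hypothetical names, and it does not use or reproduce the ZooKeeper client API.

```python
import itertools


class ToyCoordinator:
    """In-memory stand-in for a parent node with sequential children."""

    def __init__(self):
        self._counter = itertools.count()
        self._children = {}   # sequence number -> owner name

    def join(self, owner: str) -> int:
        """Register a participant and return its sequence number."""
        seq = next(self._counter)
        self._children[seq] = owner
        return seq

    def leave(self, seq: int) -> None:
        """Remove a participant, as happens when its session ends."""
        self._children.pop(seq, None)

    def leader(self):
        """The participant with the lowest sequence number leads."""
        if not self._children:
            return None
        return self._children[min(self._children)]
```

For example, if `nn1` joins before `nn2`, `nn1` is the leader; once `nn1` leaves, `leader()` returns `nn2`. The same ordering gives a lock: whoever holds the lowest number holds the lock.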

ZooKeeper's consensus protocol is Zab (ZooKeeper Atomic Broadcast), which resembles Paxos but is a distinct protocol. When a ZooKeeper ensemble starts, leader election runs automatically: one server is voted Leader and the rest become Followers. Followers exchange heartbeats with the Leader, receive its commands and messages, and synchronize their own data so that it matches the Leader's. To keep data consistent, an update is considered successful only after a majority of the servers in the ensemble have persisted it. For this reason an ensemble usually has an odd number of servers: a majority is needed both to elect a Leader and to commit updates, and adding one server to an odd-sized ensemble does not increase the number of failures it can tolerate.
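The preference for odd ensemble sizes follows from simple majority arithmetic, sketched here with two illustrative helper functions (the names are not part of any ZooKeeper API).

```python
def quorum_size(servers: int) -> int:
    """Smallest number of servers that forms a strict majority."""
    return servers // 2 + 1


def tolerated_failures(servers: int) -> int:
    """How many servers can fail while a majority is still reachable."""
    return servers - quorum_size(servers)
```

With 3 servers the quorum is 2 and one failure is tolerated; with 4 servers the quorum rises to 3 and still only one failure is tolerated; 5 servers tolerate two. The extra even-numbered server adds cost without adding fault tolerance.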

By this point you should have a clearer picture of the basic structure of Hadoop HDFS. The best way to consolidate it is to try it out in practice.
